there are a couple of problems with TrainData::loadFromCSV here:

  • it does not handle multi-word strings correctly ("AGENCY FB")
  • you cannot selectively choose columns (or throw away unwanted ones)

provided you manage to replace AGENCY FB with AGENCY_FB (and similar items) globally, you could use it like this:

Ptr<ml::TrainData> td = ml::TrainData::loadFromCSV("uci.csv",
    1, // 1 header line
    2, // m_label is 3rd item
    -1 // only one label
);

Mat pixels = td->getSamples();
Mat labels = td->getResponses();
cout << pixels.size() << " " << pixels.type() << endl;
cout << labels.size() << " " << labels.type() << endl;

cout << labels << endl;
// get rid of the 1st 11 columns (0,1, 3,4,5,6,7,8,9,10,11)
cout << pixels(Range::all(), Range(11,411)) << endl;


[411 x 2] 5
[1 x 2] 5
[64258;
 64257]
[1, 1, 1, ... 255, 255, 255;
 1, 1, 1, ... 255, 255, 255]
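
the global replacement mentioned above (AGENCY FB to AGENCY_FB) and the dropping of unwanted columns could also be done in a small preprocessing pass over the csv text, before it is handed to loadFromCSV. a minimal sketch using only the standard library; the helper names (splitCsvLine, joinWords, cleanCsvLine) are made up for this example and are not part of opencv, and it assumes plain comma-separated fields without quoting:

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line on a delimiter.
// Assumption: fields are not quoted and never contain the delimiter.
std::vector<std::string> splitCsvLine(const std::string& line, char delim = ',') {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, delim)) fields.push_back(field);
    // std::getline drops a trailing empty field, so restore it.
    if (!line.empty() && line.back() == delim) fields.push_back("");
    return fields;
}

// Trim surrounding spaces and turn inner spaces into underscores,
// so "AGENCY FB" becomes the single token "AGENCY_FB".
std::string joinWords(const std::string& field) {
    const std::size_t first = field.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = field.find_last_not_of(' ');
    std::string out = field.substr(first, last - first + 1);
    for (char& c : out) {
        if (c == ' ') c = '_';
    }
    return out;
}

// Rebuild a line without the columns listed in `drop` (0-based indices),
// applying joinWords to every field that is kept.
std::string cleanCsvLine(const std::string& line,
                         const std::vector<std::size_t>& drop,
                         char delim = ',') {
    const std::vector<std::string> fields = splitCsvLine(line, delim);
    std::string out;
    bool firstKept = true;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (std::find(drop.begin(), drop.end(), i) != drop.end()) continue;
        if (!firstKept) out += delim;
        out += joinWords(fields[i]);
        firstKept = false;
    }
    return out;
}
```

you would read the original file line by line with std::getline, write cleanCsvLine(line, drop) to a new file, and load that one instead. keep in mind that dropping columns in front of the label shifts its index, so the response index passed to loadFromCSV has to be adjusted accordingly.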