Trivial random forest with OpenCV doesn't work and isn't the same as sklearn

asked 2014-02-17 12:54:24 -0600

kirilligum

updated 2014-02-19 12:49:24 -0600

I'm trying to get the simplest possible random forest example to work. The training data is 2 points: {0,0} with label 0 and {1,1} with label 1. The sample to predict is {2,2}. OpenCV returns 0, where I expected 1. Here is the OpenCV code in C++ (main.cpp):

#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

using namespace std;
using namespace cv;

int main(int argc, char const *argv[]) {
  cout << " hi \n";
  // Two training samples, one per row, with two features each.
  float trainingData[2][2] = { {0.0, 0.0}, {1.0, 1.0}};
  Mat training_data(2, 2, CV_32FC1, trainingData);
  // One label per training sample.
  float trainingClass[2] = {0.0,1.0};
  Mat training_class(2, 1, CV_32FC1, trainingClass);
  // Train with default parameters; CV_ROW_SAMPLE means each row is a sample.
  CvRTrees rtree;
  rtree.train(training_data, CV_ROW_SAMPLE, training_class);
  // Query point to predict.
  float sampleData[2] = {2.0, 2.0};
  Mat sample_data(2, 1, CV_32FC1, sampleData);
  cout << rtree.predict(sample_data) << "  <-- predict\n";
  return 0;
}

cmake file:

cmake_minimum_required(VERSION 2.8)
project( main )
find_package( OpenCV REQUIRED )
add_executable( main main.cpp )
target_link_libraries( main ${OpenCV_LIBS} )

running:

> cmake .;make;./main
 hi 
0  <-- predict

For comparison, here is the equivalent Python code using sklearn (rfc.py):

from sklearn.ensemble import RandomForestClassifier
X = [[0, 0], [1, 1]]
Y = [0, 1]
clf = RandomForestClassifier(n_estimators=10)
clf = clf.fit(X, Y)
print(clf.predict([[2., 2.]]))

running:

> python rfc.py 
[1]

Update (1)

I tried different combinations of data layout and also changed the line

Mat sample_data(2, 1, CV_32FC1, sampleData);

to

Mat sample_data(1, 2, CV_32FC1, sampleData);

I still get just 0.


1 answer

answered 2014-02-17 16:32:27 -0600

Nghia

Try CV_ROW_SAMPLE instead of CV_COL_SAMPLE.

Comments

I tried that initially and forgot to change it back. It should be CV_ROW_SAMPLE. I'll edit the question. Thanks.

kirilligum ( 2014-02-17 17:35:50 -0600 )

Since you only have 2 samples, you'll probably need to change the default parameters to make it work. Try CvDTreeParams param; param.min_sample_count = 1. Pass param to your train() call.

Nghia ( 2014-02-19 07:29:08 -0600 )

I increased the training data to 3 points and it worked. It might be better to post the comment as an answer.

kirilligum ( 2014-02-19 15:33:25 -0600 )
