opencv_traincascade with millions of negative samples

asked 2013-08-30 09:18:34 -0500

Ramiro

updated 2013-08-30 23:28:35 -0500

Has anyone here used the opencv_traincascade application to train a Haar-wavelet-based cascade of classifiers using tens or hundreds of millions of negative samples? Are there any reports in the community of "serious" usage of that application? If so:

  1. Did you compile the application with any particular set of flags?
  2. Did you implement any significant change in the application yourself?
  3. Which OS did you run it under?
  4. How long did it take to produce a single stage?
  5. How much RAM did it take?
  6. Did you use any special hardware setup?

I understand that boosting the classifier should take a long time, and I'm aware that it needs a lot of memory and processor resources. But the few reports I can find on the web use only hundreds or a few thousand samples, while research papers mention boosting classifiers with hundreds of millions or even billions of samples. Is the OpenCV application currently able to handle that?

I'm currently running a trial with 3 million negative samples and 4000 positive samples. It has consumed all my RAM (my Linux box has 16GB), and after 36 hours it has not produced a single stage. I'm using the most recent code on GitHub, compiled with the default flags.
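For a sense of scale, here is a rough back-of-envelope estimate of the memory needed just to keep every sample in RAM. It is only a sketch under assumptions that are not confirmed above: that each sample is held as a (w+1) x (h+1) integral image of 32-bit integers, that the training window is 24x24 (the actual window size of the trial is not stated), and that a second, tilted integral image may be kept per sample. The function name `integral_buffer_bytes` is hypothetical, and the estimate ignores feature caches and all other buffers.

```python
def integral_buffer_bytes(num_samples, win_w, win_h, planes=1, bytes_per_value=4):
    """Bytes needed to hold `planes` integral images of size
    (win_w + 1) x (win_h + 1) for every sample (assumed storage layout)."""
    values_per_sample = (win_w + 1) * (win_h + 1)
    return num_samples * values_per_sample * bytes_per_value * planes


# Sample counts from the trial described above.
num_pos, num_neg = 4000, 3_000_000

# Assumption: 24x24 training window (not stated in the post).
WIN_W = WIN_H = 24

# One upright integral image per sample.
upright_only = integral_buffer_bytes(num_pos + num_neg, WIN_W, WIN_H, planes=1)
# Upright plus tilted integral image per sample (assumption).
with_tilted = integral_buffer_bytes(num_pos + num_neg, WIN_W, WIN_H, planes=2)

print(f"upright only:     {upright_only / 2**30:.1f} GiB")  # 7.0 GiB
print(f"upright + tilted: {with_tilted / 2**30:.1f} GiB")   # 14.0 GiB
```

If these assumptions are anywhere near correct, the sample buffers alone would come close to the 16GB in the machine, so the lack of progress might be partly due to swapping rather than to boosting time alone.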

I also posted the same question here.
