Let's split this question into useful parts, starting with the resolution of the training patches and the long training time.

  • The latest version of the opencv_traincascade application reports the number of possible features for a given window size, which will shed light on your problem. The bottom line is: as the resolution increases, the number of features explodes.
  • Considering LBP features, a 20x60 pixel resolution gives you 37,170 unique features. For each training sample and each negative window, these need to be calculated and stored in memory so that they can be evaluated.
  • A 100x300 resolution gives you 24,667,500 unique features (see the sketch below this list) ... calculating and evaluating them one by one to find the best-scoring feature during the AdaBoost process takes an enormous amount of time. So in my opinion, 24 hours is extremely fast.
  • Just to compare, Haar wavelets yield even more features; a 100x300 model even triggers a bad memory allocation on my 32 GB RAM, 24-core machine ... so I am not even sure I want to calculate this.
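
To see where numbers like 37,170 and 24,667,500 come from, here is a minimal sketch of how the multi-block LBP feature count grows with the window size. This is my own back-of-the-envelope calculation, assuming each feature is a 3x3 grid of equally sized cells that has to fit inside the training window; the function name is just illustrative:

```python
def mblbp_feature_count(width, height):
    """Count the multi-block LBP features that fit in a width x height window.

    Each feature is assumed to be a 3x3 grid of w x h cells, so the full block
    covers 3*w x 3*h pixels and can be placed at every position where it still
    fits inside the window.
    """
    count = 0
    for w in range(1, width // 3 + 1):
        for h in range(1, height // 3 + 1):
            count += (width - 3 * w + 1) * (height - 3 * h + 1)
    return count

print(mblbp_feature_count(20, 60))    # 37170
print(mblbp_feature_count(100, 300))  # 24667500
```

Going from 20x60 to 100x300 multiplies the feature pool by roughly 660, which is why both training time and memory use blow up.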

So no, it is not strange that it takes 24 hours to process that data; 24 hours is, by the way, not insane. I have models that train for multiple days before returning a result...

What do you pass to the precalcValBufSize and precalcIdxBufSize parameters? Increasing those can already help a lot! But in my opinion your resolution is way too large.
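
For reference, a training call could look something like this (a sketch only; the file names, sample counts, stage count and window size are placeholders, and the two 2048 MB buffers are just an example for a machine with plenty of RAM):

```
opencv_traincascade -data cascade_dir -vec positives.vec -bg negatives.txt \
                    -numPos 900 -numNeg 2000 -numStages 20 \
                    -w 20 -h 60 -featureType LBP \
                    -precalcValBufSize 2048 -precalcIdxBufSize 2048
```

Both buffer sizes are given in MB; together they should stay well below the RAM you actually have available.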

Regarding the "lots of false positives while still not always detecting my target" issue:

  • You need more positive samples, so that the detector actually learns to locate your object of interest
  • You need more negative samples to reduce the number of false positives

Finally, please add your complete training command and the output from the start of your training, because it could be that you are misusing some of the parameters.