
Tips for training cascade classifier

asked 2016-04-17 18:17:28 -0500

WasabiFan

updated 2016-04-18 23:39:10 -0500

I am trying to train a cascade classifier with custom training images, which currently consist of around 70 positives and 600 negatives.

When I run the training (using a version of OpenCV built with TBB) with a model resolution of 20x60px, an acceptance ratio threshold of 0.00003, and the feature type set to LBP, it takes around half an hour. When I up the resolution to 100x300, it takes 24 hours or more. This is on a 6-core, 12-thread Intel i7 processor ("Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz, 3401 Mhz, 6 Core(s), 12 Logical Processor(s)") with 16 GB of RAM, so the fact that it takes so long is a bit confusing. The lower-resolution version gets a lot of false positives, and I think that it is simply too low-fidelity for my target data. The higher-resolution version is better, but it still has problems, and the insane training times are not viable for my purposes.

Is there something that I should be doing differently to train my classifier so that it doesn't take so long? I feel like I must be doing something wrong if it is taking this long.

Separately, once I have a trained classifier, I generally am getting lots of false positives while still not always detecting my target. This makes me think that my training data isn't good enough to properly identify my target.

My positive images are all cropped to similar aspect ratios so that just the target and a small amount of background are showing. The negatives are just full-size pictures of the environment with no cropping or segmentation. Is this what I should be doing? I am unfamiliar with how this classifier works internally, but I imagine that if it does some sort of comparison 1:1 of positives and negatives I might actually want to crop the negatives: even if the whole negative image does not look like my target object, a smaller section might. Is this guess correct? If not, what should I be doing to make my training results better? Is it simply a matter of getting more positives/negatives?

I was also thinking that the accuracy issues might be helped if I applied some sort of blur to both my training data and my input, but I am not sure if that is the right thing to do.

Finally, as I understand it, the classifier only operates on grayscale versions of the input images. Is there something I can do so that colors that look similar in grayscale are not confused?

Edit: My current sample creation / training command is included below. I have this wrapped as a PowerShell script in my project, but here I've inserted the values as they are evaluated. Note that I do not pass any buffer sizes. Also, I am currently running off of custom debug builds. I am aware that release builds will be faster and plan to do that the next time I ...


1 answer


answered 2016-04-18 04:42:38 -0500

Let's split this question up into useful parts. Let's start with the resolution of the training patches and the longer training time.

  • The latest version of the opencv_traincascade application reports the number of possible features for a given window size, which will shed light on your problem. The bottom line is that as your resolution increases, the number of features explodes: it grows roughly with the fourth power of the linear scale factor.
  • Considering LBP features, a 20x60 pixel resolution gives you 37,170 unique features. For each training sample and each negative window, these need to be calculated and stored in memory so that they can be evaluated.
  • A 100x300 resolution image gives you 24,667,500 unique features ... they take a hell of a lot of time to calculate and evaluate one by one to find the best-scoring one during the AdaBoost process. So in my opinion, 24 hours is extremely fast.
  • Just to compare with Haar wavelets, which yield even more features: a 100x300 model triggers a bad memory allocation error even on my 32GB RAM, 24-core machine ... so I am not even sure I want to calculate this.
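The counts above can be reproduced with a small sketch. It assumes that each LBP feature is a 3x3 grid of equally sized cells that must fit entirely inside the training window, and it counts every cell size and position; the function name `count_lbp_features` is made up for this illustration and is not part of OpenCV.

```python
def count_lbp_features(win_w, win_h):
    """Count candidate LBP features for a win_w x win_h training window.

    Assumption: a feature is a 3x3 grid of cells, each w x h pixels,
    placed at (x, y) so that the whole 3w x 3h grid fits in the window.
    """
    # For a cell width w there are (win_w - 3*w + 1) valid x positions.
    horizontal = sum(win_w - 3 * w + 1 for w in range(1, win_w // 3 + 1))
    # Same reasoning for cell height h and the y positions.
    vertical = sum(win_h - 3 * h + 1 for h in range(1, win_h // 3 + 1))
    # Horizontal and vertical choices are independent, so multiply.
    return horizontal * vertical


small = count_lbp_features(20, 60)
large = count_lbp_features(100, 300)
print(small, large, large / small)
```

Under that assumption the function gives 37,170 for 20x60 and 24,667,500 for 100x300, so scaling the window by 5 in each dimension multiplies the feature pool by more than 600.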

So no, it is not weird that it takes 24 hours to process that data. By the way, 24 hours is not insane. I have models that train for multiple days before returning a result...

What do you pass to the precalcValBufSize and precalcIdxBufSize parameters? Increasing those can already help a lot! But in my opinion your resolution is way too large!
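As a rough illustration of passing those buffer sizes, the sketch below assembles an opencv_traincascade argument list with both buffers set explicitly (values are in MB). All paths, sample counts, and the 2048 MB figure are placeholders chosen for this example, not recommendations taken from the question, and the helper name `build_traincascade_command` is hypothetical.

```python
def build_traincascade_command(data_dir, vec_file, bg_file,
                               num_pos, num_neg, width, height,
                               buf_mb=2048):
    """Return an argv list for opencv_traincascade with explicit buffers.

    buf_mb is an assumed value; pick it so that both buffers together
    fit comfortably in the RAM of the training machine.
    """
    return [
        "opencv_traincascade",
        "-data", data_dir,            # output directory for the cascade
        "-vec", vec_file,             # positives packed by opencv_createsamples
        "-bg", bg_file,               # text file listing negative images
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-featureType", "LBP",
        "-w", str(width),
        "-h", str(height),
        "-precalcValBufSize", str(buf_mb),  # feature value buffer, in MB
        "-precalcIdxBufSize", str(buf_mb),  # feature index buffer, in MB
    ]


# Placeholder file names and counts, for illustration only.
cmd = build_traincascade_command("classifier", "positives.vec",
                                 "negatives.txt", 60, 600, 20, 60)
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the placeholder paths point at real files.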

"lots of false positives while still not always detecting my target"

  • You need more positives, because you do not always detect your object of interest.
  • You need more negatives to reduce the number of false positives.

Finally, please add your complete training command and the output from the start of your training, because you may be misusing some of the parameters.


Comments

I have updated my question to include my command-line invocation. I can't run a training session right now to get the output for you, but I should be able to within the next few days. Some questions: How can I retrieve that info on the number of features? How many positives/negatives should I aim for? Should my negatives just be whole views, or should I focus on individual objects that might be misidentified (or are returning false positives)?

WasabiFan (2016-04-18 23:38:06 -0500)

@WasabiFan thank you for the command! I will take a look at it in a second. About your questions:

  • The number of features is reported automatically IF you take the latest master/2.4 branch from GitHub and build OpenCV yourself. I added this about three weeks ago.
  • The number of training samples depends on your application. I have setups with 200 positives and 500 negatives, but also setups with 1000 positives and 10000 negatives. I always start with a pos:neg ratio of 1:2 and then increase the samples depending on the output.
  • Start with whole images as a first run, then store the false positive detections and add those to the beginning of your negative set as hard negatives.
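The last step could be sketched as follows, assuming the negative set is the plain text file passed via -bg (one image path per line) and that the false positive detections have already been cropped and saved to disk. The helper name `prepend_hard_negatives` and the file layout are assumptions made for this illustration.

```python
from pathlib import Path


def prepend_hard_negatives(bg_file, hard_negative_paths):
    """Rewrite the negatives list so that hard negatives come first.

    bg_file: text file with one negative image path per line.
    hard_negative_paths: paths of saved false positive crops.
    Returns the merged list of entries, without duplicates.
    """
    bg_path = Path(bg_file)
    existing = []
    if bg_path.exists():
        # Keep the non-empty lines of the current negatives list.
        existing = [line.strip()
                    for line in bg_path.read_text().splitlines()
                    if line.strip()]
    merged = []
    seen = set()
    # Hard negatives first, then the original entries in their old order.
    for entry in list(hard_negative_paths) + existing:
        if entry not in seen:
            seen.add(entry)
            merged.append(entry)
    bg_path.write_text("\n".join(merged) + "\n")
    return merged
```

After rewriting the list, retrain with a correspondingly larger -numNeg so the added samples are actually used.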

Good luck!

StevenPuttemans (2016-04-19 03:54:05 -0500)

Looking at your command, depending on which OpenCV version you are using, the process will only use 256MB to at most 1024MB of RAM for each buffer. On a machine with 16GB of RAM, this leaves a huge amount of memory unused that could hold intermediate results. Increasing the buffer sizes will definitely help!

StevenPuttemans (2016-04-19 04:12:35 -0500)
