I'm fairly new to OpenCV, working on a fun personal project -- using a webcam + OpenCV on a Raspberry Pi to detect birds in my fig tree and trigger a scare device. But I'm hitting serious roadblocks trying to train a simple LBP cascade.

I have a simple test app which runs detectMultiScale on a few test images. Using the standard lbpcascade_frontalface.xml the test successfully detects faces within ~200 milliseconds, which is awesome.

Then I took 11 different photos of a square cookie cutter and cropped them to 24x24, obtained 20 random negative images cropped to 200x200, and created 12 test images at 320x240. See the images here.

I then trained the cascade.xml with:

opencv_traincascade -data data -vec vec -bg bg.txt -featureType LBP -w 24 -h 24


and ran my detection with:

bird_cascade.detectMultiScale(gray, birds, 1.1, 2, 0, Size(80, 80));
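
To make the scaleFactor and minimum-size arguments in that call concrete, here is a rough sketch that counts how many window sizes a sliding-window detector would scan. It only approximates the detector's scale loop (grow the training window by scaleFactor until it no longer fits the image, skipping sizes below the minimum). `countScales` and its parameters are names made up for this illustration, not part of OpenCV.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not an OpenCV function): counts the window sizes a
// sliding-window detector would visit, assuming the window grows by
// scaleFactor each step and sizes below the minimum are skipped.
int countScales(int imgW, int imgH, int winW, int winH,
                double scaleFactor, int minW, int minH) {
    assert(scaleFactor > 1.0);      // otherwise the loop would never end
    assert(winW > 0 && winH > 0);
    int count = 0;
    for (double factor = 1.0; ; factor *= scaleFactor) {
        const int w = static_cast<int>(std::lround(winW * factor));
        const int h = static_cast<int>(std::lround(winH * factor));
        if (w > imgW || h > imgH) break;      // window no longer fits the image
        if (w < minW || h < minH) continue;   // below the requested minimum size
        ++count;
    }
    return count;
}
```

With the values above (320x240 test images, a 24x24 training window, scaleFactor 1.1 and a minimum size of 80x80), `countScales(320, 240, 24, 24, 1.1, 80, 80)` counts 12 window sizes, versus 25 with no minimum size.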


Results:

• the xml file is 390KB, very large compared to lbpcascade_frontalface.xml which is 52KB. Why does this much smaller data set result in a much larger xml file?
• detectMultiScale takes ~42 seconds per image, roughly 200 times slower than face detection. Again, completely opposite of what I'd expect.
• detectMultiScale doesn't detect any of the star shapes, even on images which contain the exact shapes from the positives training set... rather it always appears to detect a single 104x104 match in the center of the image, regardless of the image :(

Things I've tried in training which haven't made any difference:

• specifying numPos 11 and numNeg 20
• specifying maxFalseAlarmRate of 0.95 (just stabbing in the dark here, haven't found any online docs that explain this very clearly)
• specifying numStages 10 (this actually reduced the xml file size to 220KB and detection time to 33 seconds, but that's still awful)
• using a single star image + opencv_createsamples to produce a vec file with 2000 positives, and supplying 100 negatives. This took over 3 hours to train, but the xml is 505KB and detection takes ~70 seconds, with no successes.
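
For some intuition on how numStages, minHitRate and maxFalseAlarmRate interact: as I understand it, those rates are per-stage targets, and a window has to pass every stage, so the whole-cascade rates are roughly the per-stage rate raised to the number of stages. A minimal sketch of that arithmetic, where `compoundedRate` is a hypothetical helper rather than anything from OpenCV:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: compounds a per-stage rate over numStages stages,
// assuming the stages behave independently (an idealisation).
double compoundedRate(double perStageRate, int numStages) {
    assert(perStageRate >= 0.0 && perStageRate <= 1.0);
    assert(numStages >= 0);
    return std::pow(perStageRate, numStages);
}
```

For example, a maxFalseAlarmRate of 0.95 over 10 stages compounds to roughly 0.60, which is a very loose target, while what I believe are the tool's defaults (0.5 over 20 stages) compound to under one in a million. Similarly, a minHitRate of 0.995 over 20 stages compounds to a bit over 0.90.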

As you can see, I've spent a lot of time but am getting nowhere. I've studied several online tutorials/references but none addresses the total failure I'm encountering.

Any of the following would be EXTREMELY helpful:

• specific insights as to why I'm getting complete failure in my case, and which bits to alter to achieve success.
• a thorough explanation of the LBP algorithm which would give some intuition for dialing in the mysterious training/detection parameters such as numPos, numStages, minHitRate, maxFalseAlarmRate, scaleFactor, minNeighbors (I'm having trouble making sense of this write-up by Maria Dimashova)
• pointer to a good starter tutorial for LBP training -- the standard docs are pretty weak
• I want to start small, but is it even possible to produce a 'test' cascade that doesn't require hundreds or thousands of positives & negatives?
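
For reference, this is the basic idea of a local binary pattern as I understand it: compare each of the 8 neighbours of a pixel with the centre and pack the results into an 8-bit code. The sketch below is the plain per-pixel version on a 3x3 patch. The cascade's LBP features, as far as I can tell, apply the same comparison to sums over rectangular blocks rather than single pixels, so treat this as an illustration only. `lbpCode` is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>

// Basic 8-neighbour LBP code for the centre pixel of a 3x3 grayscale patch.
// Hypothetical helper for illustration; not an OpenCV function.
int lbpCode(const int patch[3][3]) {
    // Neighbour positions (row, col), clockwise starting at the top-left corner.
    static const int rows[8] = {0, 0, 0, 1, 2, 2, 2, 1};
    static const int cols[8] = {0, 1, 2, 2, 2, 1, 0, 0};
    const int centre = patch[1][1];
    int code = 0;
    for (int i = 0; i < 8; ++i) {
        const int value = patch[rows[i]][cols[i]];
        assert(value >= 0 && value <= 255);  // expects 8-bit gray levels
        // Shift in a 1 bit when the neighbour is at least as bright as the centre.
        code = (code << 1) | (value >= centre ? 1 : 0);
    }
    return code;
}
```

A flat patch gives 255 (every neighbour ties with the centre), and a bright centre surrounded by darker pixels gives 0, so the code describes local texture rather than absolute brightness.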

Ken


My guess is that the data chosen for training is not suitable. The training is struggling with the variance in the data set: it keeps adding features to discriminate between the samples and ends up producing a BIG FAT cascade. Please note that this is only a guess.

And I really appreciate your idea. Nice work.

I would suggest that instead of building a very robust detector, you build one for the situation at hand. Take natural positives and negatives from the scene where it will be used. You can make a cascade with few images, but you won't be able to capture all possible 'test' cases with a small dataset.

One more thing: you can also try feature types other than LBP, such as HOG or HAAR, or keypoint-based approaches such as SIFT and SURF.

( 2013-07-12 06:52:22 -0500 )

If you need more help, contact me at samthedevil.sp@gmail.com. I can share my experience with cascade classification.

( 2013-07-12 06:55:17 -0500 )

Thanks Prasanna, based on your post I tried training a classifier with 10 frontal faces fairly well aligned (see http://goo.gl/udv7p)... but strangely I'm getting the same crazy results: a 380KB cascade.xml file, 30-40 seconds per call to detectMultiScale, and always one false positive result as a 104x104 square in the center of the image. I'll contact you offline, thanks for offering to assist!

( 2013-07-12 15:38:45 -0500 )