This may not be an answer as such, but it's too long for a comment, so here it goes.

CreateSamples

You can actually use this utility in four ways:

Creating distorted training samples - perspective transform, intensity variation, etc.
Creating undistorted training samples - just a rectangular crop, in case you already have plenty of samples
Creating annotated test samples - useful for testing the performance of your cascade after training
Viewing what's in a vec file (provided you know the size of the samples used to create it) - useful for verifying that it really contains what you want
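As a rough illustration of those four modes, here is a small Python sketch that only builds the command lines. All file names, counts and angle values are made up for the example; the flags are the ones I remember from the opencv_createsamples documentation, so check them against the usage text your build prints.

```python
# Hypothetical file names and values throughout; adjust to your own data.
W, H = 24, 24  # sample size; keep it identical across all calls


def createsamples_args(**flags):
    """Build an argv list for opencv_createsamples from keyword flags."""
    argv = ["opencv_createsamples"]
    for name, value in flags.items():
        argv += ["-" + name, str(value)]
    return argv


# 1. Distorted training samples: one object image pasted over backgrounds.
distorted = createsamples_args(img="object.png", bg="bg.txt", vec="object.vec",
                               num=500, maxxangle=0.5, maxyangle=0.5,
                               maxzangle=0.3, maxidev=40, w=W, h=H)

# 2. Undistorted samples: crops taken from an annotation file.
cropped = createsamples_args(info="info.dat", vec="cropped.vec",
                             num=1000, w=W, h=H)

# 3. Annotated test images: -img and -bg with -info (and no -vec) as output.
test_set = createsamples_args(img="object.png", bg="bg.txt",
                              info="test/info.dat", num=200, w=W, h=H)

# 4. Viewing a vec file: only -vec, -w and -h are given.
view = createsamples_args(vec="object.vec", w=W, h=H)

for argv in (distorted, cropped, test_set, view):
    print(" ".join(argv))
```

To actually run one of them, pass the list to subprocess.run(argv, check=True) once the files exist.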

Traincascade

It's pretty much covered in the OpenCV documentation. Let me point you to another link: one of the best and most detailed tutorials for HAAR training, which can be extended to LBP and HOG with ease, thanks to the OpenCV developers.

Regarding size, and to see why it blows up: open haarfeatures.cpp in $OpenCV_DIR/apps/traincascade/, look at "void CvHaarEvaluator::generateFeatures()", and try calculating how many "sums" you are creating per image. Add the size of the integral images to that as well. That should give you a fair idea of the memory requirement. (That's not the end of it; other things occupy memory too, but this is the major chunk.)
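To get a feel for the numbers, here is a minimal counting sketch. It assumes the five classic upright Haar shapes (2x1, 1x2, 3x1, 1x3, 2x2) at every scale and position, which is not exactly the set that generateFeatures() enumerates, and the "one 4-byte value per feature per sample" estimate is my simplification, so treat the result as an order of magnitude only.

```python
# Assumption: classic upright Haar shapes given as (base width, base height).
CLASSIC_SHAPES = ((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))


def count_features(win_w, win_h, shapes=CLASSIC_SHAPES):
    """Count every scaled and shifted copy of each shape inside the window."""
    total = 0
    for base_w, base_h in shapes:
        for w in range(base_w, win_w + 1, base_w):       # allowed widths
            for h in range(base_h, win_h + 1, base_h):   # allowed heights
                total += (win_w - w + 1) * (win_h - h + 1)  # positions
    return total


def feature_table_bytes(win_w, win_h, num_samples, bytes_per_value=4):
    """Rough size of a table holding one value per feature per sample."""
    return count_features(win_w, win_h) * num_samples * bytes_per_value


print(count_features(24, 24))                      # 162336
print(feature_table_bytes(24, 24, 1000) / 2**20)   # size in MiB for 1000 samples
print(count_features(533, 117))                    # the window size from the question
```

The count grows roughly with the square of the window area, which is why a 533 x 117 window is so much heavier than the usual 24 x 24.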

The problem is that it needs everything in memory to pick the best features within the window, i.e. those that discriminate between positives and negatives such that most of the positives are correctly classified (currentDetectionRate > minHitRate) and very few negatives are misclassified (currentFalseAlarmRate < maxFalseAlarmRate), with a threshold set so that all conditions are satisfied in that stage. It keeps adding features to the stage until the defined rates are achieved, then moves on to the next stage, and continues until the global requirements are met. That's the naive definition of how cascade training works. Training stops when an overall false alarm rate of maxFalseAlarmRate ^ numStages is achieved before the required number of stages, or when numStages is reached. (In your case, with numStages = 200 and maxFalseAlarmRate = 0.5, that target is roughly 6 x 10^-61.)
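The arithmetic behind that stopping target is easy to check. In this sketch the numStages and maxFalseAlarmRate values come from the question, while the minHitRate of 0.995 and the 1e-5 target are just values I picked for illustration.

```python
import math


def overall_rates(num_stages, min_hit_rate, max_false_alarm_rate):
    """Per-stage rates multiply across stages, so raise them to num_stages."""
    overall_hit = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm_rate ** num_stages
    return overall_hit, overall_false_alarm


def stages_needed(target_false_alarm, max_false_alarm_rate):
    """Smallest stage count whose overall false alarm rate meets the target."""
    return math.ceil(math.log(target_false_alarm) / math.log(max_false_alarm_rate))


hit, false_alarm = overall_rates(200, 0.995, 0.5)
print("overall false alarm: %.2e" % false_alarm)
print("log10 of it: %.1f" % math.log10(false_alarm))  # -60.2
print("overall hit rate: %.3f" % hit)
print(stages_needed(1e-5, 0.5))                       # 17
```

Note that the guaranteed overall hit rate shrinks with the stage count too, which is another reason 200 stages is far more than you want.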

And the following is obvious: you test your classifier. It slides a window of the defined size (533 x 117 in your case) over the image, looks at the locations defined by the stage classifier, and computes the HAAR features. If the result exceeds the stage threshold, the window is tested against the next stage; otherwise the window is rejected and it slides to the next position, and so on. It is all coded up in facedetect.cpp in samples/c; you just have to call it with the cascade name.
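The reject-early idea can be sketched in a few lines of plain Python. This is a toy, not what OpenCV does internally: the "features" are arbitrary functions of the window I made up, there is a single scale, and the image is a nested list.

```python
def passes_cascade(window, stages):
    """A window must clear every stage; the first failing stage rejects it."""
    for features, threshold in stages:
        score = sum(feature(window) for feature in features)
        if score < threshold:
            return False  # rejected early, later stages are never computed
    return True


def detect(image, win_w, win_h, stages, step=1):
    """Slide a fixed-size window over the image and collect accepted positions."""
    hits = []
    rows, cols = len(image), len(image[0])
    for y in range(0, rows - win_h + 1, step):
        for x in range(0, cols - win_w + 1, step):
            window = [row[x:x + win_w] for row in image[y:y + win_h]]
            if passes_cascade(window, stages):
                hits.append((x, y))
    return hits


# Toy data: a bright 2x2 patch in a dark 4x4 image.
image = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]


def total(window):
    return sum(sum(row) for row in window)


def top_left(window):
    return window[0][0]


stages = [([total], 30), ([top_left], 9)]  # hypothetical two-stage cascade
print(detect(image, 2, 2, stages))         # [(1, 1)]
```

Most windows die in the first cheap stage, which is what makes detection fast even though training is so heavy.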

Phew, that was long. Hope I didn't go wrong in the middle. (Please edit it if I have.)

Things to note: use a small sample size when creating samples. Try LBP and HOG too. Go through the tutorial above (read it completely at least once. Note: it is written for haartraining, the old buddy before traincascade, so make sure you account for the changes). Don't forget to manually check the parameters being set; you won't be warned if they are wrong. Check what it prints at least twice and see if it matches your requirements. (As far as I know, the vec file can't be info.dat, can it?) And don't forget to merge your vecs before training.

The pipeline I used:

Get distorted samples from the original images and generate a vec file for each image using the createsamples utility.
Merge the vec files into one (and check them if you want; the how-to is in the tutorial).
Get negatives and list them in bg.txt, using paths relative to the location of bg.txt.
Check and double-check the parameters passed to the traincascade utility.
Do something else useful while this runs.
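For the bg.txt step, a small helper along these lines does the job. The function name, the directory layout and the list of image extensions are my own assumptions, so adapt them to your setup.

```python
import os


def write_bg_file(bg_path, negatives_dir,
                  extensions=(".jpg", ".jpeg", ".png", ".bmp")):
    """List negative images in bg_path, one per line, relative to bg_path's folder."""
    base = os.path.dirname(os.path.abspath(bg_path))
    lines = []
    for root, _dirs, files in os.walk(negatives_dir):
        for name in files:
            if name.lower().endswith(extensions):  # skip non-image files
                full = os.path.abspath(os.path.join(root, name))
                lines.append(os.path.relpath(full, base))
    lines.sort()
    with open(bg_path, "w") as handle:
        handle.write("\n".join(lines) + ("\n" if lines else ""))
    return lines


if __name__ == "__main__":
    # Hypothetical layout: ./negatives/*.jpg next to ./bg.txt
    if os.path.isdir("negatives"):
        print(len(write_bg_file("bg.txt", "negatives")), "negatives listed")
```

Run it from the folder that holds bg.txt and pass that same bg.txt to traincascade, so the relative paths resolve the same way.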

Hope this helps.

Regards,

Prasanna S
