
Traincascade implementation doubts

asked 2013-01-13 11:08:36 -0600

jncor

updated 2013-01-13 11:10:13 -0600

Hey all,

I have worked with cascade classifiers in OpenCV for some time. To train a classifier, I have been using opencv_haartraining because of the tutorials I found along the way (tutorial, faq). I know that the code is barely maintained and has been declared obsolete, but it has served its purpose until now. I have also read about the algorithm (Viola and Jones) to get more insight into it and understand its behaviour.

I am tempted to switch to the new version (opencv_traincascade) because of its TBB implementation. I get a speedup of 4-6 times running the same training with both tools, i.e. a classifier trained under the same conditions takes 5-7 hours with haartraining and 50 minutes with traincascade (with TBB). Since I am interested in training multiple classifiers, traincascade is obviously the more appealing solution.

However, I have been looking into the parameters, the code and the algorithm, and I have found differences that left me with some doubts:

  • maxdepth: what exactly does this mean during training? Is it equivalent to the old parameter -split or -maxtreesplits?
  • In traincascade\cascadeclassifier.cpp (CvCascadeClassifier::train), tempLeafFARate is set to acceptanceRatio (from CvCascadeClassifier::updateTrainingSet), which is computed as #False Alarms (FA) divided by #negatives consumed. It is then compared with the minimum required FA rate (which is normally computed as FA_rate^#stages). Is this the way it should be? I understand that if we scan the negative dataset sequentially and the detector consumes a lot of negative examples (i.e. it does not find faces in the negative set), then its FA rate is decreasing, but I always thought the rate was calculated from the FA rate attained in training at each stage. For instance, in a simple example with 2 stages, stage 1 finishes training with:

| N | HR | FA |

.......

| 18| 0.999473| 0.48153|

and stage 2

....

| 16| 0.999472| 0.442773|

the final FA rate would be 0.48153 * 0.442773 = 0.21320848269 (based on the formula FA_rate^#stages).

  • Why does haartraining\cvhaartraining use rand() to select the next image

    data->last = rand() % data->count;

    data->last %= data->count;

    while traincascade does not? There, last is simply incremented (cvCascadeImageReader::NegReader::nextImg()).

  • In traincascade\boost.cpp (setData), the random number generator is assigned with rng = &cv::theRNG(); but where is it used? And if it is used, shouldn't the user have access to a parameter like -seed?
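To make the second point concrete, here is a minimal sketch of the two estimates being compared. It is not the OpenCV code: productOfStageRates and acceptanceRatio are hypothetical helper names, and the second one only mirrors the description above (false alarms divided by negatives consumed).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): overall false-alarm rate
// obtained by multiplying the per-stage FA values from the training tables.
double productOfStageRates(const std::vector<double>& stageFaRates) {
    double overall = 1.0;
    for (double rate : stageFaRates) {
        overall *= rate;
    }
    return overall;
}

// Hypothetical helper (not part of OpenCV): empirical rate in the spirit of
// acceptanceRatio, i.e. negatives still accepted by the current cascade
// divided by all negative windows consumed while collecting them.
double acceptanceRatio(std::size_t falseAlarms, std::size_t negativesConsumed) {
    assert(negativesConsumed > 0 && falseAlarms <= negativesConsumed);
    return static_cast<double>(falseAlarms) /
           static_cast<double>(negativesConsumed);
}
```

With the two stages above, productOfStageRates({0.48153, 0.442773}) gives roughly 0.2132, while acceptanceRatio depends on how many negative windows had to be scanned, so the two numbers need not agree.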

Thanks in advance


1 answer


answered 2013-04-18 03:41:37 -0600

cheekycheetah

Hello,

I have also been making changes to traincascade for quite some time now. I have mainly worked on new features that could be used by AdaBoost for object detection or recognition.

maxdepth: The AdaBoost algorithm is a "meta-classifier": it sums the confidence that successive "weak classifiers" assign to the sample. In OpenCV the weak classifiers are decision trees. maxdepth is simply the maximum depth of each tree, so a larger value makes each weak classifier stronger in training, but the ensemble also has a higher risk of over-fitting the data. It has absolutely no connection to -maxtreesplits, which is related to the final number of leaves in the final cascade classifier.

For the sake of clarity:

[cascade/Tree] -> nodes[AdaboostClassifier] -> WeakClassifiers[DecisionTrees] -> nodes[FeatureIdx]

In the original traincascade you cannot build a tree in the sense of the first [] (cascade/Tree).
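A minimal sketch of that hierarchy for a plain (non-tree) cascade, assuming a simplified layout. The struct and function names (TreeNode, WeakTree, Stage, makeStump, ...) are hypothetical and are not OpenCV's classes. Each stage sums the outputs of its weak trees and compares the sum with a stage threshold; maxdepth would bound how many levels each WeakTree may have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One node of a weak decision tree (hypothetical layout, not OpenCV's).
struct TreeNode {
    bool isLeaf;
    int featureIdx;    // feature tested at an internal node
    double threshold;  // go left if feature value < threshold
    int left;          // child indices into WeakTree::nodes
    int right;
    double value;      // confidence returned at a leaf
};

struct WeakTree {
    std::vector<TreeNode> nodes;  // nodes[0] is the root
};

struct Stage {
    std::vector<WeakTree> trees;  // boosted weak classifiers
    double threshold;             // stage passes if the sum reaches this
};

// Build a depth-1 tree (a stump): one test, two leaves.
WeakTree makeStump(int featureIdx, double threshold,
                   double leftValue, double rightValue) {
    WeakTree tree;
    tree.nodes.push_back(TreeNode{false, featureIdx, threshold, 1, 2, 0.0});
    tree.nodes.push_back(TreeNode{true, -1, 0.0, -1, -1, leftValue});
    tree.nodes.push_back(TreeNode{true, -1, 0.0, -1, -1, rightValue});
    return tree;
}

// Walk from the root to a leaf and return that leaf's confidence.
double evalTree(const WeakTree& tree, const std::vector<double>& features) {
    assert(!tree.nodes.empty());
    std::size_t idx = 0;
    while (!tree.nodes[idx].isLeaf) {
        const TreeNode& n = tree.nodes[idx];
        const double v = features[static_cast<std::size_t>(n.featureIdx)];
        idx = static_cast<std::size_t>(v < n.threshold ? n.left : n.right);
    }
    return tree.nodes[idx].value;
}

// A stage sums the confidences of all its weak trees.
bool stagePasses(const Stage& stage, const std::vector<double>& features) {
    double sum = 0.0;
    for (const WeakTree& tree : stage.trees) {
        sum += evalTree(tree, features);
    }
    return sum >= stage.threshold;
}

// A sample is accepted only if every stage passes; the first failure rejects.
bool cascadeAccepts(const std::vector<Stage>& cascade,
                    const std::vector<double>& features) {
    for (const Stage& stage : cascade) {
        if (!stagePasses(stage, features)) {
            return false;
        }
    }
    return true;
}
```

With stumps, every weak classifier looks at a single feature; allowing deeper trees lets one weak classifier combine several features, which is where the extra fitting power (and the over-fitting risk) comes from.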

tempLeafFARate: with this implementation (it is similar in the C version), we check the false alarm rate on the testing set, not on the training set. The theoretical formula FA_rate^#stages does not hold in practice after some stages; in fact the rate shows asymptotic behavior.

As for image reading, I also think the CvCascadeImageReader is quite basic, but it does the job. The risk is that the false positive samples of the first stages are almost all the same, so using randomness is indeed better. Another possibility is to record every possible FP and randomly sample X% of that database without replacement.
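The last idea could look roughly like the sketch below. It is only an illustration under stated assumptions: sampleFalsePositives is a hypothetical helper, the recorded false positives are represented by their indices, and the explicit seed argument stands in for the kind of -seed parameter asked about in the question (traincascade has no such helper).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of OpenCV): pick `fraction` of the recorded
// false-positive indices uniformly at random, without replacement.
// A fixed seed makes the selection reproducible between runs.
std::vector<std::size_t> sampleFalsePositives(std::size_t recordedCount,
                                              double fraction,
                                              unsigned seed) {
    assert(fraction >= 0.0 && fraction <= 1.0);

    // Indices 0 .. recordedCount-1, one per recorded false positive.
    std::vector<std::size_t> indices(recordedCount);
    std::iota(indices.begin(), indices.end(), std::size_t{0});

    // Shuffle, then keep the first part: no index can appear twice.
    std::mt19937 rng(seed);
    std::shuffle(indices.begin(), indices.end(), rng);

    const std::size_t keep = static_cast<std::size_t>(
        fraction * static_cast<double>(recordedCount));
    indices.resize(keep);
    return indices;
}
```

Calling sampleFalsePositives(1000, 0.25, 42u) returns 250 distinct indices, and the same seed gives the same selection again, which makes training runs comparable.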

