You should fully grasp the concepts behind cascade classifiers before making assumptions about those parameters. There is a specific reason why the default values are set the way they are.

Cascade classifiers combine weak but fast classifiers into one stronger classifier that is still fast. These classifiers are run with a sliding window approach at multiple scales. As you can imagine, a single 1000x1000 image combined with a 24x24 pixel model (as in the case of a face) can yield an enormous number of sliding windows that need to be classified.
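
To get a feeling for that number, here is a minimal sketch that counts the windows over an image pyramid. The scale factor of 1.1 and the step of 1 pixel are assumptions for illustration, and `count_windows` is a hypothetical helper, not an OpenCV function; a real detector may step and rescale differently.

```python
def count_windows(image_w, image_h, model_w, model_h, scale_factor=1.1, step=1):
    """Count sliding-window positions over a simple image pyramid."""
    total = 0
    scale = 1.0
    while True:
        # Size of the image after shrinking it by the current scale.
        w = int(image_w / scale)
        h = int(image_h / scale)
        if w < model_w or h < model_h:
            break  # the model no longer fits, stop building the pyramid
        # Window positions at this pyramid level (horizontal * vertical).
        total += ((w - model_w) // step + 1) * ((h - model_h) // step + 1)
        scale *= scale_factor
    return total


# The 1000x1000 image and 24x24 model from the text above.
n_windows = count_windows(1000, 1000, 24, 24)
print(n_windows)
```

Under these assumptions the count is well over a million windows for a single image, which is why rejecting windows cheaply matters so much.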

This leads to the main goal of cascade classifiers, which is two-fold:

  • Try to remove as many windows from the evaluation process as early as possible, and thus reduce the processing time for a single image.
  • If a window continues down the cascade, try to use as few feature evaluations as possible while still ensuring maximum accuracy.

Keeping all this in the back of your head, we can now see why the default values were chosen.

  • minHitRate is the parameter that ensures that our positive training data yields at least a decent detection output at each stage. We do not want to lower this value too much. For example, a value of 0.8 would mean that 20% of our positive object training data can be misclassified, which would be a disaster. Allowing 1% misclassification is a common choice in research.
  • maxFalseAlarmRate defines how many features need to be added. We want each weak classifier to have a very good hit rate on the positives, and then allow it to remove negative windows as fast as possible while still doing better than random guessing. A value of 0.5 corresponds to a random guess; doing better than that means you successfully reject negative windows very early using only a few feature evaluations, leaving the remaining negatives to be discarded by the following stages.

Since this follows the waterfall principle, each negative has a chance of early rejection. This ensures that not all model features have to be evaluated for every window, which would make execution take much longer.
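
The early rejection idea can be sketched in a few lines. This is only a toy illustration: `cascade_classify` and the lambda "stages" are hypothetical stand-ins, and a real stage would evaluate trained features on image pixels rather than test a plain number.

```python
def cascade_classify(window, stages):
    """Return (accepted, stages_evaluated) for a single window.

    `stages` is a list of functions that return True to pass the window
    on to the next stage, or False to reject it.
    """
    for index, stage in enumerate(stages, start=1):
        if not stage(window):
            # Rejected: none of the later stages are evaluated.
            return False, index
    return True, len(stages)


# Toy stages working on a toy "window" (just a number).
toy_stages = [
    lambda w: w > 0,
    lambda w: w % 2 == 0,
    lambda w: w > 100,
]

rejected_early = cascade_classify(-5, toy_stages)  # stops at stage 1
accepted = cascade_classify(102, toy_stages)       # passes all 3 stages
```

Only the windows that survive every stage pay the full evaluation cost; everything else exits at the first stage that rejects it.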

So basically, setting your maxFalseAlarmRate too low will yield larger weak classifiers, and thus MORE features to be evaluated in more windows before an initial decision is made on each window. Since this cost grows exponentially, it makes sense not to lower this value too much.
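
A rough way to reason about this trade-off is to estimate the expected number of feature evaluations for a negative window. The sketch below assumes each stage passes a fixed fraction of the negatives that reach it; the feature counts per stage (5 and 50) are made-up numbers for illustration only, not measurements from a trained model.

```python
def expected_feature_evaluations(features_per_stage, pass_rates):
    """Expected features evaluated for one negative window.

    features_per_stage[i] is the number of features in stage i,
    pass_rates[i] is the fraction of negatives that stage i lets through.
    """
    expected = 0.0
    reach_probability = 1.0  # every window reaches the first stage
    for n_features, pass_rate in zip(features_per_stage, pass_rates):
        expected += reach_probability * n_features
        reach_probability *= pass_rate
    return expected


# Hypothetical cascades with 20 stages each.
small_stages = expected_feature_evaluations([5] * 20, [0.5] * 20)
large_stages = expected_feature_evaluations([50] * 20, [0.1] * 20)
print(small_stages, large_stages)
```

With these assumed numbers, the cascade with small stages evaluates far fewer features per negative window, because every window has to pay for the whole first stage before anything can be rejected.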

The number of stages determines your overall performance. In the end you want to know how well the complete cascade does on the negative set (to avoid false positive detections) without giving up accuracy on the true positive training samples.
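
As a rough illustration of how the number of stages sets the overall performance: if the stages are treated as independent, the per-stage rates simply multiply. The values 0.995, 0.5 and 20 used below are assumed to match the usual opencv_traincascade defaults (check your own version), and `overall_rates` is a hypothetical helper.

```python
def overall_rates(min_hit_rate, max_false_alarm_rate, num_stages):
    """Estimate cascade-wide rates from per-stage rates.

    Assumes the stages behave independently, so the rates multiply.
    """
    overall_hit_rate = min_hit_rate ** num_stages
    overall_false_alarm_rate = max_false_alarm_rate ** num_stages
    return overall_hit_rate, overall_false_alarm_rate


hit, false_alarm = overall_rates(0.995, 0.5, 20)
print(f"overall hit rate >= {hit:.4f}")
print(f"overall false alarm rate <= {false_alarm:.2e}")
```

Under the same assumption, a per-stage hit rate of 0.8 over 20 stages would keep only roughly 1% of the positives, which shows why minHitRate has to stay so close to 1.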

Got it?