I hope somebody can explain to me how OpenCV's face detection classes, DetectionBasedTracker and CascadeClassifier, do their trick in face detection.
When we train the cascade classifier, we use 24 x 24 images. The resulting xml file then contains the rectangles for the LBP features, and those rectangles are expressed relative to the 24 x 24 training window.
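For concreteness, this is how I picture one of those rectangles being applied once a detection window is anchored at some (x, y) (the values and the names featureRect / rectInImage are invented for illustration, not taken from the xml):

#include <opencv2/core.hpp>

// Hypothetical LBP cell rectangle from the xml, stored relative to
// the 24 x 24 training window (values invented for illustration).
const cv::Rect featureRect(3, 5, 8, 6);

// With a detection window anchored at (x, y), the same rectangle is
// read at (x + 3, y + 5) of whatever image is currently being scanned.
cv::Rect rectInImage(int x, int y)
{
    return cv::Rect(x + featureRect.x, y + featureRect.y,
                    featureRect.width, featureRect.height);
}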
But when detection runs in the CascadeClassifier class, the features for those rectangles are calculated from resized versions of the input image (not from a 24 x 24 image), as shown below (line 1014 of cascadedetect.cpp).
// original image is scaled by a scale factor
if( !featureEvaluator->setImage( image, data.origWinSize ) )
    return false;
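If it helps to pin down my current understanding: I believe the detector never enlarges the 24 x 24 features; instead it repeatedly shrinks the image and always scans with the fixed training-size window. A minimal sketch of that pyramid idea, my own simplification rather than the actual cascadedetect.cpp code:

#include <opencv2/imgproc.hpp>

// Shrink the image step by step; a fixed 24x24 window on the shrunken
// image corresponds to a progressively larger region of the original.
void scanPyramid(const cv::Mat& original, double scaleFactor)
{
    const cv::Size winSize(24, 24);  // origWinSize from training

    for (double factor = 1.0; ; factor *= scaleFactor)
    {
        cv::Size scaled(cvRound(original.cols / factor),
                        cvRound(original.rows / factor));
        if (scaled.width < winSize.width || scaled.height < winSize.height)
            break;  // image is now smaller than the training window

        cv::Mat image;
        cv::resize(original, image, scaled, 0, 0, cv::INTER_LINEAR);

        // featureEvaluator->setImage(image, winSize) would be called here;
        // a 24x24 hit at (x, y) would map back to a (24*factor) x (24*factor)
        // face at (x*factor, y*factor) in the original image.
    }
}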
Then the whole resized image is processed: for each (x, y) position, the 24 x 24 window anchored there is evaluated.
for( int y = y1; y < y2; y += yStep )
{
    for( int x = 0; x < processingRectSize.width; x += yStep )
    {
    }
}
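Spelled out, I read that scan roughly as follows (only x, y, yStep and the loop bounds come from the real code; sumStep and the comments are my own guesses):

#include <cstddef>

// Sketch: every (x, y) is a candidate window origin, and the only
// per-window state is a single offset into the integral image.
void scanWindows(int y1, int y2, int width, int yStep, std::size_t sumStep)
{
    for (int y = y1; y < y2; y += yStep)
        for (int x = 0; x < width; x += yStep)
        {
            std::size_t offset = y * sumStep + x;  // one number per window
            // the whole cascade would be evaluated for the 24 x 24 window
            // at (x, y), reading every feature through this offset
            (void)offset;
        }
}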
Then an offset value is used to relate a feature's rectangle to its (x, y) position in the resized image.
#define CALC_SUM_(p0, p1, p2, p3, offset) \
    ((p0)[offset] - (p1)[offset] - (p2)[offset] + (p3)[offset])
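And this is the integral-image arithmetic I believe the macro performs: the four corner pointers p0..p3 are set up once per feature, relative to the window origin, and the offset simply translates all four to the current window (integralRectSum and the variable names are mine, not OpenCV's):

#include <cstddef>
#include <opencv2/core.hpp>

#define CALC_SUM_(p0, p1, p2, p3, offset) \
    ((p0)[offset] - (p1)[offset] - (p2)[offset] + (p3)[offset])

// Sum of pixels inside r for the detection window whose top-left corner
// lies `offset` elements from the start of the integral image `sum`
// (sum = CV_32S integral image, one row/column larger than the source).
int integralRectSum(const cv::Mat& sum, cv::Rect r, int offset)
{
    const int* ptr = sum.ptr<int>(0);
    std::size_t step = sum.step / sizeof(int);

    // Four corner pointers, fixed once per feature relative to the
    // window origin: top-left, top-right, bottom-left, bottom-right.
    const int* p0 = ptr + r.y * step + r.x;
    const int* p1 = ptr + r.y * step + (r.x + r.width);
    const int* p2 = ptr + (r.y + r.height) * step + r.x;
    const int* p3 = ptr + (r.y + r.height) * step + (r.x + r.width);

    return CALC_SUM_(p0, p1, p2, p3, offset);
}

If my reading is right, integralRectSum(sum, featureRect, y * step + x) would yield the pixel sum of featureRect translated to the window at (x, y), so the rectangles themselves never get rescaled.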
What I don't understand:
- Even though the feature rectangles in the xml file were learned from 24 x 24 images during training, in real detection the features are calculated from those rectangles on resized images, with an offset applied. How does this offset value do the trick?
- My thinking is that if the feature rectangles were learned from 24 x 24 images in training, detection should also use 24 x 24 images. What is the trick that lets the features be calculated from these rectangles on resized images?

Thanks