DefaultPeopleDetector: true pedestrian height

The HOGDescriptor::getDefaultPeopleDetector() classifier was trained on 64x128 training images, but how tall, in pixels, are the pedestrians in that training data?

Performance is important, and I know the approximate height of my pedestrians, so I am using HOGDescriptor::detect() (not detectMultiScale()).

First I resize my target area to the detector resolution (the resize coefficient is the ratio between the pedestrian height the detector expects and the pedestrian height in my data), and then I call the detect function.

code:

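// Measured pedestrian height in my video, in pixels.
double HeightPedestrian = 250.0;
// Scale so the pedestrian matches the 128-pixel detector window height.
double resizeRatio = 128.0 / HeightPedestrian;
resize(frame(RectWindow), frame_for_HOG, Size(), resizeRatio, resizeRatio);
HOGDescriptor hog;
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
// Single-scale detection: hitThreshold 0, winStride 8x8, padding 32x32.
hog.detect(frame_for_HOG, foundPoints, 0, Size(8,8), Size(32,32));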


The actual size of the persons in the Daimler images is about 75-80% of the image height (128 pixels), so if your ROI fits the person tightly, you need to keep more of the original image above the head and below the feet.

See the examples here, along with the full description of the data set:

http://www.gavrila.net/Research/Pedestrian_Detection/Daimler_Pedestrian_Benchmark_D/Daimler_Mono_Ped__Detection_Be/daimler_mono_ped__detection_be.html
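Applied to the resize step from the question, this could be sketched roughly as follows. The helper name resizeRatioFor and its personFraction parameter are hypothetical, introduced only for illustration; the 75-80% figure is the one quoted above. The sketch is plain C++ with no OpenCV dependency and only computes the scale factor that would then be passed to resize().

```cpp
#include <cassert>

// Hypothetical helper (not part of OpenCV): scale factor that maps a
// pedestrian of personHeightPx pixels onto the part of the detector
// window the person is assumed to occupy, i.e. personFraction of
// windowHeightPx.
double resizeRatioFor(double personHeightPx, double personFraction,
                      double windowHeightPx = 128.0) {
    assert(personHeightPx > 0.0);
    assert(personFraction > 0.0 && personFraction <= 1.0);
    return windowHeightPx * personFraction / personHeightPx;
}
```

For a 250-pixel pedestrian and a fraction of 0.75, this gives 96 / 250 = 0.384 instead of 128 / 250 = 0.512, so the resized person is about 96 pixels tall and leaves room for background inside the 128-pixel window.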


The detector size is actually the size of the training images. This means that all pedestrians were resized to a 64x128 window, so the height of your pedestrian should be around 128 pixels to use this model in a single-scale detection step.

As for the real height, in meters for example, it all depends on the resolution of your camera, i.e. how much distance a single pixel represents.


Sorry, by "true height" I meant the person's height in the picture, in pixels. I forgot about the real world, where "true height" would be measured in meters :-)

I noticed that when I increase HeightPedestrian from the real 250 pixels (measured in my video, shown on screen via cv::imshow, with A Ruler For Windows, http://www.arulerforwindows.com/) to 350, the quality of the detector improves. So I suspect that the numerator in resizeRatio should actually be lower than 128.0. Are you sure that all pedestrians were resized to a 64x128 window from head to heel, without sky and ground at the top and bottom of the image?

( 2014-01-17 08:56:49 -0500 )
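For reference, the arithmetic behind that observation can be written out. It only restates the numbers from the comment (250 measured pixels, 350 used in the code, a 128-pixel window); it does not by itself show why detection improved.

```latex
\[
  \text{resizeRatio} = \frac{128}{350} \approx 0.366,
  \qquad
  h_{\text{resized}} = 250 \cdot \frac{128}{350} \approx 91.4 \text{ px},
  \qquad
  \frac{h_{\text{resized}}}{128} = \frac{250}{350} \approx 0.71
\]
```

So with the value 350, the pedestrian occupies roughly 71% of the window height after resizing.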

Yes, I am sure. There are many universal data sets out there that can be used for this purpose, especially when focusing on pedestrians. Adding sky and ground information could lead to learning the background instead of the object, the person in this case. The main rule is to remove as much background information as possible before training :)

( 2014-01-17 09:08:41 -0500 )

Additional information: the HOG descriptor window included some background during training, so the pedestrian you want to detect (in a single-scale detection step) should be around 112 pixels high, while the other 16 pixels should contain background.

( 2017-06-09 11:39:16 -0500 )
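One way to act on the 112/16 split is to grow a tight pedestrian box before resizing it. The following is a minimal sketch under stated assumptions: Box is a stand-in for cv::Rect so the code has no OpenCV dependency, padVertically is a hypothetical helper, and splitting the background evenly above and below the person is an assumption that the comment does not state.

```cpp
#include <algorithm>
#include <cassert>

// Minimal stand-in for cv::Rect so the sketch has no OpenCV dependency.
struct Box { int x, y, width, height; };

// Hypothetical helper: grow a tight pedestrian box vertically so that the
// person would fill personPx out of windowPx rows after resizing. The extra
// background is split evenly above and below (an assumption) and the result
// is clamped to the image rows [0, imageRows).
Box padVertically(const Box& tight, int imageRows,
                  double personPx = 112.0, double windowPx = 128.0) {
    assert(tight.height > 0 && personPx > 0.0 && personPx <= windowPx);
    int padded = static_cast<int>(tight.height * windowPx / personPx + 0.5);
    int margin = (padded - tight.height) / 2;
    int top = std::max(0, tight.y - margin);
    int bottom = std::min(imageRows, tight.y + tight.height + margin);
    return Box{tight.x, top, tight.width, bottom - top};
}
```

A tight box 224 pixels high would become 256 pixels high with 16 pixels of background on each side; the padded box would then be resized with a ratio of 128.0 divided by its height before calling hog.detect().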