Hi, I'm trying to implement a pedestrian detection algorithm based on this paper.
Currently I'm training detectors for human body parts. Using the HOG detector in OpenCV, I detected pedestrians in the INRIA dataset and cropped the detections. So now I have upper_body, head_shoulder, full_body, left_body, and right_body classes. For example, after detecting a pedestrian with the HOG detector, I did this to get a left_body sample from the current detection:
left_body = img[y+pad_h:y+h-pad_h, x+pad_w:x+int(0.6*w)-pad_w]
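For context, the whole cropping step looks roughly like this (the HOG parameters, pad values, and paths here are placeholders, not necessarily the exact ones I used):

import cv2

# Rough sketch of the cropping pipeline; parameters are illustrative.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread('inria_sample.png')
rects, _ = hog.detectMultiScale(img, winStride=(8, 8), padding=(8, 8), scale=1.05)

pad_w, pad_h = 8, 8  # trim the detection box a little to cut background
for i, (x, y, w, h) in enumerate(rects):
    # left 60% of the pedestrian box, minus padding, becomes a left_body sample
    left_body = img[y + pad_h:y + h - pad_h, x + pad_w:x + int(0.6 * w) - pad_w]
    cv2.imwrite('positive_images/left_body/left_body{}.png'.format(i), left_body)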
I trained a cascade classifier using LBP features on 1000 positives and 1000 negatives (currently only for the head_shoulder class). When I run the detectMultiScale method of the CascadeClassifier object on high-resolution pictures, I get inaccurate detections; if I downscale the image first, the detections improve. I'm not sure exactly how detectMultiScale works. I know it downsamples the image and runs a sliding window over each scale.
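To check my understanding, here is my mental model of the scaling in code (a sketch of the idea, not what OpenCV literally does internally):

import cv2

def pyramid_levels(img, scale_factor=1.3, window=(48, 48)):
    """Shrink the image step by step; the detector's 48x48 window stays
    fixed, so larger pedestrians match at smaller pyramid levels."""
    factor = 1.0
    while True:
        w = int(img.shape[1] / factor)
        h = int(img.shape[0] / factor)
        if w < window[0] or h < window[1]:
            break  # the fixed window no longer fits, stop
        yield factor, cv2.resize(img, (w, h))
        factor *= scale_factor

# A hit at pyramid factor f maps back to a (48*f)x(48*f) box in the original
# image, which (as far as I can tell) is why nothing smaller than the 48x48
# training window can ever be detected.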
I trained the cascade classifier with positives listed like this:
./positive_images/head_shoulder/head_shoulder869.png 1 0 0 48 48
./positive_images/head_shoulder/head_shoulder1605.png 1 0 0 48 48
./positive_images/head_shoulder/head_shoulder2030.png 1 0 0 48 48
Each positive image is the object itself, and all images are the same size. How does detectMultiScale find detections at smaller scales if I trained the detector with a 48x48 window?
I run the detectMultiScale method with these parameters:
self.cascade.detectMultiScale(img, scaleFactor=1.3, minNeighbors=4, minSize=(30, 30), flags=cv2.CASCADE_SCALE_IMAGE)
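Since downscaling first gives me better results, my current workaround looks like this (the 0.5 factor and filenames are just what I experimented with):

import cv2

cascade = cv2.CascadeClassifier('head_shoulder_lbp.xml')  # placeholder filename
img = cv2.imread('high_res_frame.png')                    # placeholder input

shrink = 0.5  # the value that happened to work for me
small = cv2.resize(img, None, fx=shrink, fy=shrink)
rects = cascade.detectMultiScale(small, scaleFactor=1.3, minNeighbors=4,
                                 minSize=(30, 30),
                                 flags=cv2.CASCADE_SCALE_IMAGE)
# map the detections back to original-resolution coordinates
rects = [(int(x / shrink), int(y / shrink), int(w / shrink), int(h / shrink))
         for (x, y, w, h) in rects]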
Also, if anyone is acquainted with the paper I linked, I would very much appreciate any opinions.
Thank you