Train an LBP head detector using opencv_traincascade [closed]

asked 2017-09-10 12:10:36 -0500

hyder


I am trying to train an overhead head detector using LBP with opencv_traincascade. I extracted 50 positive images from a video and placed them in a folder p. I then created an annotation file, file.txt, for those 50 images using opencv_annotation. After that, I created a vector file, info.vec, using opencv_createsamples with -w 100 and -h 100.
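As a rough sketch of the sample-preparation step described above: an opencv_annotation file holds, per line, an image path, the number of boxes, and then x y w h for each box, so summing the second column gives a value for -num. The helper name count_objects is hypothetical, and the sketch assumes the image paths contain no spaces.

```shell
# Hypothetical helper: total number of annotated boxes in an
# opencv_annotation file (line format: path count x y w h ...).
# Assumes paths contain no whitespace.
count_objects() {
    awk '{ n += $2 } END { print n + 0 }' "$1"
}

# Only build the .vec file if the tool and the annotation file are present.
if command -v opencv_createsamples >/dev/null 2>&1 && [ -f file.txt ]; then
    opencv_createsamples -info file.txt -vec info.vec \
        -num "$(count_objects file.txt)" -w 100 -h 100
fi
```

If an image contains more than one annotated head, the count will be larger than the number of images, which is why it is taken from the file rather than typed in by hand.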

I have also extracted 100 negative images from the same video and placed them in folder img and listed their names in file bg.txt. Then I ran opencv_traincascade like this:

opencv_traincascade -data data -vec info.vec -bg bg.txt -precalcValBufSize 2048 -precalcIdxBufSize 2048 -numPos 50 -numNeg 100 -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -w 100 -h 100 -featureType LBP
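For reference, a background list such as bg.txt can be generated instead of typed by hand. The following is only a sketch: the helper name make_bg_list is hypothetical, and it assumes the negatives are .jpg or .png files and that bg.txt sits next to the img folder.

```shell
# Hypothetical helper: write one image path per line, the layout
# expected by the -bg option.
# Assumes the negatives are .jpg or .png files under the given folder.
make_bg_list() {
    find "$1" -type f \( -name '*.jpg' -o -name '*.png' \) | sort > "$2"
}

if [ -d img ]; then
    make_bg_list img bg.txt
    wc -l < bg.txt   # number of background images listed
fi
```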

But when I run that classifier on the same video, I get A LOT of false positives. Can someone please have a look at the attached datasets and tell me what I am doing wrong? Maybe the resolution of the positive and negative samples is wrong, or their ratio is inappropriate.

Thank you.


Closed for the following reason: the question is answered, right answer was accepted by sturkmen
close date 2020-11-02 10:39:44.032539


i have not seen your images, but i suspect, your project is not feasible at all.

to train cascades successfully, you need a rigid, distinctly textured pattern (like a stop sign). if you just view heads from above, all you get is a dark blob.

berak ( 2017-09-12 02:13:36 -0500 )

@berak, but there are cascade detectors for non-rigid objects (like faces). So shouldn't it work for other non-rigid objects like heads?

hyder ( 2017-09-12 02:40:21 -0500 )

faces are still "rigid" in the sense, that there is a nose in the middle, a mouth at the bottom, etc.

you don't have a reliable texture, also you cannot restrict the pose to a single orientation.

(then, ofc. 50 positives won't be enough, you need more like 500, also 100x100 window size is very large, the face cascades are 24x24)

berak ( 2017-09-12 03:08:59 -0500 )

None of the 50 positive samples is exactly 100x100. In fact, every one of them is larger than 100x100, and they do not all have the same resolution. One of them is 143x135, another is 155x160, and so on. Would that be a problem?

I have extracted them from a 640x480 video. Should I resize them all to 24x24? Wouldn't that disturb their aspect ratio?

hyder ( 2017-09-12 03:23:56 -0500 )

the size of the images does not matter (they get rescaled internally), so don't do anything.

what matters is the window size (-w and -h). remember that this is the minimal size that can be detected later.

but again, those details apart, you'll probably have to give up the idea of training a cascade for this.

berak ( 2017-09-12 03:49:47 -0500 )
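To make the window-size point concrete, here is a minimal sketch of how -w and -h values could be chosen so that they keep the average aspect ratio of the annotated boxes while staying small. The helper name suggest_window and the choice of 24 for the shorter side (taken from the face-cascade size mentioned earlier in the thread) are assumptions for illustration, not a recommendation from the OpenCV tools.

```shell
# Hypothetical helper: derive a small training window that keeps the mean
# aspect ratio of the annotated boxes. Reads "width height" pairs on stdin.
suggest_window() {
    # $1 = length of the shorter window side (e.g. 24)
    awk -v side="$1" '
        { w += $1; h += $2; n++ }
        END {
            if (n == 0) exit 1
            if (w <= h) { ww = side; hh = int(side * h / w + 0.5) }
            else        { hh = side; ww = int(side * w / h + 0.5) }
            printf "-w %d -h %d\n", ww, hh
        }'
}

# The two sample sizes quoted in the thread
printf '143 135\n155 160\n' | suggest_window 24
# prints: -w 24 -h 24
```

For the two sizes quoted above the boxes are close to square, so a square window loses almost nothing in aspect ratio. The same values would have to be passed to both opencv_createsamples and opencv_traincascade.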

@berak, what do you think would be a better approach if I want to detect heads from an overhead camera using a learning based technique?

hyder ( 2017-09-13 10:53:57 -0500 )