Train an LBP head detector using opencv_traincascade [closed]
Hello,
I am trying to train an overhead head detector using LBP with the help of opencv_traincascade. I have extracted 50 positive images from a video and placed them in a folder p. Then I created an annotation file file.txt for those 50 positive images using opencv_annotation. After that, I created a vector file info.vec using opencv_createsamples with -w 100 and -h 100.
I have also extracted 100 negative images from the same video, placed them in folder img, and listed their names in the file bg.txt. Then I ran opencv_traincascade like this:
opencv_traincascade -data data -vec info.vec -bg bg.txt -precalcValBufSize 2048 -precalcIdxBufSize 2048 -numPos 50 -numNeg 100 -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -w 100 -h 100 -featureType LBP
But when I run the resulting classifier on the same video, I get a lot of false positives. Can someone please have a look at the attached datasets (https://www.dropbox.com/s/av44rjxplni...) and tell me what I am doing wrong? Maybe the resolutions of the positive and negative samples are wrong, or their ratio is inappropriate.
Thank you.
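As a side note on the setup above: the bg.txt negatives list is just one image path per line, so it can be generated rather than typed by hand. A minimal sketch (the helper name is mine, and it assumes the negatives sit in a folder like the img one described above):

```python
import os
import tempfile

def write_bg(neg_dir, out_path):
    """Write one image path per line to out_path, the format bg.txt needs."""
    exts = (".jpg", ".jpeg", ".png", ".bmp")
    names = sorted(n for n in os.listdir(neg_dir) if n.lower().endswith(exts))
    with open(out_path, "w") as f:
        for n in names:
            f.write(os.path.join(neg_dir, n) + "\n")
    return len(names)

# Demo on a throwaway folder standing in for "img" (empty files, just names).
demo_dir = tempfile.mkdtemp()
for i in range(3):
    open(os.path.join(demo_dir, "neg_%03d.jpg" % i), "w").close()
bg_path = os.path.join(demo_dir, "bg.txt")
print(write_bg(demo_dir, bg_path))  # 3
```

The returned count is handy as a sanity check against the value you pass to -numNeg.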
I have not seen your images, but I suspect your project is not feasible at all.
To train cascades successfully, you need a rigid, distinctly textured pattern (like a stop sign). If you just view heads from above, all you get is a dark blob.
@berak, but there are cascade detectors for non-rigid objects (like faces). So shouldn't it work for other non-rigid objects like heads?
Faces are still "rigid" in the sense that there is a nose in the middle, a mouth at the bottom, etc.
You don't have a reliable texture, and you also cannot restrict the pose to a single orientation.
(Then, of course, 50 positives won't be enough; you need more like 500. Also, a 100x100 window size is very large; the face cascades are 24x24.)
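On the sample-count point, there is also a purely numerical issue with the command above. A widely cited rule of thumb for opencv_traincascade is that the .vec file needs roughly numPos + (numStages - 1) * (1 - minHitRate) * numPos samples, because each stage can consume a few extra positives that earlier stages misclassified. A quick check under that assumption (the function name is mine, and this is a rough sketch, not the exact traincascade bookkeeping):

```python
import math

def min_vec_samples(num_pos, num_stages, min_hit_rate):
    """Rule-of-thumb minimum number of samples the .vec file should hold."""
    # Each of the later stages may reject up to (1 - minHitRate) of the
    # positives, so spare samples beyond -numPos are needed.
    spare = (num_stages - 1) * (1.0 - min_hit_rate) * num_pos
    return math.ceil(num_pos + spare)

# The thread's settings: -numPos 50 -numStages 20 -minHitRate 0.999
print(min_vec_samples(50, 20, 0.999))  # 51
```

So a vec file built from exactly 50 positives with -numPos 50 is already one sample short by this estimate, independent of the deeper problem that 50 positives give far too little variety.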
None of the 50 positive samples is exactly 100x100. In fact, every one of them is larger than 100x100, and they are not all the same resolution: one is 143x135, another is 155x160, and so on. Would that be a problem?
I extracted them from a 640x480 video. Should I resize them all to 24x24? Wouldn't that disturb their aspect ratio?
The size of the images does not matter (they get rescaled internally), so you don't need to do anything.
What matters is the window size (-w and -h). Remember that this is the minimal size that can be detected later.
But again, those details apart, you'll probably have to give up the idea of training a cascade for this.
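If one did want to pick -w and -h to match the crops rather than forcing a square window, a small sketch of the arithmetic (the sample sizes are the two mentioned in the thread; the helper name is mine):

```python
def window_from_samples(sizes, base=24):
    """Pick a (w, h) training window: fix w = base, derive h from the
    average aspect ratio of the positive crops."""
    ratios = [w / h for w, h in sizes]
    mean_ratio = sum(ratios) / len(ratios)
    return base, max(1, round(base / mean_ratio))

# Crops mentioned above: 143x135 and 155x160 -> roughly square on average.
print(window_from_samples([(143, 135), (155, 160)]))  # (24, 24)
```

For these crops the average aspect ratio is close to 1, so a square 24x24 window would not meaningfully distort them; the internal rescaling does the rest.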
@berak, what do you think would be a better approach if I want to detect heads from an overhead camera using a learning based technique?