# Using the -w and -h parameters of the createsamples utility for cascade training


So I've come across lots of tutorials about OpenCV's haartraining and cascade training tools. In particular, I'm interested in training a car classifier using the createsamples tool, but there seem to be conflicting statements all over the place regarding the -w and -h parameters, so I'm confused. I'm referring to the command:

$ createsamples -info samples.dat -vec samples.vec -w 20 -h 20


I have the following three questions:

• I understand that the aspect ratio of the positive samples should match the aspect ratio implied by the -w and -h parameters above. But do ALL of the positive samples have to be the same size as well? E.g., I have close to 1000 images. Do all of them have to be the same size after cropping?

• If it is not the size but the aspect ratio that matters, how precisely must the aspect ratio of the positive samples match the -w and -h parameters passed to the OpenCV tools? I mean, is the classifier so sensitive that even a few pixels off here and there would affect its performance? Or is it safe to work with images as long as they are all approximately the same ratio by eye?

• I have already cropped several images to the same size. But in making them all the same size, some of them ended up with more background inside the bounding box than others, and some have slightly different margins. (For example, in the two images below, the bigger car takes up more of the image, while there is a wider margin around the smaller car.) Is a collection of images like this fine, or will it lower the accuracy of the classifier, meaning I should ensure tighter bounding boxes around all objects of interest (in this case, cars)?



Actually, the createsamples utility simply performs a blunt crop-and-resize to your specified -w and -h parameters, producing a vector of positive training samples that all have the same dimensions, no matter what the original size of the training data was.

If the aspect ratio of your original image was, for example, w/h = 2/1 and you pass a w/h ratio of 1/1 to the tool, then you can definitely see that the representation of your object will be wrong and heavily deformed: a car squashed together in width.


Also, these -w and -h parameters determine the aspect ratio of your detection window, which means the detector will only find cars with a 1/1 ratio and not normal cars with a 2/1 ratio.

I suggest changing your parameters to, for example, -w 40 -h 20, and the result should work fine :)
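As a quick sanity check before building the vec file, you can verify that each annotated bounding box roughly matches the chosen -w/-h aspect ratio. This is a minimal sketch, not part of the OpenCV tools; the box sizes and the 20% tolerance are assumptions for illustration:

```python
# Flag annotation boxes whose aspect ratio deviates too far from -w/-h.
# Hypothetical helper; box sizes and the 20% tolerance are assumptions.

def ratio_mismatches(boxes, w, h, tolerance=0.2):
    """Return the (width, height) boxes whose w/h ratio deviates from
    the target w/h by more than `tolerance` (relative)."""
    target = w / h
    bad = []
    for bw, bh in boxes:
        if abs(bw / bh - target) / target > tolerance:
            bad.append((bw, bh))
    return bad

boxes = [(80, 40), (100, 52), (60, 60)]  # last box is square, not 2:1
print(ratio_mismatches(boxes, w=40, h=20))  # → [(60, 60)]
```

Boxes that show up here will be visibly deformed by the crop-and-resize step, so either re-crop them or drop them from samples.dat.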


Hey StevenPuttermans, I have a similar question, but I'm detecting faces. I have a bunch of photographs and I'm stumped by the -w and -h requirements. The photographs contain faces of varying sizes, due to the distance between the camera and the face and also lens effects. How should I go about fitting the ratio for -w and -h? Do I need to crop each face individually? And what about the ratio if some faces are bigger than others?

( 2014-03-05 01:44:37 -0500 )

Are -w and -h in pixels?

( 2014-03-05 02:03:35 -0500 )

Both parameters are in pixels. As for differing ratios: you should define an average ratio and scale every sample to that one. This ensures that the slight deformations actually help improve your results, since no two faces have exactly identical dimensions. So what I always do is run through all the cropped images, take the average width and height, and use that to define the model size.
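The averaging step described above could be sketched like this. This is a minimal sketch under assumptions: the sample dimensions are made up, and the 24 px target width (a common model width) and the rounding choice are mine, not from the tool:

```python
# Derive -w/-h from the average aspect ratio of the cropped samples.
# Hypothetical helper; dims and target_w=24 are illustrative assumptions.

def model_size(dims, target_w=24):
    """Average the w/h ratios of the cropped samples, then pick a model
    height so that width = target_w preserves that average ratio."""
    avg_ratio = sum(w / h for w, h in dims) / len(dims)
    return target_w, round(target_w / avg_ratio)

dims = [(48, 48), (50, 46), (44, 50)]  # roughly square face crops
print(model_size(dims))  # → (24, 24)
```

The resulting pair would then be passed to createsamples as `-w 24 -h 24`.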

( 2014-03-10 07:16:34 -0500 )
