# opencv_traincascade Parameters explanation, image sizes etc

Hello guys, I have been trying for a long time now to train a decent classifier and to get reliable results from an object detection script. I was following this tutorial: http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html (which, by the way, was very helpful), but during the process a lot of questions arose. I will try to ask all of them now, and I think the answers will help a lot of people, not just me.

1. Negative and positive images:

a. How many negatives and positive images should I have?

b. Should positives be more than negatives? If yes what is the best proportion between negatives and positives?

c. Is there a preferable format for the pictures (bmp, jpg, png etc)?

d. What should be the size of the negative pictures, and what should be the size of the positive images? Let's say my negative images are 640x320 and the "to be detected" object is 100x50. Should all the images in the negatives folder be 640x320? Should the positives folder contain 640x320 images with the object visible in them? Or should I place 100x50 images containing only the object in the positives folder?

e. When cropping positive images, should I clear everything from the background? Or should I just use a rectangle around the object, including some of the surrounding background?

f. I tried to use the "famous" imageclipper program, with no luck. Has anyone gotten it to work? Is there any walk-through tutorial for installing this program?

g. opencv_createsamples: Is it necessary? How many samples should I use? About -w and -h: I read in a lot of tutorials online that these should be proportional to the real images. So should all of my positive images have exactly the same size? If my positive images are 100x50 and I use the parameters -w 50 -h 25, will the images be cropped or shrunk? Will this affect the training and, finally, the detection procedure?
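On the -w/-h question: opencv_createsamples rescales each annotated region to the model size rather than cropping it, so what mainly matters is keeping the aspect ratio of your objects. A tiny sketch (the helper name is made up for illustration):

```python
# Hypothetical helper: check that the model size (-w, -h) has the same
# aspect ratio as the annotated objects, so samples are scaled, not distorted.
def keeps_aspect_ratio(obj_w, obj_h, model_w, model_h):
    # cross-multiplication avoids floating-point division
    return obj_w * model_h == obj_h * model_w

print(keeps_aspect_ratio(100, 50, 50, 25))  # True: both are 2:1
print(keeps_aspect_ratio(100, 50, 24, 24))  # False: 2:1 vs 1:1
```

If the ratios differ, the samples get squashed or stretched during resizing, which hurts both training and detection.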

2. opencv_traincascade: Below are all the parameters:

```
-vec <vec_file_name>
-bg <background_file_name>
[-numPos <number_of_positive_samples = 2000>]
[-numNeg <number_of_negative_samples = 1000>]
[-numStages <number_of_stages = 20>]
[-precalcValBufSize <precalculated_vals_buffer_size_in_Mb = 256>]
[-precalcIdxBufSize <precalculated_idxs_buffer_size_in_Mb = 256>]
[-baseFormatSave]
[-stageType <BOOST(default)>]
[-featureType <{HAAR(default), LBP, HOG}>]
[-w <sampleWidth = 24>]
[-h <sampleHeight = 24>]
--boostParams--
[-bt <{DAB, RAB, LB, GAB(default)}>]
[-minHitRate <min_hit_rate = 0.995>]
[-maxFalseAlarmRate <max_false_alarm_rate = 0.5>]
[-weightTrimRate <weight_trim_rate = 0.95>]
[-maxDepth <max_depth_of_weak_tree = 1>]
[-maxWeakCount <max_weak_tree_count = 100>]
--haarFeatureParams--
[-mode <BASIC(default) | CORE | ALL>]
--lbpFeatureParams--
--HOGFeatureParams--
```
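To make the list concrete, a typical invocation might look like the sketch below. All paths and values are placeholders, not recommendations; note also that the required -data parameter (the output folder for the trained cascade) is not shown in the list above:

```shell
# Hypothetical invocation sketch; paths and counts are placeholders.
opencv_traincascade -data classifier/ \
    -vec samples.vec -bg negatives.txt \
    -numPos 1000 -numNeg 2000 -numStages 20 \
    -w 50 -h 25 -featureType HAAR -mode ALL \
    -minHitRate 0.995 -maxFalseAlarmRate 0.5
```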

Can anyone explain all of these: what they do, and how they affect the training and the detection?

3. Training:

During training I am getting this:

```
===== TRAINING 0-stage =====
POS count : consumed   400 : 400
NEG count : acceptanceRatio    1444 : 1
Precalculation time: 12
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1|        1|        1|
+----+---------+---------+
|   2|        1|        1|
+----+---------+---------+
|   3|        1| 0.454986|
+----+---------+---------+
```

Training until now has taken 0 days 0 hours 20 minutes 11 seconds.

Can anyone explain this table and all the other information?
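As a side note on how the per-stage HR/FA targets relate to the whole cascade: the hit rate and false alarm rate compound multiplicatively across stages. A quick sketch, assuming the default -minHitRate 0.995 and -maxFalseAlarmRate 0.5 over 20 stages:

```python
# The overall rates of a trained cascade are (approximately) the product
# of the per-stage targets. Defaults assumed: minHitRate = 0.995,
# maxFalseAlarmRate = 0.5, numStages = 20.
min_hit_rate = 0.995
max_false_alarm = 0.5
num_stages = 20

overall_hit = min_hit_rate ** num_stages    # worst-case overall hit rate
overall_fa = max_false_alarm ** num_stages  # worst-case overall false alarm

print(f"overall hit rate    >= {overall_hit:.4f}")   # roughly 0.90
print(f"overall false alarm <= {overall_fa:.2e}")    # roughly 1e-06
```

This is why each stage only needs to reject half the negatives: twenty mediocre stages multiply into a very low overall false alarm rate.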

4. After training: I have trained my classifier for 5 stages, and was able to find some objects in an image (with a lot ...


I think you should start with the search button on train cascade, classifier training, parameters and so on... this topic has been discussed about a thousand times with nice and long explanations (including from myself). There is no use in repeating it over and over again...

(2014-08-12 05:57:49 -0500)

And start reading the results of this search query. Once you have read everything, which may well take you a few hours, you will fully grasp everything.

(2014-08-12 06:01:41 -0500)

@tomnjerry, are you the topic starter? If so, it is somewhat unclear to me what you edited and which questions still stand. Can you please provide a clear update in the comments?

(2015-04-10 02:38:56 -0500)

Just a few minor edits. A few sentences in the problem were not clear, which made the question obscure. Also, in a few places the wording was incorrect; for instance, under "1. Negatives positive images", point b read "Should negatives be more than negatives? If yes what is the best proportion between it and negative?". Such minor changes were made. No changes to the problem statement!

(2015-04-10 03:30:57 -0500)

Did you read the links I provided? I will get back to this later today with some explanations.

(2015-04-10 04:37:30 -0500)


How many negatives and positive images should I have?

• It all depends on your application, so start by clearly defining what your application area is.
• Things to consider here are the amount of variation between object instances, whether you have a known/static background or not, whether your lighting is controlled or not, ...
• But a general rule that works for a variety of setups is to take a numPos:numNeg ratio of 1:2.
• Keep in mind that increasing the number of positives will increase the generalization of your model: it will look for better general features and will be less likely to overfit on your training data. Increasing the number of negative images is needed to remove the large amount of false positive detections. You want to supply as many situations as possible that do not look like your object here! Generally speaking, people supply thousands of samples here to try to model the general background noise of the application.
• Always keep in mind to carefully select your training data. It is better to use 100 good, naturally occurring samples than to take 1 good image and transform it 100 times with the available tools.
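One extra note on picking -numPos, since it trips many people up: each stage can consume a few extra positives (the ones earlier stages misclassify), so the .vec file needs more samples than -numPos. A rough calculator based on the formula commonly quoted on the OpenCV Q&A forum; treat it as an approximation:

```python
# Approximate minimum number of samples the .vec file must contain,
# per the formula often quoted on the OpenCV Q&A forum:
#   numPos + (numStages - 1) * (1 - minHitRate) * numPos
# (the extra samples cover positives "consumed" by earlier stages).
def min_vec_samples(num_pos, num_stages=20, min_hit_rate=0.995):
    return num_pos + (num_stages - 1) * (1 - min_hit_rate) * num_pos

print(min_vec_samples(1000))  # roughly 1095 with the default 20 stages
```

In practice people leave a safety margin on top of this, since the number of consumed samples per stage varies.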

Should positives be more than negatives? If yes what is the best proportion between negatives and positives?

As said above, that kind of depends on your application. A ratio of 1:2 is good, but I have applications where the ratio is 100:1 and applications where it is 1:100. It all makes sense once you know what training does for your application.

Is there a preferable format for the pictures (bmp, jpg, png etc)?

Actually there is no general rule for this, but I always suggest that users pick a file format without lossy compression, like PNG. This ensures that you do not incorporate compression artefacts as actual features of the object. This is especially the case when resizing your training data.

What should be the size of negative pictures and what should be the size of positive images?

Let's say my negative images are 640x320, and the "to be detected" object is 100x50. Should all the images in the negatives folder be 640x320? Should the positives folder contain 640x320 images with the object visible? Or should I place 100x50 images with the object only in the positives folder?

In your positives folder you keep images that contain the objects. Your positives.txt file is then formatted as `image_location number_objects x1 y1 w1 h1 x2 y2 w2 h2 ... xN yN wN hN`. This means that those regions will be cut out by the create samples tool and then resized to the model dimensions given by the -w and -h parameters of the tool. For your negative images, just supply a folder with tons of images that are larger than the model size you selected. During training, negative windows get sampled from those larger images, so at training time the number of -numNeg windows can thus be quite a bit larger ...
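As an illustration of that annotation format, a minimal sketch that writes such a positives.txt file (the file names and boxes here are made-up placeholders):

```python
# Minimal sketch: write a positives annotation file in the format
#   image_location number_objects x1 y1 w1 h1 ... xN yN wN hN
# using placeholder image paths and bounding boxes.
annotations = {
    "img/object1.png": [(10, 20, 100, 50)],
    "img/object2.png": [(5, 5, 100, 50), (200, 80, 100, 50)],
}

with open("positives.txt", "w") as f:
    for path, boxes in annotations.items():
        coords = " ".join(f"{x} {y} {w} {h}" for x, y, w, h in boxes)
        f.write(f"{path} {len(boxes)} {coords}\n")
```

A file like this is what you pass to opencv_createsamples via its -info parameter to build the .vec file.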

