
How to train classifiers the best way possible

asked 2016-07-21 07:51:15 -0500 by eTicket

updated 2016-07-21 15:03:29 -0500


I am trying to train a cascade classifier with opencv_traincascade. There are a few tutorials and posts about that, but all of them seem to be either outdated or contradictory. I am also interested in doing things the best way possible, not just getting it to work. I am going to tell you what I have learned so far, and what is unclear to me.

Acquiring positive samples

There are two methods:

  • Using a single image (or a few) and OpenCV's createsamples utility to generate a lot of samples from it. In the case of more than one image, the resulting *.vec files are merged with a special utility.

  • Using multiple annotated images and feeding them to createsamples with the -info option. This does not create more samples, it just extracts them.

The first approach seems to be used by a lot of people, mainly because you don't need annotated images and can create a large number of samples from just a few images. But I've read that the first approach is really bad and one should use the second approach, because it provides real-world samples that are more likely to match the input images in a real application. Some people say that taking a smaller number of real-world images is far better than using a really large number of generated artificial samples.

I currently have about 300 annotated samples at hand. I could directly feed them to createsamples (with -info).

  • 300 samples are not a huge amount, but they are "real world" samples. Referring to the statements above, I might not need more. By how much are real-world samples actually better than generated ones? Are 300 samples enough?
  • Otherwise, would it make sense to create about 10 artificial samples per real-world sample using createsamples and merge those *.vec files? That way I would have 3000 samples.
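For the -info route mentioned above, opencv_createsamples expects an annotation file with one line per image: the path, the number of objects, and one x y w h box per object. As a minimal sketch (the image paths and box coordinates below are made up for illustration), such a file can be generated like this:

```python
# Sketch: write an annotation "info" file in the format that
# opencv_createsamples -info expects:
#   <path> <count> <x> <y> <w> <h> [x y w h ...]
# The paths and boxes here are hypothetical example data.

annotations = {
    "img/egg_001.jpg": [(48, 30, 40, 40)],                    # one object
    "img/egg_002.jpg": [(10, 12, 40, 40), (90, 55, 40, 40)],  # two objects
}

def write_info_file(path, annotations):
    with open(path, "w") as f:
        for image, boxes in sorted(annotations.items()):
            coords = " ".join(f"{x} {y} {w} {h}" for x, y, w, h in boxes)
            f.write(f"{image} {len(boxes)} {coords}\n")

write_info_file("positives.txt", annotations)
```

The resulting positives.txt is what you would pass to createsamples via -info to pack the 300 real-world samples into a *.vec file.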

Acquiring negative samples

Most people use random images from the internet as negatives for training. But I read through this article, which suggests a better approach:

Creating negatives from the backgrounds of the positives is much more “natural” and will give far better results, than using a wild list of background images taken from the Internet.


I would avoid letting the OpenCV training algorithm create all the negative windows. To do that, create the background images at the final training size so that it cannot subsample but can only take the entire negative image as a negative.

If I understand this correctly, the author suggests taking the images containing the object and extracting negative samples from the regions around the object, also making sure those extracted negatives have the same size as the training window.
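The extraction idea described above can be sketched with plain window arithmetic: slide a window of the training size over the image and keep only positions that do not overlap the annotated object. The image size, object box, and 40x40 window below are made-up example values, not anything from the article:

```python
# Sketch: enumerate training-size windows that avoid the annotated object,
# so they can be cropped out as "natural" negatives. All boxes are (x, y, w, h).

def overlaps(a, b):
    """Axis-aligned overlap test for two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def negative_windows(img_w, img_h, object_box, win=40, stride=40):
    """Yield (x, y, win, win) windows that do not touch the object."""
    for y in range(0, img_h - win + 1, stride):
        for x in range(0, img_w - win + 1, stride):
            if not overlaps((x, y, win, win), object_box):
                yield (x, y, win, win)

# Example: a 160x120 image with the object at (60, 40, 40, 40)
windows = list(negative_windows(160, 120, (60, 40, 40, 40)))
```

Using a stride equal to the window size gives non-overlapping crops; a smaller stride would give more (overlapping) negatives from the same backgrounds.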

  • Does that make sense? I looked at his git repository and his negatives were a lot larger than the training window. Am I missing something?

  • Also, should I extract regions with the window size out of the images, or could I use larger portions with the same aspect ratio and scale ...



Maybe @StevenPuttemans can evaluate your parameters; he has more experience with traincascade.

Pedro Batista (2016-07-21 12:09:01 -0500)

@Pedro Batista, thank you for calling me to this topic :) About the parameters, I will take a look and add more info below your answer if needed!

StevenPuttemans (2016-07-22 05:27:17 -0500)

1 answer


answered 2016-07-21 09:40:24 -0500

updated 2016-07-21 09:59:38 -0500

Whether to use the createsamples utility or not depends on your problem.

OpenCV generates artificial samples by rotating, scaling and applying other transformations to the positive samples. This can be useful if the object you are trying to find is a rigid one on fairly predictable backgrounds (like logos on websites, for example).

If the object you are trying to find is complex, such as a face, person or car in complex (real-life) environments, the createsamples utility is pretty useless and you should acquire all the samples from real examples.

As for negative examples, it should be fairly obvious that using negative images (images that contain no positive samples) taken from the kind of background the classifier will operate on will improve its accuracy. For example, if you are building a pedestrian detector to work in an urban setting, you'll want to add negative examples from urban scenarios with no pedestrians.

There is no obvious rule for the positive/negative ratio. Normally I add around twice as many negative samples as positive ones, but never fewer than that. This makes sense because any classifier will give far more negative responses than positive ones in a common scenario. It is also much easier to add negative samples to your cascade, since you only need to supply negative images and define how many negative samples the cascade should use; OpenCV will do the rest.

As to why the training isn't finishing, there might be a lot of different reasons. You might be choosing too big a detection window, which will require so much memory that the program crashes, or you may be requiring too high a success rate in the training parameters. Also, there is a maximum number of negative examples that depends on the number of positive samples, so try reducing the number of negative samples in your parameters. And tell us the rest of the parameters you are using to train the cascade.



First of all, thank you for your answer. I have edited my question and added my training parameters. Could you tell me if my way of extracting the negative images is OK? As I've written, I have a piece of software that extracts them directly out of the positive images. The negative regions I am selecting have the same size as the object in the image. After extraction, my software scales them down to the training window size, so the images in negatives.txt all have the dimensions 40x40. Also, given the complexity of my fried egg example, could you guess how many images would be needed? How do I know if I have too few? I am going to test your suggestion to decrease the number of negatives; I'll get back when I've tried that.

eTicket (2016-07-21 12:13:39 -0500)

For negative images you don't need to do such operations. Just add paths to full images at their normal size to your negatives.txt (making sure the object you are trying to find is not present), say 400 or 500 of them. Then, if you tell traincascade to use 2000 negative examples, it will go over those 400/500 images and select 2000 negative windows from them with the appropriate dimensions, get it?

Pedro Batista (2016-07-21 12:31:48 -0500)
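The negatives.txt workflow described in the comment above is simple to automate: list the full-size background images and let opencv_traincascade do the subsampling. A minimal sketch (the directory name and extension set are assumptions, not anything prescribed by OpenCV):

```python
# Sketch: build a negatives.txt listing full-size background images,
# one path per line, as opencv_traincascade expects via -bg.
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set

def write_negatives_list(neg_dir, out_file="negatives.txt"):
    """List image files in neg_dir into out_file; return how many were found."""
    paths = sorted(p for p in Path(neg_dir).iterdir()
                   if p.suffix.lower() in IMAGE_EXTENSIONS)
    Path(out_file).write_text("".join(f"{p}\n" for p in paths))
    return len(paths)

# Usage (hypothetical directory):
#   n = write_negatives_list("negatives/")
#   then pass negatives.txt to opencv_traincascade with -bg negatives.txt
```

No cropping or scaling is needed here; the trainer selects windows of the training size from the full images itself.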

@PedroBatista Yes, I know. I am just referring to the article, which suggests extracting the negatives from the images where the object is present. That (in theory) is better because the negatives will be more related to the positives. I am particularly unsure about the scaling operation, though. OpenCV scales the positive samples to the training size too, so for the negatives I thought extracting unrelated regions with the same size as the object and scaling those down to the training size makes sense. It's not that it's impossible for me to find negative images with similar backgrounds, but it's actually harder than you might think.

eTicket (2016-07-21 12:47:06 -0500)

Extracting the negatives beforehand can be tricky. You need a small extra border around the model to make it able to select a window; this is due to the internal rounding of the pyramid scale. However, I am not exactly sure how many pixels of border you need. In my opinion, sampling them yourself is overkill, and I always avoid doing so. To grab meaningful negatives, run a basic detector on a negative set and collect the hits; those are called hard negatives and will be better to train on. As for parameter selection, please read all the topics in this forum, they have been explained about a thousand times! :D

StevenPuttemans (2016-07-22 05:32:00 -0500)
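The hard-negative mining step mentioned in the comment above can be sketched as follows. The detector here is a stand-in stub with made-up data; in practice you would run your partially trained cascade (e.g. via cv2.CascadeClassifier.detectMultiScale) over object-free images:

```python
# Sketch of hard-negative mining: any detection on an image known to
# contain no objects is a false positive, i.e. a "hard negative" worth
# adding to the negative training set.

def detect_stub(image_id):
    """Placeholder detector; pretends to fire on some windows (x, y, w, h)."""
    fake_hits = {"street_01": [(12, 8, 40, 40)], "street_02": []}
    return fake_hits.get(image_id, [])

def mine_hard_negatives(negative_image_ids, detector):
    """Collect every (image, box) the detector fires on in object-free images."""
    hard = []
    for image_id in negative_image_ids:
        for box in detector(image_id):
            hard.append((image_id, box))
    return hard

hard = mine_hard_negatives(["street_01", "street_02"], detect_stub)
```

The collected windows would then be cropped out and added as negatives before retraining, which focuses the next training round on exactly the backgrounds the current cascade confuses with the object.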

And if you do not want to roam the forum, I wrote a complete chapter on this in OpenCV 3 Blueprints! An excellent reference work on cascade classifier training using OpenCV, in my opinion.

StevenPuttemans (2016-07-22 05:33:12 -0500)

@StevenPuttemans thank you, I will look into your reference; it looks really useful at first glance! Regarding the negative window extraction, I am going to do it the normal way then. That way the training works as expected.

eTicket (2016-07-23 11:49:32 -0500)

Seen: 5,064 times