Cascade training - negative images

asked 2014-09-13 18:31:51 -0500

_chris

updated 2014-09-15 02:58:25 -0500

Using the opencv_createsamples application I can create an arbitrary number of positive images from a small set (whilst not always the best option, it does save time). My first question is: ideally, what should the ratio be between positive and negative images (including the generated samples), and does that ratio affect the run time of opencv_traincascade? I'm currently running a trainer with POS count = 1000, NEG count acceptanceRatio 750:1.

My other questions are about the quality and size of negative images. Is there any advantage to having high-resolution (several MB) negatives if the ultimate application of the detector is a Kinect camera? Also, does it matter what the negative images are? I just used random holiday photos for my first test and the quality of my detector wasn't brilliant.

EDIT: The application is ultimately to detect coffee cups of various sizes in a crowded environment with varying lighting, in order to do some 2D-3D mapping etc. As I understand it, since the cups all have rigid, distinct geometric features with little variance, I don't need to create many distorted images with create_samples. I think it would be better to have a very wide negative library to cover the varied lighting.

Another question - how well should the positives be cropped?



It's a rather broad question, so I can only give you a broad answer. It all depends on what you are trying to achieve. So can you describe your application? Where do you want it to work, what are the possible backgrounds, ...? I have good models with a POS:NEG sample ratio of 1000:5000, but also with 250:10, so it really depends on what you are trying to achieve...

StevenPuttemans ( 2014-09-14 06:54:15 -0500 )

I had the same question and I learned that, as Steven says, it depends on your application. The negative images should be representative of your application, so if you want to detect faces in holiday pictures, then holiday pictures (without faces) should be a good choice. If you are using a Kinect, indoor pictures might be more representative. Also, as I understand it, traincascade cuts its actual negative samples out of your negative images by subsampling them. However, if I understand the source code correctly, it starts subsampling with a window size equal to the window size of your positive images, so higher-resolution negatives might be a disadvantage: the first windows will cover only very small regions of the overall image.
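To make the window-size point concrete, here is a rough Python sketch of the arithmetic only. It is not the traincascade sampling code. The 24x24 window, the 4000x3000 photo and the 640x480 frame are assumed example sizes, and `count_windows` and `window_coverage` are hypothetical helper names.

```python
def count_windows(img_w, img_h, win_w, win_h, step=1):
    """Count the win_w x win_h windows that fit in an image at a single scale."""
    if img_w < win_w or img_h < win_h:
        return 0  # the image is smaller than the window, nothing fits
    nx = (img_w - win_w) // step + 1
    ny = (img_h - win_h) // step + 1
    return nx * ny


def window_coverage(img_w, img_h, win_w, win_h):
    """Fraction of the image area covered by a single window."""
    return (win_w * win_h) / (img_w * img_h)


if __name__ == "__main__":
    win_w, win_h = 24, 24  # assumed positive sample size (the -w / -h values)
    examples = {
        "high-resolution photo": (4000, 3000),  # assumed size
        "camera-sized frame": (640, 480),       # assumed size
    }
    for name, (w, h) in examples.items():
        n = count_windows(w, h, win_w, win_h)
        cov = window_coverage(w, h, win_w, win_h)
        print(f"{name}: {n} windows, each covering {cov:.4%} of the image")
```

At native resolution, a window this small on a multi-megapixel photo mostly sees texture rather than recognisable background structure. If that is a concern, downscaling the negatives to roughly the resolution of the target camera before training is one option.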

aKzenT ( 2014-09-14 09:46:07 -0500 )

Okay, I'll edit with more detail. Thanks, guys.

_chris ( 2014-09-15 02:55:34 -0500 )