Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

About opencv_traincascade....

Hello,

I am currently trying to train my own cascade based on Naotoshi Seo's tutorial and Codin Robiin's tutorial. I still have a couple of questions that I haven't find answers.

I will have some objects to detect in a controlled scenario (it will always be in the same room and the objects will not vary). Therefore, my idea is to grab my camera and save frames for an certain amount of time of the room with NONE of the objects present to gather the negative images. Then I would get the objects I want to detect on a turning table (one at the time, of course...), set the camera on a tripod and for different heights choose a ROI surrounding the object plus choosing when I want to start and stop saving the image, I can make the object rotate. Thus, I would have several views of the same objects from different angles plus I can get the X, Y position plus the size of the bounding box and easily save the file with the path, number of objects in the scene plus these four parameters to create the .vec file.

My questions are:

  1. I should save the images as grey scale, right?
  2. Which resolution should I save the negative images? (My camera is 1280x1024) Original or resized to.....?
  3. Should I save the entire image or just the ROI for the positive images?

I'd like to test this because as a first approach I took a picture of an object with my phone, cropped it and removed the background image description (50x50 grey scale image) and with opencv_createsamples plus the negatives that I took as described before (saved as grey scale 100x100).

Then to got my positive samples for training I run:

opencv_createsamples -img mouse1Resized.png -bg bg.txt -info pos/info.lst -jpgoutput pos/ -maxxangle 0.5 -maxyangle -0.5 -maxzangle 0.5 -num 1690 - bgcolor - -bgthresh 0

where 1690 is the number of negative images that I captured. Then I create the vec file with:

opencv_createsamples -info pos/info.lst -num 1690 -w 20 -h 20 -vec positives.vecInfo file name: pos/info.lst

And start training with:

opencv_traincascade -data data -vec positives.vec -bg bg.txt -numPos 1400 -numNeg 700 -numStages 15 -w 20 -h 20

When this finished, I tied the detector and I got a LOT of false positives, even when the object were not in the scene.

So here are some more questions.

  1. Should the negatives be 100x100?
  2. Should the positive be 50x50?
  3. When I create the .vec file, how large can -w and -h be?

I would like to test best approaches to see which gives the best results....or based on your experience, which one should I follow?

Thanks for the help.

About opencv_traincascade....

Hello,

I am currently trying to train my own cascade based on Naotoshi Seo's tutorial and Codin Robiin's tutorial. I still have a couple of questions that I haven't find answers.

I will have some objects to detect in a controlled scenario (it will always be in the same room and the objects will not vary). Therefore, my idea is to grab my camera and save frames for an certain amount of time of the room with NONE of the objects present to gather the negative images. Then I would get the objects I want to detect on a turning table (one at the time, of course...), set the camera on a tripod and for different heights choose a ROI surrounding the object plus choosing when I want to start and stop saving the image, I can make the object rotate. Thus, I would have several views of the same objects from different angles plus I can get the X, Y position plus the size of the bounding box and easily save the file with the path, number of objects in the scene plus these four parameters to create the .vec file.

My questions are:

  1. I should save the images as grey scale, right?
  2. Which resolution should I save the negative images? (My camera is 1280x1024) Original or resized to.....?
  3. Should I save the entire image or just the ROI for the positive images?

I'd like to test this because as a first approach I took a picture of an object with my phone, cropped it and removed the background image description (50x50 grey scale image) and with opencv_createsamples plus the negatives that I took as described before (saved as grey scale 100x100).

Then to got my positive samples for training I run:

opencv_createsamples -img mouse1Resized.png -bg bg.txt -info pos/info.lst -jpgoutput pos/ -maxxangle 0.5 -maxyangle -0.5 -maxzangle 0.5 -num 1690 - bgcolor - -bgthresh 0

where 1690 is the number of negative images that I captured. Then I create the vec file with:

opencv_createsamples -info pos/info.lst -num 1690 -w 20 -h 20 -vec positives.vecInfo file name: pos/info.lst

And start training with:

opencv_traincascade -data data -vec positives.vec -bg bg.txt -numPos 1400 -numNeg 700 -numStages 15 -w 20 -h 20

When this finished, I tied the detector and I got a LOT of false positives, even when the object were not in the scene.

So here are some more questions.

  1. Should the negatives be 100x100?
  2. Should the positive be 50x50?
  3. When I create the .vec file, how large can -w and -h be?

I would like to test best approaches to see which gives the best results....or based on your experience, which one should I follow?

Thanks for the help.

EDIT 1:

This is the code I use for detections:

void detect(Mat frame, std::vector<Rect> &objects)
{
    int i, div=2;
    Mat frame_gray;
    resize(frame, frame_gray, Size(frame.cols/div,frame.rows/div));
    cvtColor(frame_gray, frame_gray, CV_BGR2GRAY);
    equalizeHist( frame_gray, frame_gray );
    detect_cascade.detectMultiScale(frame_gray, objects, 1.2, 2, 0|CV_HAAR_SCALE_IMAGE, Size(80, 80));

    for(i=0; i<objects.size(); i++)
    {
        objects[i].x = objects[i].x * div;    
        objects[i].y = objects[i].y * div;
        objects[i].width = objects[i].width * div;
        objects[i].height = objects[i].height * div;
    }
}