Method of creating positive images
Hi there,
I am training a cascade with opencv_traincascade to detect bird tails. After reading the OpenCV manual, I created the positive and negative images as described below.
I generated about 4500 negative images from a video. All of them had the same size, ~ 1080x1920, and I converted them to 680x480 using an image size converter.
I cropped the positive objects manually and got 1052 images with sizes such as 230x433, 133x522, 233x355... Those sizes are quite different, so I used an image size converter to make all positive images 92x112.
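One way to see what a fixed-size resize does to crops of differing shapes is to compare aspect ratios before and after. The sketch below uses only the three crop sizes and the 92x112 target quoted above, and assumes they are written as width x height. The helper name `stretch_factor` is made up for illustration.

```python
# Sizes quoted in the post, assumed to be (width, height) in pixels.
crops = [(230, 433), (133, 522), (233, 355)]
target = (92, 112)


def stretch_factor(size, target):
    """Return how much the width is stretched relative to the height
    when `size` is resized to `target` without preserving aspect ratio.
    A value of 1.0 means no distortion. (Hypothetical helper.)"""
    w, h = size
    tw, th = target
    return (tw / w) / (th / h)


for size in crops:
    factor = stretch_factor(size, target)
    print(size, "->", target, "stretch:", round(factor, 2))
```

A factor far from 1.0 means the object ends up visibly squashed or stretched in the training sample. Cropping all positives to a common aspect ratio before resizing is one possible way to avoid that, whichever tool does the resizing.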
After that I created the positives.txt and negatives.txt files and the vector file, then ran opencv_traincascade to train the cascade. So far it has taken 4 days, and the training is only at stage 3.
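For reference, here is a minimal sketch of how the two description files could be written. It assumes the usual layout: the background file lists one image path per line, and the positives file lists a path, an object count, and an x y w h box per object. Since the positives here are already cropped, the box is assumed to cover the whole image. The function names and the example paths are hypothetical.

```python
from pathlib import Path


def write_negatives(paths, out_file):
    """Write the background description file: one image path per line."""
    Path(out_file).write_text("".join(f"{p}\n" for p in paths))


def write_positives(items, out_file):
    """Write the positives description file.

    items: iterable of (path, width, height) for already-cropped positives.
    Each line is: <path> <object count> <x> <y> <w> <h>, with the box
    covering the full image because the crop contains only the object.
    """
    lines = [f"{p} 1 0 0 {w} {h}\n" for p, w, h in items]
    Path(out_file).write_text("".join(lines))
```

For example, `write_positives([("pos/0001.jpg", 92, 112)], "positives.txt")` would produce the single line `pos/0001.jpg 1 0 0 92 112`.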
Am I doing something wrong in creating the positive and negative images? Is it OK to use an image converter such as ALSee to resize the images?
Any feedback? Thank you in advance.
Tim.
I'm not surprised it is taking long. It makes sense, since your detection window (92x112) is quite big and you are adding a lot of negative samples. Traincascade collects a number of negative samples in each stage from your negative images. As the stages advance, it has to scan more and more negative windows to find enough samples that the earlier stages still misclassify, so meeting your requirements takes longer and longer.
If you are requiring a high hit rate and a low false alarm rate, it may be that traincascade is not able to advance. Please provide further details of your training parameters.
Thanks, Batista.
You are right, I was using too many negative images. I changed the detection window size to 20x20 and the speed improved: 20 stages took 10 minutes with 360 positive images and 900 negative images.
The parameters:
opencv_traincascade -data caswork
-vec veccaswork.vec
-bg negcaswork.dat
-numPos 360
-numNeg 900
-numStages 20
-precalcValBufSize 2048
-precalcIdxBufSize 2048
-baseFormatSave
-minHitRate 0.999
-maxFalseAlarmRate 0.5
-w 20 -h 20
-mode ALL
Is it possible to detect an object of size 92x112 if I use a 20x20 detection window?
Yes, it is, because at detection time the image is downscaled multiple times and searched at many different scales. This means your object will shrink down to small sizes even if it occupies the whole image.
Face detectors are usually trained with a 20x20 detection window, but they still detect faces that are close to the camera :)
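The multi-scale search can be illustrated with a little arithmetic. The sketch below is a simplification, not OpenCV's actual implementation: it assumes a scale factor of 1.1 between pyramid levels and the 680x480 frame size mentioned earlier, and ignores rounding and any minimum or maximum size limits. `window_sizes` is a made-up helper name.

```python
def window_sizes(frame_w, frame_h, win=20, scale_factor=1.1):
    """Yield the side length, in original-frame pixels, that a win x win
    detection window covers at each pyramid level (simplified model)."""
    scale = 1.0
    while win * scale <= min(frame_w, frame_h):
        yield win * scale
        scale *= scale_factor


sizes = list(window_sizes(680, 480))
# Levels at which the 20x20 window spans roughly 92 to 112 original pixels.
covering = [round(s) for s in sizes if 92 <= s <= 112]
print(len(sizes), "levels; window sizes near the object:", covering)
```

Under these assumptions, a 20x20 window covers regions of about 100 pixels in the original frame at some pyramid levels, which is why a 92x112 object can still be found.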
Thanks, Batista, I understand. I can tell you have read a lot about face detection and that you got my point when I mentioned the 92x112 positive image size. It's a famous size (lol).
Recently I wrote a program that reads from a live camera and saves a video file. From that video I extracted all the positive images and negative background images that I wanted to use for training the cascade. All the steps for training the cascade have been completed, and I am now running object detection on the live camera, but the detections are incorrect. This makes me doubt the way I created the positive and negative images for cascade training.
@Pedro Batista, please give me some advice.
I think 360 positives and 900 negatives are not enough; try something like 1000 positives and 2000 negatives. This should improve the accuracy of your cascade, but keep in mind that false positives and false negatives are normal and will always exist.
Your false alarm rate also seems pretty high. Setting it to 0.5 means that each stage is allowed to accept up to half of the negative windows it sees. Try setting it to the lowest value that still lets training progress, knowing that very low values are also unrealistic. The same can be said about minHitRate, which in your scenario is 0.999. That is too greedy; something like 0.9 makes more sense.
But the point is to keep changing these parameters until you find the optimal values that will also allow you to complete training.
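When trying different values, it can help to remember that minHitRate and maxFalseAlarmRate are per-stage targets. Under the idealized assumption that every stage exactly meets its target, the rates of the whole cascade are roughly those values raised to the number of stages. The sketch below just does that arithmetic for 20 stages; `overall_rates` is a hypothetical helper, and a real trained cascade will not match these figures exactly.

```python
def overall_rates(min_hit_rate, max_false_alarm_rate, num_stages):
    """Approximate whole-cascade hit rate and false alarm rate, assuming
    each stage exactly meets its per-stage target (an idealization)."""
    hit = min_hit_rate ** num_stages
    false_alarm = max_false_alarm_rate ** num_stages
    return hit, false_alarm


# Per-stage settings discussed in this thread, with 20 stages.
for mhr, mfa in [(0.999, 0.5), (0.9, 0.5)]:
    hit, fa = overall_rates(mhr, mfa, 20)
    print(f"minHitRate={mhr}, maxFalseAlarmRate={mfa}: "
          f"overall hit ~{hit:.3f}, overall false alarm ~{fa:.2e}")
```

Because the per-stage values compound, a lower per-stage hit rate is easier to train but reduces the overall hit rate quickly as stages are added, so it may be worth adjusting the number of stages together with these two parameters.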
Good luck!