Let us start by formulating some answers.

opencv_createsamples will render them grayscale and apply histogram equalization itself, so you can provide annotations on coloured images.

Your negative images need to be at least -w x -h in size to avoid negatives being ignored.

-w and -h can be as large as you want, BUT know that the larger they are, the more features they will contain AND thus the more memory you will need to store them during training! On a limited-memory system, try to get the sizes down as much as possible without losing valuable edge info.

Again, go read the book! It is a collection of 2 years of my PhD experience on this interface. You might be surprised how much info it contains!
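To get a feel for how fast the feature pool grows with -w and -h, here is a small sketch of my own (it counts the classic upright Viola-Jones Haar-like features, not OpenCV's exact internal enumeration, but the growth behaviour is comparable):

```python
# Rough illustration (my own sketch, not OpenCV code): count the classic
# upright Viola-Jones Haar-like features for a w x h detection window.

def haar_feature_count(w, h):
    # Base sizes of the five classic shapes: two-rectangle (2x1, 1x2),
    # three-rectangle (3x1, 1x3) and four-rectangle (2x2) features.
    shapes = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]
    total = 0
    for dw, dh in shapes:
        x = w // dw  # maximum horizontal scale of this shape
        y = h // dh  # maximum vertical scale of this shape
        # closed-form count of all positions and scales of this shape
        total += x * y * (2 * (w + 1) - dw * (x + 1)) * (2 * (h + 1) - dh * (y + 1)) // 4
    return total

for size in (24, 32, 48):
    print(size, "x", size, "->", haar_feature_count(size, size), "features")
# A 24x24 window already yields roughly 160k features.
```

Doubling the window side multiplies the feature pool by roughly 16, and every feature value has to be precalculated and buffered during training, which is exactly the memory pressure described above.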
UPDATED ANSWER
Therefore, I can take 50 images of my object from different angles and use them as positives. Would this be better?
I am of that opinion, yes. However, do accept that a cascade classifier is not completely viewpoint invariant, so if the viewpoint changes too drastically you might need 2 models. This can for example be seen with the face models, where we have a separate model for frontal faces and one for profile faces. However, a face that is rotated 15 degrees away from a profile view will still be detected by the profile face model, and vice versa for the frontal face model.
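If two models do turn out to be necessary, a common pattern is simply to run both detectors on the same frame and merge their outputs. A minimal sketch using the stock frontal and profile face models shipped with the opencv-python package (the image path is a placeholder):

```python
import cv2

# Run two viewpoint-specific cascades side by side and merge the detections.
# Swap in your own two trained models where the stock face models are loaded.
frontal = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")

img = cv2.imread("test.jpg")  # hypothetical test image
gray = cv2.equalizeHist(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

detections = []
for model in (frontal, profile):
    found = model.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    detections.extend(list(found))

print("merged detections:", detections)
```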
For the negatives, I can do a 'random' walk gathering images without the object, right?
YES, exactly! However, do take into account that the detector will then only work decently in that environment, because it is impossible for the learning to cope with the variance of unknown backgrounds.
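Those random-walk images then simply go into the background description file that opencv_traincascade reads through its -bg parameter: a plain text file with one negative image path per line. A tiny sketch, assuming a hypothetical negatives/ folder:

```python
import os

# Build the background description file (one negative image path per line)
# that opencv_traincascade reads via -bg. "negatives/" is a hypothetical
# folder of random-walk photos that do not contain the object.
with open("bg.txt", "w") as bg:
    for name in sorted(os.listdir("negatives")):
        if name.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            bg.write(os.path.join("negatives", name) + "\n")
```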
I also read that I should aim for a NEG count : acceptanceRatio around 0.0004 to consider it a good cascade, and that if it is ~5.3557e-05 the cascade is overtrained?
I guess it depends. I have read about this topic for ages now, and people tend to believe that 10^-5 is about the tipping point between a genuine efficiency increase and overfitting. However, I will never stick out my hand and say that it is a golden rule. It works for my applications, but basically you should calculate the precision and recall values after each stage and see whether they increase or decrease on your test set.
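A minimal sketch of such a check, running a trained stage over a small annotated test set and matching detections to ground truth by overlap; the cascade path, the test annotations, and the 0.5 IoU threshold are all assumptions of mine:

```python
import cv2

def iou(a, b):
    # Intersection-over-union of two (x, y, w, h) boxes.
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

# Hypothetical test set: image path -> list of ground-truth boxes (x, y, w, h).
ground_truth = {
    "test/img_001.jpg": [(34, 50, 80, 80)],
    "test/img_002.jpg": [(120, 60, 64, 64), (10, 15, 70, 70)],
}

cascade = cv2.CascadeClassifier("cascade.xml")  # the stage you want to evaluate
tp = fp = fn = 0
for path, gt_boxes in ground_truth.items():
    gray = cv2.equalizeHist(cv2.imread(path, cv2.IMREAD_GRAYSCALE))
    detections = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
    matched = set()
    for det in detections:
        hits = [i for i, gt in enumerate(gt_boxes) if i not in matched and iou(det, gt) >= 0.5]
        if hits:
            matched.add(hits[0])
            tp += 1
        else:
            fp += 1
    fn += len(gt_boxes) - len(matched)

precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
print("precision:", precision, "recall:", recall)
```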
The computer I'm using has 32 GB of RAM plus another 32 GB of swap, so should I go with RAB?
Actually that is the one parameter I have never changed before, but again, go ahead and see if it improves the accuracy of your detector!
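For completeness, RAB (Real AdaBoost) is selected through the -bt parameter of opencv_traincascade (the default is GAB), and with that much RAM you can also enlarge the precalculation buffers. A hedged sketch of such a call, where all paths, sizes, and sample counts are placeholders:

```python
import subprocess

# Sketch of a training call that switches the boosting type to Real AdaBoost
# (-bt RAB) and gives the feature precalculation buffers more RAM (values in MB).
# Every path and count below is a placeholder for your own setup.
subprocess.run([
    "opencv_traincascade",
    "-data", "cascade_dir",
    "-vec", "positives.vec",
    "-bg", "bg.txt",
    "-numPos", "45", "-numNeg", "900",
    "-numStages", "20",
    "-w", "24", "-h", "24",
    "-bt", "RAB",                   # DAB, RAB, LB or GAB (default is GAB)
    "-precalcValBufSize", "8192",   # MB for precalculated feature values
    "-precalcIdxBufSize", "8192",   # MB for precalculated feature indices
], check=True)
```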