Let us start by formulating some answers

  • I am currently trying to train my own cascade based on Naotoshi Seo's tutorial and Coding Robin's tutorial. Though many online tutorials reference them, in my honest opinion they are seriously lacking in information, use old interfaces, use plain wrong settings and are far from up to date. Avoid them and use more recent material instead, like the complete chapter on cascade classifier training in OpenCV 3 Blueprints, or the content of this forum, which is far better than those tutorials.
  • Yes, grayscale is the way to go, though if you supply colour images, opencv_createsamples will convert them to grayscale and apply histogram equalization itself. So you can provide annotations on colour images.
  • Negative images can be as large as you prefer. They basically get sampled at the positive sample size until the whole image is covered. They should, however, be equal to or larger than -w x -h, to avoid negatives being ignored.
  • You should save the entire positive image and create a positives.txt file containing annotations, which are basically the bounding boxes of the objects (see the sketch after this list).
  • removed the background: please don't! In your application the background will also be present, so you need to make your model robust to background noise.
  • You used the tool to warp images: read my chapter in OpenCV 3 Blueprints or search this forum to find out why you should absolutely NOT do that, but should rather collect meaningful positive samples!
  • I got a LOT of false positives: this means that your model still needs more training OR more training data, AND certainly more negatives to discriminate the background. Use bootstrapping, also described in the chapter, to let your first model improve a second one!
  • Negatives can be of any size, positives too
  • -w and -h can be as large as you want, BUT know that the larger they are, the more features they will contain AND thus the more memory you will need to store them during training! On limited-memory systems, try to get the sizes down as much as possible without losing valuable edge info (see the command sketch below for where these sizes are passed).
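
As a concrete illustration of the annotation file mentioned above, here is a minimal Python sketch that writes a positives.txt in the format opencv_createsamples expects (one line per image: path, number of objects, then x y w h per object). The image paths and boxes are made up; OpenCV 3 also ships an opencv_annotation tool that can produce such a file interactively.

```python
# Minimal sketch (hypothetical paths and boxes): build the annotation "info"
# file that opencv_createsamples expects. Each line lists the full, un-cropped
# image, the number of objects in it, and one x y w h box per object.
annotations = {
    "pos/img_001.jpg": [(120, 45, 60, 60)],                    # (x, y, w, h)
    "pos/img_002.jpg": [(30, 10, 55, 55), (200, 80, 58, 57)],  # two objects
}

with open("positives.txt", "w") as f:
    for path, boxes in annotations.items():
        coords = " ".join(f"{x} {y} {w} {h}" for (x, y, w, h) in boxes)
        f.write(f"{path} {len(boxes)} {coords}\n")
```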

Again, go read the book! It is a collection of 2 years of my PhD experience with the interface. You might be surprised how much info it contains!
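
For reference, a rough sketch of how the two tools are then invoked, assuming the binaries are on the PATH and that positives.txt and bg.txt were prepared as above; all counts and the 24x24 window size are purely illustrative, not recommendations.

```python
# Sketch of the standard tool invocations for training from annotated images.
import subprocess

subprocess.run([
    "opencv_createsamples",
    "-info", "positives.txt",   # annotation file with full images + boxes
    "-vec", "pos.vec",          # packed, resized positive samples
    "-num", "500",
    "-w", "24", "-h", "24",
], check=True)

subprocess.run([
    "opencv_traincascade",
    "-data", "cascade/",        # output folder, must exist beforehand
    "-vec", "pos.vec",
    "-bg", "bg.txt",            # text file listing negative image paths
    "-numPos", "450",           # keep below the number of samples in the vec
    "-numNeg", "900",
    "-numStages", "20",
    "-w", "24", "-h", "24",     # must match the createsamples window
], check=True)
```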

UPDATED ANSWER

So, I can take 50 images of my object from different angles and use them as positives. Would this be better?

I am of that opinion, yes. However, do keep in mind that a cascade classifier is not completely viewpoint invariant, so if the viewpoint changes too drastically you might need 2 models. This can for example be seen with the face models, where we have a separate model for frontal faces and one for profile faces. However, a face that is rotated 15 degrees away from a profile face will still be detected by the profile face model, and vice versa for the frontal face.
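
As an illustration of the two-model idea, a small sketch that runs both stock face cascades on one image and merges the detections; it assumes the opencv-python package, which exposes the bundled model files through cv2.data.haarcascades, and a hypothetical test.jpg.

```python
# Combine two viewpoint-specific cascades (frontal + profile faces).
import cv2

frontal = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")

img = cv2.imread("test.jpg")
gray = cv2.equalizeHist(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# Each model copes with a limited rotation around its own viewpoint,
# so running both and merging the boxes covers a wider range of poses.
detections = list(frontal.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5))
detections += list(profile.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5))

for (x, y, w, h) in detections:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("out.jpg", img)
```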

For the negatives, I can do a 'random' walk gathering images without the object, right?

YES, exactly! However, do take into account that the detector will then only work decently in that environment, because it is impossible for the training to cope with variance in unknown backgrounds.
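
A quick sketch of one way to do that walk in practice: grab every Nth frame from a video of the environment (the walk.mp4 name is hypothetical) and write the bg.txt list that opencv_traincascade expects.

```python
# Turn a walkthrough video of the target environment into negative samples.
import cv2, os

os.makedirs("neg", exist_ok=True)
cap = cv2.VideoCapture("walk.mp4")
kept, idx = [], 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if idx % 15 == 0:                      # keep one frame out of every 15 to avoid near-duplicates
        path = f"neg/frame_{idx:05d}.jpg"
        cv2.imwrite(path, frame)
        kept.append(path)
    idx += 1
cap.release()

with open("bg.txt", "w") as f:             # one negative image path per line
    f.write("\n".join(kept) + "\n")
```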

I also read that I should aim for a NEG count : acceptanceRatio around 0.0004 to consider a cascade good, and that if it is ~5.3557e-05 the cascade is overtrained?

I guess it depends. I have read about this topic for ages now, and people seem to believe that 10^-5 is about the tipping point between a performance increase and overfitting. However, I will never stick out my hand and say that this is a golden rule. It works for my applications, but basically you should calculate the precision and recall values after each stage and see whether performance on your test set increases or decreases.
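
A minimal sketch of that check, assuming a trained cascade/cascade.xml, a small hand-annotated test set and an IoU threshold of 0.5 for counting a detection as correct; you would repeat this on the intermediate models as stages get added and watch how precision and recall evolve.

```python
# Evaluate a trained cascade on an annotated test set with a simple IoU match.
import cv2

def iou(a, b):
    ax, ay, aw, ah = a; bx, by, bw, bh = b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / float(aw * ah + bw * bh - inter)

model = cv2.CascadeClassifier("cascade/cascade.xml")
test_set = {"test/img_01.jpg": [(120, 45, 60, 60)]}   # path -> ground-truth boxes

tp = fp = fn = 0
for path, truths in test_set.items():
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    matched = set()
    for d in model.detectMultiScale(gray, 1.1, 5):
        hits = [i for i, t in enumerate(truths) if i not in matched and iou(d, t) >= 0.5]
        if hits:
            matched.add(hits[0]); tp += 1
        else:
            fp += 1
    fn += len(truths) - len(matched)

precision = tp / float(tp + fp) if tp + fp else 0.0
recall = tp / float(tp + fn) if tp + fn else 0.0
print(f"precision={precision:.3f} recall={recall:.3f}")
```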

The computer I'm using has 32 GB of RAM plus another 32 GB of swap, so should I go with RAB?

Actually, that is the one parameter I have never changed before, but again, go ahead and see if it improves the accuracy of your detector!
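
Assuming RAB here refers to the Real AdaBoost boosting type selected through the -bt flag of opencv_traincascade, and that the available RAM would mainly go into the precalculation buffers, a hypothetical invocation could look like this (buffer sizes are in MB and purely illustrative):

```python
# Hypothetical training run selecting Real AdaBoost and enlarged feature buffers.
import subprocess

subprocess.run([
    "opencv_traincascade",
    "-data", "cascade/",
    "-vec", "pos.vec", "-bg", "bg.txt",
    "-numPos", "450", "-numNeg", "900", "-numStages", "20",
    "-w", "24", "-h", "24",
    "-bt", "RAB",                     # boosting type: DAB, RAB, LB or GAB (default)
    "-precalcValBufSize", "4096",     # precalculated feature value buffer, in MB
    "-precalcIdxBufSize", "4096",     # precalculated feature index buffer, in MB
], check=True)
```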