Opencv_traincascade training too fast?

asked 2018-04-18 13:39:16 -0600

uckizz

Hi!

I'm creating my own car-wheel detection cascade as a fun project. My attempt so far is based on various tutorials, and I've described the whole process below:

Positive data: 40 images of a car wheel, cropped from photos of cars and downsized to 50x50 PNG (approx. 7 KB each). Negative data: 600 random outdoor photos not containing cars or wheels, resized to 500x500 JPG (approx. 100 KB each).

Used Naotoshi Seo's Perl script to generate 1500 positive samples (same settings, except -w and -h both set to 50).

Used his mergevector.py script to merge all the generated .vec files.

Used the same training parameters with opencv_traincascade, except with LBP features and with -w and -h both set to 50.

Well, training is super fast (a couple of minutes for 20 stages), and when I tested the result, it detected a lot of false positives. I suspect something's wrong with the data, or that I can tweak some parameters/settings.

Does anyone have any ideas or tips on what parameters, settings, or data tweaks I can use for better performance?

Thanks!

//Nick


Comments

"Positive data: 40 images of a carwheel," -- come back with 10x or even 100x that

berak ( 2018-04-18 13:46:07 -0600 )

I thought the Perl script generated more samples? I followed this tutorial: http://coding-robin.de/2013/07/22/tra..., where he uses 40 images and then uses the script to generate 1500 samples.

uckizz ( 2018-04-18 13:48:12 -0600 )

forget the Perl script (or any silly attempt at generating synthetic data from single images)

berak ( 2018-04-18 13:50:33 -0600 )


answered 2018-04-19 03:37:10 -0600

StevenPuttemans


updated 2018-04-19 03:39:32 -0600

Ah, I was wondering when this would come back. If you want more detailed background, see chapter 5 of OpenCV 3 Blueprints, but here are some pointers.

As @berak stated, forget generating artificial samples with the Perl script. It simply does not hold up and creates bad classifiers. Go for purely real samples. Better 50 real samples than 1000 artificial ones.
Then you don't need mergevec either, which tends to cause issues for a lot of people.
Fast training means that the separation between your positive and negative samples is easy. It probably only needs a couple of weak classifiers to achieve a successful separation. You can increase complexity, and thus training time, by adding more training data or making your settings stricter; even increasing the resolution can help.
False positives mean that your detector still does not know exactly what a negative sample is, hence it needs more negative data. Try negative bootstrapping: run your initial detector, collect the false positives, and feed those back in as hard negatives.
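
The bootstrapping step above could be sketched roughly as follows. This is only an illustration in plain Python, not code from OpenCV: the helper names `iou` and `select_hard_negatives` and the 0.3 overlap threshold are assumptions made for the example. Boxes are assumed to be (x, y, width, height) tuples, which is the layout `cv2.CascadeClassifier.detectMultiScale` returns.

```python
from typing import List, Tuple

# (x, y, width, height); assumed to match detectMultiScale output.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (0 if the boxes are disjoint).
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def select_hard_negatives(
    detections: List[Box],
    ground_truth: List[Box],
    max_iou: float = 0.3,  # hypothetical threshold, tune for your data
) -> List[Box]:
    """Keep detections that do not sufficiently overlap any annotated object.

    For an image known to contain no object, pass an empty ground_truth
    list: every detection is then a false positive and is kept.
    """
    return [
        d for d in detections
        if all(iou(d, g) < max_iou for g in ground_truth)
    ]
```

The kept boxes would then be cropped from the source images, saved, and added to the negative set (the background list passed with -bg) before retraining. For images from the negative set itself, every detection counts as a hard negative, so the overlap check only matters for images that do contain wheels.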

The Q&A literally has 1000 questions on this; you would be amazed how much detail you can find here.
