cascade training best practices for lit sign

asked 2016-10-04 13:19:51 -0600

albesure79
updated 2016-10-05 05:52:31 -0600

StevenPuttemans

Was hoping to get some guidance on a few issues...

Here is the sign I would like to be able to recognize at night: http://67.media.tumblr.com/035fa2a4d9...

Here are my questions/issues:

To generate the images used for training, I used ffmpeg to extract frames from a video that I recorded. This produced roughly 500 images, all from the left-hand side of the street. I painstakingly annotated all 500 images, only to have the training cease at stage 3. Should I not be using ffmpeg? As an alternative I could use the burst capability on the iPhone, which takes a rapid series of pictures. Should I be getting images from all angles? Should blurry images be omitted?

The negative images I used were also taken from a video using ffmpeg. The video is of the surrounding area, minus the sign of course.

I have been able to train a model successfully on a soda can (La Croix), but for whatever reason I cannot get through the training for this type of object. Any help would be greatly appreciated.

Here are my commands and their corresponding output:

opencv_createsamples -info annotations.txt -bg negatives.txt  -vec VeniceLeft.vec -w 73  -h 10
Info file name: annotations.txt
Img file name: (NULL)
Vec file name: VeniceLeft.vec
BG  file name: negatives.txt
Num: 1000
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Original image will be scaled to:
    Width: $backgroundWidth / 73
    Height: $backgroundHeight / 10
Create training samples from images collection...
annotations.txt(553) : parse errorDone. Created 552 samples



opencv_traincascade -data cascade/ -vec VeniceLeft.vec -bg negatives.txt -numNeg 1000 -numPos 500 -minHitrate 0.995 -maxFalseAlarmRate 0.5  -mode ALL -precalcValBufSize 1024 -precalcIdxBufSize 1024  -w 73  -h 10
PARAMETERS:
cascadeDirName: cascade/
vecFileName: VeniceLeft.vec
bgFileName: negatives.txt
numPos: 500
numNeg: 1000
numStages: 20
precalcValBufSize[Mb] : 1024
precalcIdxBufSize[Mb] : 1024
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: HAAR
sampleWidth: 73
sampleHeight: 10
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: ALL
Number of unique features given windowSize [73,10] : 356522

===== TRAINING 0-stage =====
<BEGIN
POS count : consumed   500 : 500
NEG count : acceptanceRatio    1000 : 1
Precalculation time: 25
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1|    0.998|    0.001|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 1 minutes 38 seconds.

===== TRAINING 1-stage =====
<BEGIN
POS count : consumed   500 : 501
NEG count : acceptanceRatio    1000 : 0.00319338
Precalculation time: 22
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1|    0.998|    0.007|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 3 minutes 12 seconds.

===== TRAINING 2-stage =====
<BEGIN
POS count : consumed   500 : 502
NEG count : acceptanceRatio    1000 : 2.51281e-05
Precalculation time: 23
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1|        1|        1|
+----+---------+---------+
|   2|        1|    0.019|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 10 minutes 3 seconds.

===== TRAINING 3-stage =====
<BEGIN
POS count : consumed   500 : 502
Train dataset for temp ...

answered 2016-10-05 05:59:56 -0600

StevenPuttemans

Let's get you some answers on your first attempts.

Should I be getting images from all angles? Should blurry images be omitted?

Yes to the first and no to the second: take images from different angles, and keep the blurry ones. For a cascade object detector you want as much variance in your training data as possible. Different angles help, and so does motion blur. Both push the algorithm toward features that are invariant to the position from which the image is taken.

The negative images I used were also taken from a video using ffmpeg. The video is of the surrounding area, minus the sign of course.

Great! Keep in mind, however, that this will create a detector that will only work in that neighborhood.

... but for whatever reason I cannot get through the training for this type of object.

Can you explain this? As far as I can see, the algorithm is simply not able to grab a new negative sample. Can you confirm that each path in your negatives.txt is actually correct? The problem seems to be there. You can also check how your model is doing so far: retrain using -numStages 3, which will merge the previously trained stages into a model that you can use to see how it performs. Looking at your acceptance ratio of 2.51281e-05, it might already be more than robust enough!
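A minimal Python sketch of both checks, in case it helps. The function names are hypothetical, and two things are assumptions on my side rather than facts from this thread: that relative entries in negatives.txt are resolved against the directory you run opencv_traincascade from, and that the acceptanceRatio in the log is the number of negatives collected divided by the number of candidate windows examined. Under that second assumption, the stage 2 value means roughly 40 million windows were scanned to find 1000 negatives, and multiplying it by the stage 2 false alarm rate (0.019) suggests stage 3 may need on the order of two billion, which would look like a hang.

```python
import os


def missing_negative_paths(bg_file, base_dir="."):
    """List (line_number, entry) pairs in a background file that are not existing files.

    Assumption: relative entries are resolved against base_dir, which defaults to
    the current working directory. Adjust if your OpenCV build resolves them differently.
    """
    missing = []
    with open(bg_file, "r", encoding="utf-8") as handle:
        for line_number, raw_line in enumerate(handle, start=1):
            entry = raw_line.strip()
            if not entry:
                continue  # blank lines carry no path to check
            if os.path.isabs(entry):
                candidate = entry
            else:
                candidate = os.path.join(base_dir, entry)
            if not os.path.isfile(candidate):
                missing.append((line_number, entry))
    return missing


def windows_to_scan(num_neg, acceptance_ratio):
    """Estimate how many candidate windows are examined to collect num_neg negatives.

    Assumption: acceptance_ratio = negatives collected / windows examined.
    """
    return num_neg / acceptance_ratio
```

Calling missing_negative_paths("negatives.txt") from the training directory should return an empty list if every path is fine. windows_to_scan(1000, 2.51281e-05 * 0.019) gives the rough stage 3 projection mentioned above.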

Comments

Really appreciate the detailed answers Steven! The opencv community would not be the same without you.

I have yet to perform the additional annotations. However, my good friend was able to create an excellent classifier that I tested last night. He was able to do so with just one image using the following commands:

opencv_createsamples -img sign.png -bg negatives.txt -info pos/info.lst -pngoutput pos -maxxangle 0.5 -maxyangle 0.5 -maxzangle 0.5 -num 500 -w 48 -h 24
opencv_createsamples -info pos/info.lst -w 48 -h 24 -vec VeniceLeft.vec -bg negatives.txt -num 500
opencv_traincascade -data cascade/ -vec VeniceLeft.vec -bg negatives.txt -numNeg 1000 -numPos 450 -w 48 -h 24

Do you happen to know what sort of objects this sort of training methodology might be suited for?
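A side note on why -numPos 450 against 500 generated samples is a sensible choice. This is a sketch of a rule of thumb that circulates in the OpenCV community, not something stated in this thread: each stage may discard a small fraction of positives, so the .vec file should hold somewhat more samples than numPos. The function name and the skipped parameter are hypothetical, and the defaults assume numStages 20 and minHitRate 0.995, as shown in the training log above.

```python
import math


def min_vec_samples(num_pos, num_stages=20, min_hit_rate=0.995, skipped=0):
    """Rule-of-thumb lower bound on the number of samples needed in the .vec file.

    Assumed formula: num_pos + (num_stages - 1) * (1 - min_hit_rate) * num_pos + skipped,
    where skipped is a guess at positives rejected by earlier stages as background.
    """
    extra_per_run = (num_stages - 1) * (1.0 - min_hit_rate) * num_pos
    return math.ceil(num_pos + extra_per_run) + skipped
```

With these assumptions, min_vec_samples(450) stays below the 500 samples generated here, and min_vec_samples(500) stays below the 552 samples created in the original question.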

albesure79 (2016-10-06 11:16:48 -0600)

The problem with using a single image is that you depend on artificially deforming the 2D planar projection of an object. In my experience with real-life industrial cases, this generates object appearances that do not actually exist in reality. To generate good samples, this utility also requires you to capture the object against a clean background, so that you can use the transparency property to merge it into backgrounds. This technique is useful for any kind of object detection, as long as the object has local shape features that are unique enough.

StevenPuttemans (2016-10-07 02:21:49 -0600)
