0

Does opencv_traincascade give consistent results over time?

asked 2015-05-10 04:19:45 -0600

abhishek

Every time I run opencv_traincascade and then run detectMultiScale, I get a different number and quality of detected windows on the test samples. I have tried different machines, and the results change almost every time. Is opencv_traincascade not a deterministic process?


Comments

1

Strange. Could you specify what you mean by "every time"? If I take a fixed set of data samples and a fixed set of parameters, every round of training ends up with the exact same model. The process is sequential, even the pulling of the negative sample windows, so no part of it is random or non-deterministic. Are you sure you are not changing any parameters?

StevenPuttemans ( 2015-05-11 05:56:43 -0600 )

Yes, exactly the same parameters and samples every time. As far as I understood, the process should not be random, so I'm surprised myself.

abhishek ( 2015-05-11 13:36:13 -0600 )

Did you interrupt the training process and then resume it?

Gino Strato ( 2015-05-18 02:30:52 -0600 )

1 answer

1

answered 2015-05-11 16:18:09 -0600

Gino Strato

updated 2015-05-13 03:24:29 -0600

Have you ever heard about chaos theory and deterministic chaos?
Well, I think this is an interesting case study, though I challenge anyone to find equations for it.
The training algorithm is multithreaded when it comes to finding the best split of the decision tree.
Inside the function:

CvDTreeSplit CvDTree::find_best_split( CvDTreeNode* node )

there is a call to cv::parallel_reduce, which is based on TBB.
By contrast, as far as I know, the negative-collection phase is single-threaded.
My hypothesis is that the parallel mechanism was poorly designed, so that minor differences occur on every run of the training algorithm, and those initially negligible differences are then magnified stage after stage.
Detection is not fully predictable either, but there the differences stay minimal.
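
One generic way a parallel reduction can give run-to-run differences is floating-point rounding: addition is not associative, so the way partial results are grouped matters. The sketch below only illustrates that mechanism and is not code from OpenCV. The function `chunked_sum` and the sample values are invented for the example, it assumes IEEE doubles without fast-math optimizations, and whether find_best_split is actually affected in this way is not confirmed anywhere in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum `values` by splitting them into consecutive chunks of `chunk` elements,
// summing each chunk separately, then adding the partial sums together.
// The chunk size stands in for whatever partitioning a parallel scheduler
// might pick at run time (an assumption made for illustration only).
double chunked_sum(const std::vector<double>& values, std::size_t chunk) {
    assert(chunk > 0);  // a zero chunk size would never advance
    double total = 0.0;
    for (std::size_t start = 0; start < values.size(); start += chunk) {
        const std::size_t end = std::min(start + chunk, values.size());
        double partial = 0.0;
        for (std::size_t i = start; i < end; ++i) {
            partial += values[i];
        }
        total += partial;
    }
    return total;
}
```

With the values {1e30, 1.0, -1e30, 1.0}, a chunk size of 1 gives 1.0 and a chunk size of 2 gives 0.0, while the exact sum is 2. A scheduler that groups the work differently between runs could therefore produce slightly different numbers from identical input.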

EDIT

I tried to reproduce the behaviour I observed some time ago (non-deterministic results during training), but I could not recreate those conditions. Too much time has passed since then (I now use a truly random mechanism to extract negatives).
However, I did reproduce a non-deterministic outcome for detection that I observed more recently. I have a program that automatically tests a batch of about 500 images and records the results. Over many runs there is no difference in the hit rate or the false-positive rate, but the average width and position of the detected rectangles shift very slightly. That is enough to say that detection is not deterministic under those conditions (including the rectangle-gathering phase).
My hypothesis was that this was linked to non-determinism in the training phase, and that those slight differences were magnified over time, but since I can no longer reproduce it, I cannot add more or say so for sure.
Some time ago a user opened a thread saying that he observed random behaviour even though he used a precompiled .vec file of negatives.
Another possible source of randomness is interrupting and resuming the training process.
The algorithm doesn't record the last offset, so after resuming it starts collecting negative windows from a different position (I reproduced this yesterday, and it is indeed non-deterministic).
An alternative explanation is that a single missing or corrupted image on one of the machines is enough to change the results.
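
The interrupt-and-resume explanation can be pictured with a toy model. The sketch below is not OpenCV's reader: `ToyNegativeReader`, `stage2_uninterrupted` and `stage2_after_restart` are hypothetical names, and the assumption that a restarted reader begins again at offset 0 is made purely for illustration. It only shows why an unsaved offset changes which negative windows a later stage sees.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a negative-sample reader: it hands out window ids from a
// fixed pool in order and wraps around at the end. Hypothetical class.
class ToyNegativeReader {
public:
    explicit ToyNegativeReader(std::size_t pool_size, std::size_t start = 0)
        : pool_size_(pool_size), offset_(0) {
        assert(pool_size > 0);  // an empty pool cannot supply windows
        offset_ = start % pool_size;
    }

    // Return the next `count` window ids and advance the internal offset.
    std::vector<std::size_t> take(std::size_t count) {
        std::vector<std::size_t> ids;
        ids.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            ids.push_back(offset_);
            offset_ = (offset_ + 1) % pool_size_;
        }
        return ids;
    }

private:
    std::size_t pool_size_;
    std::size_t offset_;
};

// Windows seen by stage 2 when training runs straight through.
std::vector<std::size_t> stage2_uninterrupted(std::size_t pool, std::size_t per_stage) {
    ToyNegativeReader reader(pool);
    reader.take(per_stage);         // stage 1 consumes the first windows
    return reader.take(per_stage);  // stage 2 continues where stage 1 stopped
}

// Windows seen by stage 2 when training stops after stage 1 and resumes with
// a fresh reader whose offset was never saved (assumed to restart at 0).
std::vector<std::size_t> stage2_after_restart(std::size_t pool, std::size_t per_stage) {
    ToyNegativeReader reader(pool);
    return reader.take(per_stage);
}
```

With a pool of 10 windows and 3 per stage, the uninterrupted run feeds windows 3, 4, 5 to stage 2, while the restarted run feeds it windows 0, 1, 2. Different negatives in a stage mean a different stage classifier, so the final model can differ.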


Comments

1

But then again, I am wondering about the following. I have a system here built from the latest OpenCV 2.4 branch, with TBB enabled and 32 cores available. My object case uses about 1500 positives and 5000 negative windows at each stage. I have now trained three models with exactly the same parameters, and a file comparison shows that the final models are identical... @abhishek, can you elaborate on the OpenCV version you use, the system configuration, ...? I would love to dig deeper into this matter.

StevenPuttemans ( 2015-05-12 02:26:15 -0600 )
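
A file comparison like the one described in the comment above can be done with any diff tool. As a minimal sketch, assuming the trained models are ordinary files on disk, the hypothetical helper `files_identical` below checks two files byte by byte. It uses only the C++ standard library and makes no claim about how the comparison in the comment was actually performed.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Return true when both files can be opened and hold exactly the same bytes.
bool files_identical(const std::string& path_a, const std::string& path_b) {
    std::ifstream a(path_a, std::ios::binary);
    std::ifstream b(path_b, std::ios::binary);
    if (!a || !b) {
        return false;  // a file that cannot be opened counts as "not identical"
    }
    std::istreambuf_iterator<char> it_a(a), it_b(b), end;
    // Walk both streams in lockstep until one ends or a byte differs.
    while (it_a != end && it_b != end) {
        if (*it_a != *it_b) {
            return false;
        }
        ++it_a;
        ++it_b;
    }
    // Equal only if both streams ended at the same time.
    return it_a == end && it_b == end;
}
```

Calling it on the cascade files from two training runs with the same samples and parameters gives a yes/no answer on whether training was reproducible on that machine.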

Hey, guys. Thanks for your input. As for TBB, I have a vanilla Ubuntu 14.04 OpenCV installation (2.4.8), and hence TBB would not be used. I'll try to post an example somewhere soon. Please let me know if I should mention any other config details.

abhishek ( 2015-05-12 10:17:15 -0600 )
1

@abhishek I will be traveling for work for the next 12 days and thus will lack the time to thoroughly debug your input, but afterwards I will get back to you! As for your configuration and setup, try to provide as much info as possible; I will do the same myself with a test case.

StevenPuttemans ( 2015-05-13 01:59:11 -0600 )

Stats

Asked: 2015-05-10 04:19:45 -0600

Seen: 332 times

Last updated: May 13 '15