Have you ever heard about chaos theory and deterministic chaos?
Well, I think this is an interesting case study, even though I dare anyone to find equations for it.
The training algorithm is multithreaded when it searches for the best split of the decision tree.
Inside the function:
CvDTreeSplit CvDTree::find_best_split( CvDTreeNode* node )
there is a call to cv::parallel_reduce, based on TBB.
As far as I know, the phase that collects negatives is single-threaded instead.
My hypothesis is that the parallel mechanism was poorly designed, so minor differences occur at every run of the training algorithm, and those negligible differences are then magnified, stage after stage.
Detection is not fully predictable either, but there the differences stay minimal.
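One well-known way a parallel reduction can give slightly different results from run to run is that floating-point addition is not associative, so the way the work is partitioned changes the rounding. The sketch below only illustrates that effect. It is not OpenCV code, `sum_sequential` and `sum_chunked` are hypothetical helpers, and whether this is what actually happens inside `find_best_split` is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Adds the values left to right, as a single thread would.
double sum_sequential(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s;
}

// Adds the values in `chunks` contiguous partial sums and then combines them,
// mimicking how a parallel reduction groups the work (hypothetical helper).
double sum_chunked(const std::vector<double>& v, std::size_t chunks) {
    assert(chunks > 0);
    const std::size_t n = v.size();
    const std::size_t step = (n + chunks - 1) / chunks;  // ceiling division
    double total = 0.0;
    for (std::size_t begin = 0; begin < n; begin += step) {
        const std::size_t end = std::min(begin + step, n);
        double partial = 0.0;
        for (std::size_t i = begin; i < end; ++i) partial += v[i];
        total += partial;
    }
    return total;
}
```

With the input `{1e17, 1.0, -1e17, 1.0}`, the sequential sum and the two-chunk sum disagree, because each `1.0` is rounded away whenever it is added to a value of magnitude `1e17`. If the partitioning depends on thread scheduling, the result can change between runs.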
EDIT
I tried to reproduce the behaviour I observed some time ago (non-deterministic results during training), but I was not able to recreate those conditions. Too much time has passed since then (I now use a truly random mechanism to extract negatives).
Anyway, I did reproduce a non-deterministic outcome for detection that I observed more recently.
I have a program that automatically tests a batch of about 500 images and records the results.
There is no difference in the hit rate or the false positive rate over many runs, but the average width and position of the detected rectangles are very slightly shifted. This is enough to say that detection is not deterministic under those conditions (including the rectangle-gathering phase).
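A comparison of this kind can be sketched as follows. This is not my actual test program. `Box`, `BoxStats`, `summarize` and `same_within` are hypothetical names, and the rectangles are assumed to have already been collected from the detector for each run.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical plain rectangle: top-left corner plus size, in pixels.
struct Box { double x, y, width, height; };

// Averages recorded for one run over the whole batch.
struct BoxStats { double mean_x, mean_y, mean_width; };

// Computes the mean position and mean width of all detections of a run.
BoxStats summarize(const std::vector<Box>& boxes) {
    assert(!boxes.empty());
    BoxStats s{0.0, 0.0, 0.0};
    for (const Box& b : boxes) {
        s.mean_x += b.x;
        s.mean_y += b.y;
        s.mean_width += b.width;
    }
    const double n = static_cast<double>(boxes.size());
    s.mean_x /= n;
    s.mean_y /= n;
    s.mean_width /= n;
    return s;
}

// True when two runs agree on every average within `tol` pixels.
bool same_within(const BoxStats& a, const BoxStats& b, double tol) {
    return std::fabs(a.mean_x - b.mean_x) <= tol &&
           std::fabs(a.mean_y - b.mean_y) <= tol &&
           std::fabs(a.mean_width - b.mean_width) <= tol;
}
```

Calling `same_within` with `tol` set to `0.0` checks for exact agreement between two runs, while a small positive tolerance separates slight shifts from real changes in behaviour.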
My hypothesis was that this was linked to non-determinism in the training phase, and that those slight differences were magnified over time, but as I cannot reproduce it anymore, I cannot add more or say so for sure.
Some time ago a user opened a thread saying that he observed random behaviour even though he used a precompiled .vec file of negatives.
Anyway, a different source of randomness could be interrupting and resuming the training process.
The algorithm doesn’t record the last offset, so after a resume it starts collecting negative windows from a different position (I reproduced this yesterday, and it is indeed non-deterministic).
An alternative explanation is that a single missing or defective image on one of the machines is enough to change the results.
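The effect of a lost offset can be shown with a toy model. This is only a sketch and does not model the real image reader: `NegativeSource` and the two helper functions are hypothetical, and the negative windows are reduced to plain indices handed out cyclically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the negative-window source: hands out
// window indices cyclically, starting at `offset`.
struct NegativeSource {
    std::size_t total;   // number of candidate windows
    std::size_t offset;  // index of the next window to hand out

    std::vector<std::size_t> take(std::size_t count) {
        assert(total > 0);
        std::vector<std::size_t> out;
        out.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            out.push_back(offset);
            offset = (offset + 1) % total;
        }
        return out;
    }
};

// Windows seen by the second stage when training runs without interruption.
std::vector<std::size_t> second_stage_uninterrupted(std::size_t total,
                                                    std::size_t per_stage) {
    NegativeSource src{total, 0};
    src.take(per_stage);         // first stage consumes its windows
    return src.take(per_stage);  // second stage continues from there
}

// Windows seen by the second stage when training stops after the first stage
// and is resumed with a fresh source, with or without the saved offset.
std::vector<std::size_t> second_stage_after_resume(std::size_t total,
                                                   std::size_t per_stage,
                                                   bool restore_offset) {
    NegativeSource first{total, 0};
    first.take(per_stage);
    const std::size_t saved = first.offset;
    NegativeSource resumed{total, restore_offset ? saved : 0};
    return resumed.take(per_stage);
}
```

In this model, the resumed run sees the same windows as the uninterrupted one only when `restore_offset` is true. Otherwise the second stage is trained on different negatives, and every later stage inherits the difference.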
Strange, could you specify what you mean by "every time"? If I take a fixed set of data samples and a fixed set of parameters, then every round of training ends up with the exact same model. The process is sequential, even the pulling of the negative sample windows, so there is no part that makes it random or non-deterministic. Are you sure you are not changing any parameters?
Yes, exactly the same parameters and samples every time. As far as I understood, the process should not have been random, so I'm surprised myself.
Did you interrupt the training process and then resume it?