Random forest exact split function

asked 2013-09-24 10:25:31 -0500

-coffee-

Hi, all!

I recently started working with the OpenCV RF implementation. Is there any documentation describing exactly what happens inside each node during training? I could not find it in the documentation, so this is what I figured out from the code:

  1. a subset of data samples is selected randomly for each tree

  2. training is done in exactly the same way as for a decision tree (I might have missed something here)
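
To make step 1 concrete, here is a minimal Python sketch of how I understand the per-tree sampling. It assumes the subset is drawn with replacement and has the same size as the training set; I have not confirmed that from the OpenCV code. `bootstrap_indices` is a name I made up for illustration, not an OpenCV function.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (assumed sampling scheme)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

rng = random.Random(0)             # fixed seed so the run is repeatable
idx = bootstrap_indices(10, rng)   # sample indices used to train one tree
```

Each tree would then be trained only on the rows listed in `idx`, so some samples appear several times and others not at all.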

Training for a decision tree (for continuous variables):

  1. for each node, iterate through all variables (?) and compute a split threshold for each one by sorting the data on that variable and splitting it into two parts (the threshold determines the parts)

  2. take a split with the best quality
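
The two steps above can be sketched in Python as follows. This is only my reading of the procedure, not the actual OpenCV code. Using Gini impurity as the quality measure and placing the threshold midway between neighbouring values are my assumptions, and `gini` and `best_split` are hypothetical helper names.

```python
def gini(labels):
    """Gini impurity of a list of class labels (assumed quality measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(X, y):
    """Return (variable index, threshold, weighted impurity) of the best split."""
    n = len(y)
    best = (None, None, float("inf"))
    for var in range(len(X[0])):                      # iterate through all variables
        order = sorted(range(n), key=lambda i: X[i][var])
        for k in range(1, n):                         # every cut of the sorted data
            lo, hi = X[order[k - 1]][var], X[order[k]][var]
            if lo == hi:
                continue                              # no threshold separates equal values
            left = [y[i] for i in order[:k]]
            right = [y[i] for i in order[k:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:                       # lower impurity means better quality
                best = (var, (lo + hi) / 2.0, score)
    return best

X = [[1.0, 5.0], [2.0, 4.0], [3.0, 7.0], [4.0, 6.0]]
y = [0, 0, 1, 1]
var, thr, score = best_split(X, y)
```

Note that nothing in this search is random: every variable and every candidate threshold is tried, which is why the only randomness I see is in the sample selection.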

So it is quite far from the scheme proposed by Microsoft and, in a way, much less randomized. Has anybody else tried to find out how the forest actually works?
