Random forest - how to force parallel training?

asked 2017-08-22 10:27:31 -0500

Hi,

In the version 2.4 documentation (http://docs.opencv.org/2.4/modules/ml... ) for the CvRTrees::train() method, it is stated that TBB is used for (multicore) acceleration. Has this been dropped? I am using v3.1.0, and although I built OpenCV with WITH_TBB=ON (and OpenMP support, just in case) and I link against libtbb, the train method still runs on a single core. Note that I am not using CMake for my own build: I explicitly link against opencv_core and opencv_ml, and the code does what it should with these. I also added libtbb, as noted above, but to no avail.

I have checked libopencv_core.so and libopencv_ml.so: libtbb is referenced in both .so files. I have the current up-to-date version of TBB (installed via apt).

I had a look at the random forest train method and I see no reference to any kind of parallelisation construct (e.g. parallel_for). But why would this be removed? RF is embarrassingly parallel, so it seems a perfect candidate.
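To make "embarrassingly parallel" concrete, here is a minimal sketch in plain C++ with std::thread. It does not use any OpenCV API: the Sample and Stump types and the trainStump, trainForest and predictForest functions are hypothetical names made up for illustration, and the one-feature threshold stump is only a stand-in for a real decision tree (it assumes class 1 has the larger feature values). The point is just that each tree depends only on the shared read-only data and its own random seed, so trees can be built on separate threads without locking.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Hypothetical toy types, not part of OpenCV.
struct Sample { double x; int label; };  // one feature, label 0 or 1

struct Stump {                           // stand-in for a full decision tree
    double threshold = 0.0;
    int predict(double x) const { return x > threshold ? 1 : 0; }
};

// Train one stump on a bootstrap resample of the data.
// Each call owns its RNG, so calls never share mutable state.
Stump trainStump(const std::vector<Sample>& data, unsigned seed) {
    assert(!data.empty());
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    double sum0 = 0.0, sum1 = 0.0;
    std::size_t n0 = 0, n1 = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const Sample& s = data[pick(rng)];  // sample with replacement
        if (s.label == 1) { sum1 += s.x; ++n1; } else { sum0 += s.x; ++n0; }
    }
    Stump st;
    if (n0 > 0 && n1 > 0)
        st.threshold = 0.5 * (sum0 / n0 + sum1 / n1);  // midpoint of class means
    else
        st.threshold = (sum0 + sum1) / data.size();    // resample missed a class
    return st;
}

// Build numTrees stumps using numThreads worker threads.
std::vector<Stump> trainForest(const std::vector<Sample>& data,
                               std::size_t numTrees, unsigned numThreads) {
    std::vector<Stump> forest(numTrees);
    if (numThreads == 0) numThreads = 1;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            // Thread t builds trees t, t + numThreads, ...; no two threads
            // ever write to the same slot of the forest.
            for (std::size_t i = t; i < numTrees; i += numThreads)
                forest[i] = trainStump(data, static_cast<unsigned>(i) + 1u);
        });
    }
    for (std::thread& w : workers) w.join();
    return forest;
}

// Majority vote over all stumps.
int predictForest(const std::vector<Stump>& forest, double x) {
    std::size_t votes = 0;
    for (const Stump& s : forest) votes += s.predict(x);
    return 2 * votes > forest.size() ? 1 : 0;
}
```

Because each seed is derived from the tree index rather than from the thread, the resulting forest is the same whatever thread count is used. Whether the same pattern can be applied safely on top of the OpenCV 3.x train method (for example, by training several smaller forests on separate threads and combining their votes) is an untested assumption here.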

If there is something I am missing, please help. I really can't make do with a single core for my huge data sets.


Comments

The old 2.4 version had this, but there is no such thing in the current 3.3.

berak (2017-08-23 01:07:37 -0500)

I was afraid that was the case :-/ It also seems that the parallelisation they had before was only for the best split determination and not to build each tree in parallel (although split computation is probably the most expensive part). I wonder why it was removed. I'm very disappointed.

patrickmarais (2017-08-23 04:24:20 -0500)

Instead of being disappointed, put on your working shoes, grab the code from 2.4, force it on the 3.3 branch, fix possible issues and supply a PR. It will make your life and that of many others a lot better :)

StevenPuttemans (2017-08-23 09:42:19 -0500)