Enable Multithreading with TBB during cascade training
I have compiled OpenCV-2.4.1 with TBB support (-DWITH_TBB=ON) on Ubuntu, which currently ships with TBB 4.0. However, when I run opencv_traincascade to train a classifier, only one core is used. I've also explicitly checked /proc/[opencv_traincascade process number]/maps to see whether libtbb2 (which really is TBB 4.0) is loaded, and it is. I am using an AMD Athlon 64 X2 Dual Core 4200+ processor.
Is there a way to enforce the usage of 2 or more cores? Or is there a problem with TBB on AMD CPUs?
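For anyone who wants to repeat the /proc check described above, here is a minimal sketch. The function name tbb_libs_in_maps is made up for illustration, and the sketch assumes the usual Linux maps layout, where the sixth field is the backing file path. Note that it only shows whether the TBB library is mapped into the process, not whether the training code actually runs anything in parallel.

```shell
#!/bin/sh
# Hypothetical helper: list the TBB shared objects mapped into a process,
# given a maps file such as /proc/<pid>/maps.
# Prints nothing if no TBB library is loaded.
tbb_libs_in_maps() {
    # Field 6 of each maps line is the backing file path, if there is one.
    awk '$6 ~ /libtbb/ { print $6 }' "$1" | sort -u
}

# Example use (replace PID with the process number of opencv_traincascade):
# tbb_libs_in_maps "/proc/$PID/maps"
```

An empty result would mean the library is not loaded at all. A non-empty result, as in my case, only proves the library is linked in.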
Update: Since I can't comment directly on your answers, I'm updating my question.
- Thanks for your answers!
- @Daniil: I tried checking for HAVE_TBB in traincascade.cpp without success. However, checking my build, I found cvconfig.h, which does define HAVE_TBB, so I believe the libraries really are built with TBB support. I also checked their symbol table with nm, and the TBB symbols are there.
- @Maria: That's what I was afraid of. However, this discussion states that there should be an improvement using TBB?!
Rantanplan, I'm the author of the C++ cascade classifier in OpenCV, including the traincascade application. The improvement reported in that discussion can be explained by the switch to LBP features instead of Haar. LBPs are binary features, in contrast with Haar, and don't use floating-point arithmetic, so they are about 3 times faster in detection. An LBP cascade trains in about an hour, compared with about a day for a Haar cascade on the same data. traincascade was intentionally not TBB-optimized. It uses one optimization from MLL plus one minor optimization of feature precomputation, but together these give only a slight improvement. Try LBP!
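As a rough sketch of what switching to LBP looks like on the command line: opencv_traincascade selects the feature type with -featureType, which defaults to HAAR. The file names, sample counts, stage count, and window size below are placeholders, not values from this thread, and the train_cmd wrapper is only there so the command can be inspected before starting a long run.

```shell
#!/bin/sh
# Placeholder inputs; substitute your own data.
DATA_DIR="classifier"    # output directory for the cascade (assumed name, must exist)
VEC_FILE="samples.vec"   # positive samples from opencv_createsamples (assumed name)
BG_FILE="negatives.txt"  # text file listing background images (assumed name)

# Hypothetical wrapper: print the training command instead of running it.
# -w and -h must match the sample size used when the .vec file was created.
train_cmd() {
    echo opencv_traincascade \
        -data "$DATA_DIR" -vec "$VEC_FILE" -bg "$BG_FILE" \
        -numPos 1000 -numNeg 2000 -numStages 20 \
        -featureType LBP -w 24 -h 24
}

train_cmd
# To actually start training: train_cmd | sh
```

Everything except -featureType LBP would stay the same as for a Haar run, so the two training times can be compared directly on the same data.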
@Maria: Thanks for the clarification.