Enable Multithreading with TBB during cascade training

asked 2012-06-30 11:44:43 -0600

rantanplan

updated 2012-07-04 08:57:31 -0600


I have compiled OpenCV-2.4.1 with TBB support (-DWITH_TBB=ON) on Ubuntu, which currently ships with TBB 4.0. However, when I run opencv_traincascade to train a classifier, only one core is used. I have also explicitly checked via /proc/[opencv_traincascade process number]/maps whether libtbb2 (which really is version 4.0) is loaded, and it is. I am using an AMD Athlon 64 X2 Dual Core 4200+ processor.

Is there a way to enforce the usage of 2 or more cores? Or is there a problem with TBB on AMD CPUs?

Update: As I can't comment directly on your answers, I am updating my question.

Thanks for your answers!
@Daniil: I tried checking for HAVE_TBB in traincascade.cpp without success. However, checking my build, I found cvconfig.h, which actually defines HAVE_TBB, so I believe the libraries really are built with TBB support. I also checked their symbol tables with nm, and the TBB symbols are there.
@Maria: That's what I was afraid of. However, this discussion states that there should be an improvement when using TBB?!


Comments

Rantanplan, I'm the author of the C++ cascade classifier in OpenCV, including the traincascade application. The improvement reported in that discussion can be explained by the switch from Haar to LBP features. Unlike Haar features, LBPs are binary features and don't use floating-point arithmetic, so they are about 3 times faster in detection. An LBP cascade trains in about an hour, compared with about a day for a Haar cascade on the same data. traincascade was deliberately not TBB-optimized. It uses one optimization from MLL plus one minor optimization of feature precomputation, but these can give only a slight improvement. Try LBP!

Maria Dimashova ( 2012-07-04 08:55:01 -0600 )
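To illustrate the "binary features, no float arithmetic" point in the comment above, here is a minimal sketch of the basic 3x3 LBP operator. It is an assumption-laden illustration, not OpenCV's actual feature evaluator: the names Patch3x3 and lbpCode are made up for this example, and the cascade code in OpenCV may compute its LBP features differently (for example on block sums rather than on single pixels).

```cpp
#include <array>
#include <cstdint>

// Hypothetical type for this sketch: a 3x3 grayscale patch stored
// row-major, so index 4 is the center pixel.
using Patch3x3 = std::array<std::uint8_t, 9>;

// Basic 8-neighbour LBP code: one bit per neighbour, set when the
// neighbour is at least as bright as the center pixel.
// Only integer comparisons and bit operations are involved.
std::uint8_t lbpCode(const Patch3x3& p) {
    // Neighbours are visited clockwise, starting at the top-left corner.
    static const int order[8] = {0, 1, 2, 5, 8, 7, 6, 3};
    const std::uint8_t center = p[4];
    std::uint8_t code = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (p[order[bit]] >= center) {
            code |= static_cast<std::uint8_t>(1u << bit);
        }
    }
    return code;
}
```

The whole feature is a single byte built from eight comparisons, which is the kind of work that stays cheap without any floating-point math.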

@Maria: Thanks for the clarification.

rantanplan ( 2012-07-04 14:47:42 -0600 )


2 answers


answered 2012-07-04 06:33:30 -0600

Maria Dimashova

Even with TBB you will not see significant load on all CPU cores from the OpenCV traincascade application. Almost all the time, only one core will be working. That is because only a small part of the training code is parallelized with TBB: finding the best split of a tree node and precomputing some of the feature values before training a new stage. A significant share of the time, however, is spent looking for negative samples that all the stages trained so far (the trained part of the cascade) recognize as positive (face) samples, so that they can be used to train the next stage. This sample collection is not parallelized.


Comments

I am also having trouble with this :(((. Have you found a solution yet? Please tell me.

Robin Hood ( 2014-03-13 22:02:43 -0600 )

This is annoying as hell. Is there an OpenCV bug report/ticket that I can follow in order to get updates on this?

Silex777 ( 2014-03-18 10:30:29 -0600 )

Is there no way to run traincascade on the GPU yet?

muglikar ( 2014-09-07 08:22:22 -0600 )

http://iamsrijon.wordpress.com/2013/11/15/how-to-compile-opencv-to-utilize-multiple-core-processor-in-linux/

chebhou ( 2014-10-11 09:47:29 -0600 )

I don't think a GPU is needed. Parallelizing the negative sampling, at least with duplicates, would solve the main problem.

Loknar ( 2015-12-12 13:14:20 -0600 )
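The negative sampling step discussed in this thread can be sketched roughly as follows. Everything here is hypothetical and is not taken from traincascade: Sample is a plain int, Stage is an arbitrary predicate, and the function names are invented. The sketch only shows that the candidates can be cut into disjoint slices that are scanned independently; the slices run one after another here, and handing each one to its own thread is left as an assumption about what a parallel version could look like.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins: a sample is just an int, a stage is a predicate.
using Sample = int;
using Stage = std::function<bool(const Sample&)>;

// True when every trained stage accepts the sample, i.e. the cascade
// trained so far still mistakes this negative for a positive.
bool passesAllStages(const std::vector<Stage>& stages, const Sample& s) {
    for (const Stage& stage : stages) {
        if (!stage(s)) return false;  // rejected early: not a hard negative
    }
    return true;
}

// Scan candidates[begin, end) and keep the false positives.
std::vector<Sample> mineSlice(const std::vector<Stage>& stages,
                              const std::vector<Sample>& candidates,
                              std::size_t begin, std::size_t end) {
    std::vector<Sample> hard;
    for (std::size_t i = begin; i < end && i < candidates.size(); ++i) {
        if (passesAllStages(stages, candidates[i])) hard.push_back(candidates[i]);
    }
    return hard;
}

// Split the candidates into `workers` disjoint slices. Each mineSlice call
// touches only its own slice, so the calls do not depend on each other.
// They are executed sequentially in this sketch.
std::vector<Sample> mineHardNegatives(const std::vector<Stage>& stages,
                                      const std::vector<Sample>& candidates,
                                      std::size_t workers) {
    if (workers == 0) workers = 1;
    const std::size_t chunk = (candidates.size() + workers - 1) / workers;
    std::vector<Sample> all;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::vector<Sample> part =
            mineSlice(stages, candidates, w * chunk, (w + 1) * chunk);
        all.insert(all.end(), part.begin(), part.end());
    }
    return all;
}
```

Because the slices are disjoint, this variant produces no duplicates. A scheme in which workers draw negatives independently could return the same sample twice, which seems to be the trade-off the comment above is willing to accept.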

Hello @MariaDimashova, is it right that during the negative-stillPos-sample-search the bottleneck is HDD bandwidth, while during the actual training process the bottleneck is CPU and memory bandwidth? At the moment I'm training Haar (mode ALL) with about 45k positive samples, and the training part is very slow even in the first stages. It looks like the problem for me is CPU load, while only about 30% of the CPU is used. Are there any plans to multi-thread the feature-selection part?

Micka ( 2016-11-28 03:17:28 -0600 )


answered 2012-07-02 06:07:44 -0600

Daniil Osokin


updated 2012-07-03 04:27:44 -0600

Hi, Rantanplan!

TBB is independent of the processor; it can run on AMD (Intel FAQ).
I have seen a problem with TBB version 2.0 (solution). Try updating TBB to the current version, if staying on the old one isn't important to you.
For completeness: did you set -DWITH_TBB=ON when running cmake? This turns on TBB support.
You can also add this section to traincascade.cpp:

#ifdef HAVE_TBB
    printf("TBB is used\n");
#endif

If the printf is executed, TBB is OK.
