# traincascade: OpenMP vs Intel's TBB?

I have an Intel i5 processor with 8 GB RAM, running Ubuntu 14.04. I am working on cascade training with LBP features on OpenCV 2.4.9. Training takes a very long time. After waiting a week for a cascade to train, it is really painful to see it not working correctly and to figure out that it needs to be trained on more samples.

I tried installing OpenCV with TBB (Threading Building Blocks), with no notable improvement in training time. What else can I do to make training more time-efficient?

I found a link, https://iamsrijon.wordpress.com/2013/..., demonstrating the use of OpenMP. Is OpenMP better than TBB? Is there a tutorial I can use for reference? Any help would be much appreciated.


Okay, let's wrap some things up. The biggest misunderstanding is that the traincascade algorithm is fully parallelizable. It is not: large parts of the boosting process are simply single-core sequential code. Only collecting the feature values and computing the corresponding weak classifiers can be done multithreaded.

This is where the new concurrency API comes into play: it selects whichever backend is available on your system, either TBB or OpenMP, and both should give about the same performance.
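To see why the choice of backend matters so little, here is a minimal sketch of Amdahl's law in Python. The parallel fractions below are hypothetical values picked for illustration, not measurements of traincascade; the function name is also made up for this example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on the speedup when only `parallel_fraction`
    of the single-core runtime can be spread over `cores`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical fractions of the training time that run multithreaded.
    for fraction in (0.2, 0.5, 0.8):
        speedup = amdahl_speedup(fraction, cores=4)
        print(f"parallel part {fraction:.0%}, 4 cores: at most {speedup:.2f}x faster")
```

Whatever the real fraction is, the sequential boosting code caps the gain, so swapping TBB for OpenMP (or the reverse) should not change the picture much.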

If your training takes long, it can be for one of the following reasons:

• Your training data is complex: finding weak classifiers that can separate your data with the accuracy you asked for is difficult.
• Your training data is too big. Larger resolutions need more memory, so if you do not increase the calculation buffers, training slows down significantly. Try raising -precalcValBufSize and -precalcIdxBufSize. With 8 GB of memory I would certainly set them to 2048 MB each!

Here it is: opencv_traincascade -data data -vec object.vec -bg bg.txt -numPos 350 -numNeg 500 -numStages 10 -w 100 -h 40 -featureType LBP -maxFalseAlarmRate 0.3 -minHitRate 0.95 -precalcValBufSize 2048 -precalcIdxBufSize 2048
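As a rough reading of what this command asks for: the traincascade documentation estimates the overall hit rate as minHitRate^numStages and the overall false alarm rate as maxFalseAlarmRate^numStages. A small Python sketch of that arithmetic follows; the function name is made up for this example, and the 0.5 comparison value is the default false alarm rate mentioned in this thread.

```python
def cascade_targets(min_hit_rate, max_false_alarm_rate, num_stages):
    """Estimate the overall hit rate and false alarm rate of a cascade
    from the per-stage targets, assuming the stages multiply."""
    overall_hit_rate = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm_rate ** num_stages
    return overall_hit_rate, overall_false_alarm


if __name__ == "__main__":
    # Values taken from the command above, plus 0.5 for comparison.
    for false_alarm in (0.3, 0.5):
        hit, fa = cascade_targets(0.95, false_alarm, num_stages=10)
        print(f"maxFalseAlarmRate={false_alarm}: hit rate ~{hit:.3f}, false alarms ~{fa:.2e}")
```

With 10 stages, a per-stage hit rate of 0.95 compounds to an overall hit rate of only about 0.6, so it may be worth raising -minHitRate rather than tightening -maxFalseAlarmRate.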

( 2015-03-20 05:52:10 -0500 )

Why -maxFalseAlarmRate 0.3? It is one of the reasons your training takes longer: a stricter false alarm rate per stage needs more weak classifiers per stage. The default of 0.5 only asks each stage to do a bit better than random guessing on the negatives. Keep in mind that using -w 100 -h 40 will yield a LOT of features, since 24x24 pixels already yields about 35000 features and the number grows very quickly with the window size. Why not reduce your size to -w 50 -h 20? That lets you detect smaller objects and reduces training time.
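To get a feel for how the feature pool scales with the window size, here is a minimal Python sketch. It assumes block LBP features are enumerated as every 3x3 grid of equally sized cells that fits inside the window, which is my understanding of how the LBP feature generator in traincascade works; treat that enumeration, and the function name, as assumptions. The absolute counts depend on the feature type and will not match the rough figure quoted above, so the ratio between window sizes is the useful part.

```python
def count_block_lbp_features(win_w, win_h):
    """Count 3x3-cell block features that fit in a win_w x win_h window.

    Assumed enumeration: every cell size (cell_w, cell_h) and every
    top-left position such that the 3x3 grid stays inside the window.
    """
    count = 0
    for cell_w in range(1, win_w // 3 + 1):
        for cell_h in range(1, win_h // 3 + 1):
            positions_x = win_w - 3 * cell_w + 1
            positions_y = win_h - 3 * cell_h + 1
            count += positions_x * positions_y
    return count


if __name__ == "__main__":
    for w, h in ((24, 24), (50, 20), (100, 40)):
        print(f"{w}x{h}: {count_block_lbp_features(w, h)} candidate features")
```

Under this assumption, halving both dimensions from 100x40 to 50x20 cuts the candidate pool by a factor of about 16, which is where most of the training time saving would come from.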

( 2015-03-20 06:57:01 -0500 )

I tried using OpenMP with OpenCV. It let the training use all the cores of my processor, unlike before, when only a single core was involved. As expected, this made the training process faster. https://iamsrijon.wordpress.com/2013/...

All the cores are being used for the training process.


Which object do you want to detect?

If it takes so much time, I think the problem is in the kind of images you use for the positive / negative lists (maybe too many, maybe too similar).

You can also compare your log with this one: https://stackoverflow.com/questions/16058080/how-to-train-cascade-properly


I have 500 negative images and 425 positive images (all used for the .vec file), of which 350 are used for training. These images are taken from a repository maintained specifically for cascade training.

I am trying to figure out two things: is OpenMP supported by OpenCV 2.4.9, and does it perform better than TBB?

Initially only one core was involved in training. After extensive research, I used TBB. All cores took part in training, but only for the first 3 stages, after which only one core was involved. Surprisingly, after killing the process midway, deleting the generated files and restarting the training, the cores repeated exactly the same pattern. I don't really know whether that has any significance.

( 2015-03-18 07:03:08 -0500 )

What is the resolution of the positive and negative images?

In my opinion, choosing between OpenMP and TBB will not solve the problem because, as far as I know, only a small part of the algorithm is parallelized (http://answers.opencv.org/question/63/enable-multithreading-with-tbb-during-cascade-training/?answer=95#post-id-95).

My guess is that the long training time comes from something in the image dataset.

( 2015-03-18 11:53:18 -0500 )

All the images (positive and negative) have a resolution of 100 x 40. It has been 48 hours since the training process started and it is still on the 4th stage. How much time should it ideally take for these samples at this resolution?

( 2015-03-19 00:05:42 -0500 )

I trained a cascade classifier using LBP features and it took maybe 48 hours or a little less (on a comparable machine); I don't remember exactly.

Some parameters: number of positive images = 1200; number of negative images = 775; resolution of positive images = 64x64; resolution of negative images = 640x480 for most of the images.

The processing time will of course depend on the number and resolution of the images. The other important factor, I think, is the composition of the positive / negative sets. I need to learn more about cascade classification, but my understanding is that if the negative images look too much alike, it is harder to extract relevant features that classify correctly.

( 2015-03-19 11:56:40 -0500 )

I use LBP for training too. Maybe I should rework my image database and add more samples to get better results.

( 2015-03-20 01:24:27 -0500 )
