TBB parallel_for vs std::thread

Hi,

I'm starting with parallel processing in OpenCV and wonder why I should use parallel_for (from TBB) instead of just using multiple std::threads. As I understand the parallel_for functionality, you have to create a class extending cv::ParallelLoopBody with a method signature of void operator()(const cv::Range& range). This is where the processing then happens. But you cannot pass any arguments to this function, nor can you parametrise your parallel function in any way. All you have is the range assigned to your thread, and the arguments you passed to the cv::ParallelLoopBody instance, which are the same for each thread. So you'd have to sort out your arguments with that range, e.g. passing a vector of images to the cv::ParallelLoopBody instance and then using the range to extract the one you need. You'd have to do so for every single parameter that is thread-dependent.
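To make the "sort out your arguments with that range" pattern concrete, here is a minimal sketch using only the standard library. Range, parallel_for_range, TaskParams and run_all are hypothetical stand-ins I made up for illustration; they only mimic the shape of cv::Range and a range-based loop body and are not OpenCV's or TBB's actual implementation.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for cv::Range: a half-open interval [start, end).
struct Range {
    int start;
    int end;
};

// Hypothetical stand-in for a range-based parallel loop: splits `whole`
// into contiguous stripes and runs `body` on each stripe in its own thread.
void parallel_for_range(const Range& whole,
                        const std::function<void(const Range&)>& body,
                        int num_threads) {
    const int total = whole.end - whole.start;
    if (total <= 0 || num_threads <= 0) return;
    const int chunk = (total + num_threads - 1) / num_threads;  // ceiling division
    std::vector<std::thread> workers;
    for (int begin = whole.start; begin < whole.end; begin += chunk) {
        const Range stripe{begin, std::min(begin + chunk, whole.end)};
        workers.emplace_back([&body, stripe] { body(stripe); });
    }
    for (std::thread& w : workers) w.join();
}

// Thread-dependent parameters live in one vector; the loop index selects them.
struct TaskParams {
    int input;
    int factor;
};

std::vector<int> run_all(const std::vector<TaskParams>& params, int num_threads) {
    std::vector<int> results(params.size());
    parallel_for_range(
        Range{0, static_cast<int>(params.size())},
        [&](const Range& r) {
            for (int i = r.start; i < r.end; ++i) {
                // Each index writes only its own slot, so no locking is needed.
                results[i] = params[i].input * params[i].factor;
            }
        },
        num_threads);
    return results;
}
```

Calling run_all(params, 3) on five parameter sets would process them in three stripes. The body itself takes no extra arguments; everything thread-dependent is looked up through the index.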

So what's the benefit then compared to threads? I can bind any arbitrary function with (almost) arbitrary parameters with boost or C++11, without creating new classes for each task to be parallelized. For this purpose I wrote a very primitive thread pool manager (.hpp, .cpp). Anything wrong with that?
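For comparison, a rough sketch of the binding approach with plain C++11. This is not the thread pool manager linked above (its code is not reproduced here); scale, add3, run_tasks and demo_bound_tasks are hypothetical names, and the "pool" is just a fixed set of threads pulling task indices from an atomic counter.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical task functions with different signatures.
void scale(int value, int factor, int& out) { out = value * factor; }
void add3(int a, int b, int c, int& out) { out = a + b + c; }

// Runs every task exactly once, using `num_workers` threads that fetch the
// next task index from a shared atomic counter (simple dynamic balancing).
void run_tasks(const std::vector<std::function<void()>>& tasks,
               unsigned num_workers) {
    std::atomic<std::size_t> next{0};
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < num_workers; ++w) {
        workers.emplace_back([&tasks, &next] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= tasks.size()) break;
                tasks[i]();
            }
        });
    }
    for (std::thread& t : workers) t.join();
}

std::vector<int> demo_bound_tasks() {
    std::vector<int> results(3, 0);
    std::vector<std::function<void()>> tasks;
    // std::bind fixes the arguments up front; std::ref passes the output slot
    // by reference instead of copying it into the bound object.
    tasks.push_back(std::bind(scale, 6, 7, std::ref(results[0])));
    tasks.push_back(std::bind(add3, 1, 2, 3, std::ref(results[1])));
    // A lambda works just as well as a bound function.
    tasks.push_back([&results] { results[2] = -1; });
    run_tasks(tasks, 2);
    return results;
}
```

Each task writes to its own element of results, so the tasks do not need any synchronization among themselves. Note that with num_workers set to 0 this sketch runs nothing.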

cheers, stfn

P.S. I'm not a threading expert. I know there are memory access concerns when the functions I'm threading use the same memory for writing. Reading is not the problem, but when two functions write simultaneously, e.g. to the same Mat, what happens apart from probably corrupted data due to race conditions? Is caching triggered, forcing the data to be up to date before writing? More generally: what do I need to take care of in terms of performance and data safety? Are those pitfalls already taken care of in TBB, and is this why it is used in OpenCV?
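To illustrate the kind of write conflict I mean: as far as I understand the C++11 memory model, two threads writing to the same non-atomic location without synchronization is a data race, and the behaviour is then undefined rather than merely "some corrupted values". The sketch below shows two ways around it that I know of, a shared std::atomic and one private result slot per thread. The function names are hypothetical and nothing here is specific to OpenCV or TBB.

```cpp
#include <atomic>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Shared counter: std::atomic makes concurrent increments well defined.
long count_with_atomic(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return counter.load();
}

// No shared writes at all: each thread sums a strided subset of the input
// into a local variable and stores it once into its own slot.
long sum_with_private_slots(const std::vector<int>& data, int num_threads) {
    if (num_threads <= 0) return 0;
    const std::size_t n = static_cast<std::size_t>(num_threads);
    std::vector<long> partial(n, 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n; ++t) {
        workers.emplace_back([&data, &partial, t, n] {
            long local = 0;  // thread-private accumulator
            for (std::size_t i = t; i < data.size(); i += n) {
                local += data[i];
            }
            partial[t] = local;  // single write to this thread's own slot
        });
    }
    for (std::thread& w : workers) w.join();
    // Combine only after all threads have finished.
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

The second variant is usually the one I would reach for first in image processing, since each thread can own a disjoint set of rows or tiles and no locking is required.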

EDIT: I ended up using tbb::task_group for parallelization and load balancing. Works like a charm.