Block matching CPU utilization - 2 processes faster than 2 threads

Hi,

I have an algorithm that uses the OpenCV block matching algorithm internally to calculate the disparity map between 2 separate stereo channels. Everything is working fine, results-wise.

However, I have not yet understood why CPU utilization is so low, because the algorithm is highly parallel as far as I can see. On a modern i7 with 8 threads, I see ~30% CPU utilization.

But the specific problem I have is weirder than that. I have 2 stereo channels (4 cameras overall). Channel 1 runs on the main thread (or on a worker thread; it makes no difference), and channel 2 runs on a worker thread.

CPU utilization this way is ~30%.
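
For reference, the threaded setup looks roughly like the sketch below. `processChannel` is a hypothetical placeholder for the real per-channel block matching call, and all other names are illustrative as well, not taken from the actual code. It only shows one worker thread per channel with per-channel wall-clock timing.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for one stereo channel's disparity computation.
// In the real program this would be the OpenCV block matching call.
std::uint64_t processChannel(int channelId, int frames) {
    std::uint64_t acc = static_cast<std::uint64_t>(channelId) + 1;
    for (int f = 0; f < frames; ++f) {
        for (std::uint64_t i = 0; i < 200000; ++i) {
            // Arbitrary arithmetic to keep the core busy (unsigned wraparound is well defined).
            acc = acc * 6364136223846793005ULL + 1442695040888963407ULL;
        }
    }
    return acc;
}

struct ChannelResult {
    std::uint64_t checksum = 0;  // result of the placeholder workload
    double seconds = 0.0;        // wall-clock time spent on this channel
};

// Runs one channel and records how long it took.
void timedChannel(int channelId, int frames, ChannelResult& out) {
    auto start = std::chrono::steady_clock::now();
    out.checksum = processChannel(channelId, frames);
    auto stop = std::chrono::steady_clock::now();
    out.seconds = std::chrono::duration<double>(stop - start).count();
}

// Runs every channel on its own worker thread and waits for all of them.
std::vector<ChannelResult> runChannelsThreaded(int channels, int frames) {
    std::vector<ChannelResult> results(static_cast<std::size_t>(channels));
    std::vector<std::thread> workers;
    workers.reserve(results.size());
    for (int c = 0; c < channels; ++c) {
        workers.emplace_back(timedChannel, c, frames,
                             std::ref(results[static_cast<std::size_t>(c)]));
    }
    for (auto& w : workers) {
        w.join();
    }
    return results;
}
```

Calling `runChannelsThreaded(2, frames)` gives the per-channel times for the single-process case; running the same `timedChannel` call once in each of two separate executables gives the numbers to compare against.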

But here is the really weird part: if I build 2 separate EXEs, one for each channel, and run both EXEs in parallel, I see better results per channel, and CPU utilization doubles to ~60%.

I wonder how that is even possible. Threads should be faster than processes as far as I know.

The funny thing is that if I add a 3rd EXE that runs both channels alongside the 2 EXEs, with all three running at the same time, CPU utilization goes up to ~85%, but I see a ~35% reduction in the time it takes for the 2-channel process to run (but I guess that is expected).

Thanks for any help!