Block matching CPU utilization - 2 processes faster than 2 threads

asked 2019-05-06 07:38:56 -0500

Yoni

Hi,

I have an algorithm that internally uses the OpenCV block matching algorithm to calculate the disparity map for each of 2 separate stereo channels. Everything is working fine, results-wise.

However, I have not yet understood why CPU utilization is so low, because the algorithm is highly parallel as far as I can see. On a modern i7 with 8 threads, I see ~30% CPU utilization.

But the specific problem I have is weirder than that. I have 2 stereo channels (4 cameras overall):

Channel 1 running on the main thread (or a worker thread, it makes no difference)

Channel 2 running on a worker thread

CPU utilization this way is ~30%.

But here is the really weird issue: if I build 2 separate EXEs, 1 EXE for each channel, and run both EXEs in parallel, I see better results per channel, and CPU utilization doubles to ~60%.
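To put the percentages in perspective, here is a minimal sketch that converts an overall utilization figure into the equivalent number of fully busy logical CPUs. It assumes the percentage is averaged over all 8 logical CPUs, as a typical task manager reports it; the function name `busy_logical_cpus` is made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: convert an overall CPU utilization percentage
// (assumed to be averaged over all logical CPUs) into the equivalent
// number of fully busy logical CPUs.
double busy_logical_cpus(double utilization_percent, int logical_cpus) {
    assert(logical_cpus > 0);
    assert(utilization_percent >= 0.0 && utilization_percent <= 100.0);
    return utilization_percent / 100.0 * logical_cpus;
}
```

Under that assumption, `busy_logical_cpus(30.0, 8)` is about 2.4 and `busy_logical_cpus(60.0, 8)` is about 4.8, so the two-thread build keeps roughly 2.4 of the 8 logical CPUs busy, while the two separate EXEs together keep roughly 4.8 busy.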

I wonder how that is even possible. Threads should be faster than processes as far as I know.

The funny thing is that if I add a 3rd EXE running both channels alongside the 2 EXEs, all running at the same time, CPU utilization then reaches ~85%, but I see a ~35% reduction in the time it takes for the 2-channel process to run (but I guess that is expected).

Thanks for any help!

Comments

OpenCV already uses massive data parallelization internally. If not all of your cores are maxed out already, the devs there are doing something wrong.

Usually, it's not worthwhile or feasible to add task parallelization on top of it.

berak (2019-05-06 08:27:18 -0500)

I see. But honestly, I don't know if I should expect 100% CPU usage with the OpenCV block matching algorithm. Even in a standalone, command-line program that does nothing but prepare an OpenCV BM instance and run 2 static stereo images through it, CPU utilization is only ~15% (but performance is great). So I believe that OpenCV BM has at least some sequential limitations. But still, that doesn't explain how 2 processes can utilize more CPU than 2 threads doing roughly the same main task.

Yoni (2019-05-07 00:41:28 -0500)
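One rough way to reason about the "sequential limitations" guess is Amdahl's law. The sketch below is a back-of-the-envelope model, not a measurement of OpenCV: it assumes the work is purely CPU-bound, that a fixed fraction `p` of it scales perfectly over `n` logical CPUs while the rest runs on one, and that the reported utilization is averaged over all `n` logical CPUs. All function names are made up for illustration.

```cpp
#include <cassert>

// Amdahl's law: speedup on n workers when a fraction p of the work is
// perfectly parallel and the remaining (1 - p) is sequential.
double amdahl_speedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0 && n > 0);
    return 1.0 / ((1.0 - p) + p / n);
}

// Average utilization across n CPUs under the same model (0..1).
// Total CPU work is constant, so utilization = speedup / n.
double amdahl_utilization(double p, int n) {
    return amdahl_speedup(p, n) / n;
}

// Inverse of the above: the parallel fraction p implied by an observed
// average utilization u (0..1) on n > 1 CPUs. Only meaningful for
// u >= 1/n, since even one fully busy CPU already gives u = 1/n.
double parallel_fraction_from_utilization(double u, int n) {
    assert(n > 1);
    assert(u >= 1.0 / n && u <= 1.0);
    return (1.0 - 1.0 / (n * u)) / (1.0 - 1.0 / n);
}
```

Under these assumptions, `parallel_fraction_from_utilization(0.15, 8)` comes out at about 0.19, i.e. the ~15% figure would correspond to only about a fifth of the single-process work running in parallel. Memory stalls, hyper-threading, and thread-pool behavior are all ignored here, so this is an illustration of the reasoning rather than an explanation of the thread-versus-process difference.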