
Poor OpenCL performance

asked 2014-02-13 01:59:32 -0500

qbeart

updated 2014-02-14 15:07:20 -0500

berak

Hi, I am trying to run the detectMultiScale function on the GPU using the OpenCL module. It is supposed to be faster, but it is not; in fact it is 3-4 times slower than the CPU implementation. I tested it on both an Intel HD Graphics 4000 and an NVIDIA GT650M and got the same result. Has anyone run into the same problem, and is there a solution?

OpenCV version :



It would be nice to know which CPU you are comparing those GPUs with. The GPUs you mention aren't really powerful ones.

Moster ( 2014-02-15 07:28:02 -0500 )

1 answer


answered 2014-02-14 13:37:12 -0500

updated 2014-02-14 13:38:09 -0500

Hello, I ran detectMultiScale with CUDA on a Dell T7600 with 2 CPUs (4 cores each, 1.8 GHz) and a Quadro 4000, and I also ran the TBB CPU version of that function. My results differ from yours:

+ The TBB (4.2 update 2) CPU version used only about 60% of the CPU resources and ran at 14 fps.
+ The CUDA (5.5) version used just 1 CPU core and reached 24 fps.

I used OpenCV 3.0.0-dev built with VS 2012 update 4 on Windows 7 32-bit, with CUDA 5.5. I think you should build OpenCV with CUDA yourself to get better results.
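For anyone comparing frame rates like the ones above, here is a minimal sketch of a timing harness that could be wrapped around both the CPU and the GPU call. The names `measure_fps` and `speedup` are hypothetical helpers, not part of OpenCV; the detectMultiScale call itself would go inside the callable passed as `work`. The warm-up loop assumes that the first GPU calls include one-time setup (for example OpenCL kernel compilation or CUDA context creation), which can make a naive single-run timing look much slower than steady-state performance.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not an OpenCV function): average frames per second
// of `work`, measured after `warmup_runs` untimed calls.
double measure_fps(const std::function<void()>& work,
                   int warmup_runs, int timed_runs) {
    assert(warmup_runs >= 0 && timed_runs > 0);

    // Untimed calls: absorb one-time setup cost (assumption: kernel
    // compilation, context creation, first memory transfers).
    for (int i = 0; i < warmup_runs; ++i) work();

    // Timed calls: steady_clock is monotonic, so it suits interval timing.
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timed_runs; ++i) work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return timed_runs / elapsed.count();
}

// Ratio of two frame rates, e.g. GPU fps divided by CPU fps.
double speedup(double fps_a, double fps_b) { return fps_a / fps_b; }
```

With the figures reported above, `speedup(24.0, 14.0)` comes out at roughly 1.7, i.e. the CUDA path was about 1.7 times faster than the TBB path on that machine. Running the same harness on the OpenCL path with and without warm-up runs would show whether setup cost explains part of a slowdown.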

