How to run OpenCV DNN on NVidia GPU

asked 2018-10-19 03:38:07 -0500

gradlaserb


I want to use my Nvidia GTX 1060 GPU to run my DNN code with OpenCV. I tried the CPU first, but it is far too slow, so I changed this line:




After that, I get this message: [ WARN:0] DNN: OpenCL target is not supported with current OpenCL device (tested with Intel GPUs only), switching to CPU.

How can I solve this problem? I have OpenCV 3.4.3.

1 answer

answered 2018-10-19 04:03:54 -0500

berak

updated 2018-10-19 04:26:54 -0500

using 3.4.3, you can't do much about it. situation has improved a bit on master branch, but imho, it's still WIP.

you could try to set the env var: OPENCV_DNN_OPENCL_ALLOW_ALL_DEVICES=1 (maybe that gets you around it, but i'm only guessing)
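A minimal sketch of that suggestion in Python, assuming the variable has to be in the environment before OpenCV reads its configuration (setting it before importing cv2 is the conservative choice). `load_net_opencl` and its path arguments are hypothetical names used only for illustration, and whether this actually gets the GTX 1060 accepted is just as much a guess here as in the answer.

```python
import os

# Set the variable before cv2 is imported, so OpenCV sees it
# whenever it reads its configuration.
os.environ["OPENCV_DNN_OPENCL_ALLOW_ALL_DEVICES"] = "1"


def load_net_opencl(model_path, config_path=""):
    """Load a network and request the OpenCL target (hypothetical helper)."""
    import cv2  # imported here so the env var above is already in place

    net = cv2.dnn.readNet(model_path, config_path)
    # Use OpenCV's own dnn implementation with the OpenCL target.
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL)
    return net
```

If the warning from the question still shows up after this, the target has been switched back to CPU and the variable did not help on that device.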

apart from that, all you can do now is -- try a different model / network / architecture.

if it's a detection one, you can also try to use smaller input windows (e.g. 150x150 instead of 300x300)
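To make the input-size suggestion concrete: halving each side of the input cuts the pixel count to a quarter, and convolution cost scales roughly with that. The sketch below assumes a detector that accepts variable input sizes; `pixel_ratio` and `make_blob` are hypothetical helpers, and the scale factor, mean and channel order passed to `blobFromImage` are placeholders that depend on the model.

```python
def pixel_ratio(small, large):
    """Fraction of input pixels left when shrinking from `large` to `small`."""
    return (small[0] * small[1]) / (large[0] * large[1])


def make_blob(image, size=(150, 150)):
    """Build a network input blob at a reduced resolution (hypothetical helper)."""
    import cv2

    # scalefactor, mean and swapRB are model specific; these are placeholders.
    return cv2.dnn.blobFromImage(
        image, scalefactor=1.0, size=size, mean=(0, 0, 0), swapRB=True, crop=False
    )


ratio = pixel_ratio((150, 150), (300, 300))  # 0.25
```

Smaller inputs usually cost some accuracy, especially on small objects, so it is worth checking detections at the reduced size before committing to it.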


Ok, does that mean that Yolov3 (which has been added to OpenCV) cannot use cuDNN for maximum speed? If not, are there plans to add this support?

AlexTheGreat ( 2018-10-19 05:00:04 -0500 )

@AlexTheGreat, no idea about cuDNN, but there is no CUDA support in opencv's dnn module, and no plan to add it.

berak ( 2018-10-19 05:03:50 -0500 )

Hm, that's a bit surprising though, because OpenCV already advertises CUDA support, and using CUDA and cuDNN in the OpenCV DNN implementation would be a natural step forward. Or am I missing something?

AlexTheGreat ( 2018-10-19 05:41:57 -0500 )

(don't look at outdated 2.4, which was frozen 5 years ago !)

berak ( 2018-10-19 05:43:41 -0500 )

We've merged a PR that lets you run networks with OpenCL without extra flags:

dkurt ( 2018-10-19 10:53:52 -0500 )

I see, thanks. But to come back to the original question, because I am still not clear about it: does that mean we can somehow accelerate the DNN implementation in OpenCV, including YOLO, with a GPU (Intel, NVidia)?

AlexTheGreat ( 2018-10-20 03:43:05 -0500 )

@AlexTheGreat -- try with latest 3.4 or master branch (NOT any releases !) and

berak ( 2018-10-20 03:58:32 -0500 )

I use OpenCV 4.1.1 on an Nvidia Tegra Nano, compiled with CUDA support. I compiled Darknet with CUDA and cuDNN support as well. Still, running net.setPreferableTarget(DNN_TARGET_OPENCL); net.forward(...); drives all CPU cores to 100%, then fills the swap memory, then freezes the system.

Update: the Nvidia Nano does not support OpenCL :-(

YuriiChernyshov ( 2019-07-21 09:21:28 -0500 )
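One way to avoid requesting OpenCL on a board that lacks it is to ask OpenCV first. The sketch below uses `cv2.ocl.haveOpenCL()` for that; `choose_target_name` and `apply_target` are hypothetical helpers, and a `True` result only means an OpenCL runtime was found, not that the dnn module will accept the device (the warning in the original question shows it can still switch back to CPU).

```python
def choose_target_name(have_opencl):
    """Pick the name of the dnn target constant to request."""
    return "DNN_TARGET_OPENCL" if have_opencl else "DNN_TARGET_CPU"


def apply_target(net):
    """Set the preferable target on `net` based on OpenCL availability."""
    import cv2

    name = choose_target_name(cv2.ocl.haveOpenCL())
    # Look the constant up by name on cv2.dnn and apply it to the network.
    net.setPreferableTarget(getattr(cv2.dnn, name))
    return name
```

Calling `apply_target(net)` before `net.forward(...)` returns the name of the target that was requested, which is handy to log when comparing devices.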

I confirm the same behavior as of today (OpenCL + Jetson Nano).

stiv-yakovenko ( 2019-08-25 16:24:57 -0500 )

@stiv-yakovenko you can perform inference on Jetson using

Yashas ( 2019-09-02 02:22:51 -0500 )