I am using the OpenCV DNN module for object detection. When I set the network's preferable target to OpenCL, the forward pass becomes roughly 10x slower (on Windows as well as on embedded platforms). The GPU device is detected and reports OpenCL support (OpenCL C 1.2). What could be causing this slowdown?
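
For reference, here is a minimal sketch of how the two targets could be timed against each other. It times the first forward pass separately from the following ones, on the assumption that the first OpenCL call may include one-time setup such as kernel compilation. The helper names `time_forward` and `benchmark_targets`, and the `model_path`, `config_path` and `blob` arguments, are placeholders for illustration and not the actual code or model in use.

```python
import time


def time_forward(run_once, runs=10):
    """Time the first call on its own, then average the next `runs` calls.

    Returns (first_call_seconds, mean_seconds_of_later_calls).
    """
    t0 = time.perf_counter()
    run_once()
    first = time.perf_counter() - t0

    t0 = time.perf_counter()
    for _ in range(runs):
        run_once()
    mean = (time.perf_counter() - t0) / runs
    return first, mean


def benchmark_targets(model_path, config_path, blob, runs=10):
    """Compare CPU and OpenCL targets on the same network and input blob.

    model_path, config_path and blob are hypothetical inputs: substitute
    the real model files and a blob built with cv2.dnn.blobFromImage.
    """
    import cv2  # imported here so time_forward stays usable without OpenCV

    # Report whether OpenCV itself sees OpenCL and has it enabled.
    print("haveOpenCL:", cv2.ocl.haveOpenCL(), "useOpenCL:", cv2.ocl.useOpenCL())

    targets = [
        ("cpu", cv2.dnn.DNN_TARGET_CPU),
        ("opencl", cv2.dnn.DNN_TARGET_OPENCL),
    ]
    results = {}
    for name, target in targets:
        # Load a fresh network per target so no state is shared.
        net = cv2.dnn.readNet(model_path, config_path)
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(target)

        def run_once():
            net.setInput(blob)
            net.forward()

        results[name] = time_forward(run_once, runs)
    return results
```

If only the first OpenCL call is slow and the later average is close to the CPU figure, the slowdown is probably start-up cost. If every call is slow, the cause is more likely elsewhere, for example layers falling back to the CPU or the cost of moving data to and from the device.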