OpenCL target for DNN slower than CPU
- OpenCV 4.0.1
- Intel Core i5-7400, compute-runtime 19.05.12254
Using the openpose.py example with the HAND dataset.
The CPU target takes about 850 ms per frame, the OpenCL target about 1.4 s. I measure with the time module because net.getPerfProfile() reports misleading inference times. Enabling OpenCL SVM does not help.
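For reference, the wall-clock comparison described above could be sketched roughly like this. This is not the exact script behind the numbers: `time_calls` and `compare_targets` are hypothetical helper names, the three path parameters are placeholders, and the 368x368 input size is assumed from the sample's defaults. A few warm-up passes are discarded because the first forward passes on the OpenCL target may include one-time setup such as kernel compilation.

```python
import time
from statistics import median


def time_calls(run, warmup=3, repeats=10):
    """Return the median wall-clock time of run() in milliseconds."""
    for _ in range(warmup):
        run()  # discarded: early calls may include one-time setup costs
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def compare_targets(proto_path, model_path, image_path):
    """Time a forward pass on the CPU and OpenCL targets (needs opencv-python)."""
    import cv2  # imported lazily so time_calls() is usable without OpenCV

    frame = cv2.imread(image_path)
    # Input size is an assumption taken from the sample's default settings.
    blob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (368, 368),
                                 (0, 0, 0), swapRB=False, crop=False)
    targets = (("cpu", cv2.dnn.DNN_TARGET_CPU),
               ("opencl", cv2.dnn.DNN_TARGET_OPENCL))
    results = {}
    for name, target in targets:
        net = cv2.dnn.readNetFromCaffe(proto_path, model_path)
        net.setPreferableTarget(target)

        def run(net=net):
            net.setInput(blob)
            net.forward()

        results[name] = time_calls(run)
    return results
```

Calling `compare_targets(...)` with the model files used by the sample returns a dict with one median time in milliseconds per target.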
Is this simply how it is, or is there some trick to speed the computation up? How come the GPGPU, so broadly advertised (by Intel, no blame on OpenCV), is actually slower than the vectorized CPU version?