Hi!
I want to match a number of templates against an image. To speed this up I tried using OpenCL (via UMat), but the opposite happens: it gets slower.
Here is a small example:
import cv2
import numpy as np
# np.random.random_integers is deprecated; randint's upper bound is exclusive, so 256 gives values 0..255
img = np.uint8(np.random.randint(0, 256, (500, 500)))  # random image data
tpls = [np.uint8(np.random.randint(0, 256, (20, 20))) for i in range(100)]  # 100 random templates
# OPENCL off
s = []
a = cv2.getTickCount()
s = [cv2.matchTemplate(img,tpl,cv2.TM_CCORR_NORMED) for tpl in tpls]
b = cv2.getTickCount()
print((b-a)/cv2.getTickFrequency())
img = cv2.UMat(img) # convert img to UMat
tpls = [cv2.UMat(tpl) for tpl in tpls] # convert templates to UMat
# OPENCL on
s = []
a = cv2.getTickCount()
s = [cv2.matchTemplate(img,tpl,cv2.TM_CCORR_NORMED) for tpl in tpls]
b = cv2.getTickCount()
print((b-a)/cv2.getTickFrequency())
The results are as follows (total time in seconds for all 100 templates):
Hardware: Intel Core i7-8650U (CPU), Intel UHD Graphics 620, NVIDIA GeForce GTX 1050
Software: OpenCV 3.4.2 on Python 3.6.6
CPU: 0.342378 s
UHD 620: 0.8755508 s
GTX 1050: 0.6655146 s
Task Manager shows that the intended GPU is being used. (I set the environment variable OPENCV_OPENCL_DEVICE to :GPU:0 for the UHD 620 and to :GPU:1 for the GTX 1050.)
Any ideas on what I can do to get better performance? Could someone try my code and tell me whether they get similar results?
Thank you in advance
gw