UMat implementations slower than Mat
Hi. I have written an OpenCV-based application in C++ and successfully run it on Android using the NDK. It processes camera input. Right now, on a PC with a Core i5 using only the CPU, it runs at about 27 FPS from a 30 FPS input. On Android with a Snapdragon 616, it runs at about 6 FPS :) using only the CPU. I need to make it run faster, and I discovered that OpenCV 3 and later has the T-API with UMat, which runs on the GPU using OpenCL implementations. So I built OpenCV with the WITH_OPENCL flag and can successfully run with UMat. But now the FPS has dropped to about 10 on the PC with a GeForce GTX 650. Why? The OpenCV website says that UMat implementations are several times faster. Then I did an experiment: I ran 100 iterations of the functions that I use the most in my program and calculated the mean execution time of each. You can see the results attached; execution times are shown in ms (640x480). I have also attached the OpenCL part of the log from getBuildInformation.
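For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a mean-time loop. It is not the exact code behind the attached numbers. The helper name `mean_ms` and its `warmup` parameter are hypothetical and not part of OpenCV. The warm-up calls are there on the assumption that the first OpenCL call may include one-time setup cost that should not be counted in the mean.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not an OpenCV API): calls fn() `warmup` times without
// timing it, then `iterations` more times, and returns the mean wall-clock
// time per timed call in milliseconds.
template <typename Fn>
double mean_ms(Fn&& fn, std::size_t iterations, std::size_t warmup = 1) {
    // Untimed warm-up calls, so one-time setup cost is kept out of the mean.
    for (std::size_t i = 0; i < warmup; ++i) {
        fn();
    }
    if (iterations == 0) {
        return 0.0;  // nothing to average
    }
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> total = stop - start;
    return total.count() / static_cast<double>(iterations);
}

// Assumed usage with OpenCV (left as a comment so the block compiles without it):
//   cv::UMat src, dst;  // src filled with a 640x480 frame beforehand
//   double t = mean_ms([&] {
//       cv::cvtColor(src, dst, cv::COLOR_BGR2GRAY);
//       cv::ocl::finish();  // wait for queued OpenCL work before the clock stops
//   }, 100);
```

The same lambda with `cv::Mat` instead of `cv::UMat` (and without the `cv::ocl::finish()` call) would give the CPU figure for comparison.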
So my question is: have you experienced the same thing with UMat, or am I doing something wrong?