Hello,
I am trying to speed up my code using UMats and I encountered the following weird issue:
The Code:
Mat image = imread("/tmp/lenna_2048.png", IMREAD_COLOR);
UMat workMat2 = image.getUMat(ACCESS_READ);
UMat workMat = UMat::zeros(image.size(), CV_8UC1);
double tt = 0;
auto start_time = std::chrono::high_resolution_clock::now();
for (int i = 0; i < 10; i++) {
    // imread() returns BGR channel order, so convert from BGR rather than RGB
    cvtColor(workMat2, workMat, COLOR_BGR2GRAY);
    tt = static_cast<double>(std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::high_resolution_clock::now() - start_time).count());
    LOG4CXX_INFO(cwLogger, "i " << i << " tt " << tt);
}
Mat output;
workMat.copyTo(output);
double copy_time = static_cast<double>(std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::high_resolution_clock::now() - start_time).count());
LOG4CXX_INFO(cwLogger, "copy_time " << copy_time - tt);
The results (loop bound, then the measured copy_time):
i < 10: copy_time = 31 ms
i < 20: copy_time = 52 ms
i < 100: copy_time = 223 ms
The GPU: 32-bit--Intel--Intel_R__HD_Graphics_Haswell_Ultrabook_GT2_Mobile--1_3 OpenCL: OpenCL C 1.2 beignet 1.3
I tried OpenCV versions 3.4, 3.3.1, 3.2, etc. I also tried a different Intel computer and even an NVIDIA machine, and I get the same results.
It looks like the more times I call the same operation, the longer it takes to copy the output, even though the output size stays the same. Am I misunderstanding something here?
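One possible explanation worth checking (an assumption, not something the measurements above confirm): with OpenCL, a call such as cvtColor on a UMat may only enqueue work and return immediately, so the first blocking call, here copyTo into a Mat, ends up waiting for every queued kernel. If that is the case, calling cv::ocl::finish() right after the loop and before reading the clock should move that time out of copy_time. The sketch below is a toy model in plain C++ with no OpenCV. DeferredQueue, Timings and measure are hypothetical names, and a sleep stands in for a kernel. It only illustrates why the time spent at the blocking point would grow with the number of queued calls.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for an asynchronous command queue (NOT OpenCV's implementation).
class DeferredQueue {
public:
    // Returns immediately: the work is only recorded, not executed.
    void enqueue(std::function<void()> work) { pending_.push_back(std::move(work)); }

    // Blocking point: runs everything queued so far, like a blocking read-back.
    void finish() {
        for (auto& work : pending_) work();
        pending_.clear();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

struct Timings {
    double enqueue_ms;  // time spent issuing the calls
    double sync_ms;     // time spent at the blocking point
};

// Queue `iterations` fake kernels of `kernel_ms` each, then time the
// enqueue phase and the blocking phase separately.
Timings measure(int iterations, int kernel_ms) {
    assert(iterations >= 0 && kernel_ms >= 0);
    using clock = std::chrono::steady_clock;
    auto to_ms = [](clock::duration d) {
        return std::chrono::duration<double, std::milli>(d).count();
    };

    DeferredQueue queue;
    auto t0 = clock::now();
    for (int i = 0; i < iterations; ++i) {
        queue.enqueue([kernel_ms] {
            std::this_thread::sleep_for(std::chrono::milliseconds(kernel_ms));
        });
    }
    auto t1 = clock::now();
    queue.finish();  // all of the deferred work is paid for here
    auto t2 = clock::now();

    return {to_ms(t1 - t0), to_ms(t2 - t1)};
}
```

In this model, measure(10, 2) and measure(100, 2) should both report a small enqueue_ms, while sync_ms grows roughly in proportion to the iteration count, which is the same shape as the copy_time numbers above.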