When using OpenCL in OpenCV 3.1 (so using UMat instead of Mat), I observe significant latency in the first call of a function.
The project I am working on has some performance demands and will be run in independent processes on the same machine with the same filters/settings (only the input image will change).
I am using OpenCV 3.1 in VS2015.
For example:
UMat input, output;
input = UMat::zeros(2048, 2048, CV_8UC1);
randn(input, 0, 3);  // fill the image with Gaussian noise
for (int i = 0; i < 10; i++)
{
    // Time a single GaussianBlur call on the UMat
    chrono::high_resolution_clock::time_point begin = chrono::high_resolution_clock::now();
    GaussianBlur(input, output, Size(5, 5), 1.5, 1.5);
    chrono::high_resolution_clock::time_point end = chrono::high_resolution_clock::now();
    long long ms = chrono::duration_cast<chrono::milliseconds>(end - begin).count();
    cout << ms << " ms" << endl;
}
This will give the following result:
144 ms
0 ms
0 ms
0 ms
0 ms
0 ms
0 ms
0 ms
0 ms
0 ms
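To put a number on the one-time cost, the timing loop above can be factored into a small helper. The following is only a sketch: `timeCalls` and `firstCallOverhead` are hypothetical names, not part of OpenCV, and the overhead estimate (first call minus the median of the remaining calls) is just one reasonable choice.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): calls `work` `runs` times
// and returns the wall-clock duration of each call in microseconds.
template <typename Work>
std::vector<long long> timeCalls(Work work, std::size_t runs)
{
    assert(runs > 0);
    std::vector<long long> durations;
    durations.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i)
    {
        const auto begin = std::chrono::steady_clock::now();
        work();
        const auto end = std::chrono::steady_clock::now();
        durations.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count());
    }
    return durations;
}

// Hypothetical helper: estimates the one-time cost as the first duration
// minus the median of the remaining ones (upper median for even counts).
long long firstCallOverhead(std::vector<long long> durations)
{
    assert(durations.size() >= 2);
    const long long first = durations.front();
    durations.erase(durations.begin());
    std::sort(durations.begin(), durations.end());
    const long long median = durations[durations.size() / 2];
    return first - median;
}
```

In my case `work` would be a lambda that calls GaussianBlur(input, output, Size(5, 5), 1.5, 1.5). If UMat operations are queued asynchronously (an assumption on my part), the lambda should probably also wait for the device, for example with cv::ocl::finish(), so that the later calls are not reported as 0 ms simply because the work has not finished yet.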
I checked that OpenCL is available using ocl::haveOpenCL(), and explicitly enabling it with ocl::setUseOpenCL(true) does not make a difference.
This issue has already been raised in this question and this one, but neither has been resolved.
One possible cause of this latency is the initialization of the OpenCL runtime on the first call. Is it possible to save the initialized state of OpenCL to a file on the first run of my program and load it on subsequent runs? I know the ocl module has classes like Program, ProgramSource and Context. However, I am not familiar with OpenCL in general, and I could not find documentation on these classes or on how to use their members.