clBuildProgram() can be slow, but its results are cached (by OpenCV and by some OpenCL implementations). Can you explain how to take advantage of this caching for an overall speed-up, how to use pre-built kernels, and which settings need to be configured?