You can execute custom OpenCL code on the GPU from OpenCV by using the cv::ocl classes together with UMat images.
Here you can find an example for doing this.
A word of warning: OpenCL code can be difficult to develop. It is massively parallel, has no concurrency checks (so a bug will typically crash the program rather than give you a clean error), and it is hard to debug. On the other hand, OpenCL is very portable: it runs on most GPUs and on CPUs too, whereas CUDA runs only on NVIDIA hardware.
Personally, I prefer CPU parallelisation with TBB; it is much easier to implement.
As for CUDA, there is a cv::cuda namespace for CUDA-based operations. It implements several image processing algorithms, but I don't know whether it is possible to run generic CUDA code through it.