Revision history [back]

Offloading ANN GEMM operations to GPU

I am trying to add functionality into the ANN_MLP object to offload the gemm operations in the predict function onto the GPU. I have a confirmed working build with NVIDIA CUDA able to query the GPU from OpenCV. Additionally, the ANN_MLP predict functionality is working properly.

What I did was add an identical function to "predict" called "predict_cuda" in the ANN_MLP source code. I then replaced the gemm(...) call with cv::cuda::gemm(...) and included the header to the cv::cuda::gemm reference (cudaarithm.hpp). In the ml.hpp header, I added the predict_cuda signature to the ANN_MLP object reference.

When running make in the build folder, I run into the issue:

../../lib/libopencv_ml.so.4.3.0: undefined reference to cv::cuda::gemm(cv::_InputArray const&, cv::_InputArray const&, double, cv::_InputArray const&, double, cv::_OutputArray const&, int, cv::cuda::Stream&)' ../../lib/libopencv_ml.so.4.3.0: undefined reference totypeinfo for cv::ml::ANN_MLP'

since the CUDA object is built later on and not linked with this target. Is there an easy way to configure cmake/make to build with this additional reference? Or am I missing something completely different?

My goal is to accelerate the ANN_MLP I have with NVIDIA CUDA; however, there is no direct ANN_MLP on GPU support. So, I have decided to try and integrate the cv::cuda::gemm operations into the OpenCV source.