Offloading ANN GEMM operations to GPU

asked 2020-04-28 02:32:12 -0600

I am trying to add functionality into the ANN_MLP object to offload the gemm operations in the predict function onto the GPU. I have a confirmed working build with NVIDIA CUDA able to query the GPU from OpenCV. Additionally, the ANN_MLP predict functionality is working properly.

I added a copy of the "predict" function, named "predict_cuda", to the ANN_MLP source code. In it I replaced the gemm(...) call with cv::cuda::gemm(...) and included the header that declares cv::cuda::gemm (cudaarithm.hpp). In the ml.hpp header, I added the predict_cuda signature to the ANN_MLP class declaration.

When running make in the build folder, I run into the issue:

../../lib/libopencv_ml.so.4.3.0: undefined reference to `cv::cuda::gemm(cv::_InputArray const&, cv::_InputArray const&, double, cv::_InputArray const&, double, cv::_OutputArray const&, int, cv::cuda::Stream&)'
../../lib/libopencv_ml.so.4.3.0: undefined reference to `typeinfo for cv::ml::ANN_MLP'

I assume this happens because the CUDA module is built later and is not linked into this target. Is there an easy way to configure cmake/make to build with this additional dependency? Or am I missing something completely different?

My goal is to accelerate the ANN_MLP I have with NVIDIA CUDA; however, there is no direct ANN_MLP on GPU support. So, I have decided to try and integrate the cv::cuda::gemm operations into the OpenCV source.
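For illustration, the missing link dependency described above could be declared in the ml module's CMake file. This is only a sketch under assumptions: it assumes a build configured with WITH_CUDA=ON and with the opencv_contrib modules (where cudaarithm lives in OpenCV 4.x) on OPENCV_EXTRA_MODULES_PATH, and the existing ocv_define_module line in a given checkout may list different wrappers.

```cmake
# modules/ml/CMakeLists.txt (sketch; the existing line in your checkout may differ)
set(the_description "Machine Learning")

# OPTIONAL links opencv_ml against opencv_cudaarithm only when that module
# is part of the build; without it, ml is configured exactly as before.
ocv_define_module(ml opencv_core OPTIONAL opencv_cudaarithm WRAP java python)
```

With an optional dependency, the cv::cuda::gemm call in predict_cuda would typically be wrapped in `#ifdef HAVE_OPENCV_CUDAARITHM` (defined through opencv2/opencv_modules.hpp) so the module still compiles without CUDA. The second error, about the typeinfo for cv::ml::ANN_MLP, is probably a separate issue: that message commonly appears when a virtual member is declared in a class without `= 0` and without a definition, so it may be worth checking how predict_cuda was declared in ml.hpp.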

Comments

I appreciate the quick response. I was not familiar with the CMake structure in OpenCV, so this helps!

I agree, probably not the best idea to have the ml module dependent on cuda/cudaarithm; however, I am researching the performance implications of this functionality.

voxincceo ( 2020-04-28 13:58:21 -0600 )
