I am trying to copy GpuMat data into the Triton Inference Server's shared memory. However, a GpuMat is not necessarily continuous in memory, while the Inference Server expects continuous data. Downloading the GpuMat to a cv::Mat and then doing a cudaMemcpy of the cv::Mat into the server's shared memory works, but I want to copy the data directly into the server's GPU shared memory without downloading it first. I'm struggling to work out how to do that.
My input GpuMat is a 600x600 color image (3 channels) converted to float.
What is the memory layout of a GpuMat, and how can I make it continuous, or copy it so that the result is continuous?
So far I have tried cudaMemcpy2D, a plain cudaMemcpy done row by row, and creating a continuous GpuMat with cv::cuda::createContinuous and then calling copyTo from the input into that continuous GpuMat.
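To make clear which layout I am assuming: as far as I understand, each GpuMat row holds cols * channels * sizeof(float) bytes of interleaved pixel data, and the next row starts `step` bytes later, so rows may be followed by padding. Below is a minimal host-only C++ sketch of that assumption. The helper names packPitchedRows and packDemo are my own, hypothetical, and not part of OpenCV or CUDA. It only models the row-by-row copy on the CPU. On the device I would expect the equivalent to be one cudaMemcpy2D call with the row size as destination pitch, the GpuMat's step as source pitch, and cudaMemcpyDeviceToDevice as the kind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not an OpenCV or CUDA API): copies `rows` rows of
// `rowBytes` payload bytes each from a pitched buffer, whose rows start
// `srcStep` bytes apart, into a tightly packed destination buffer.
void packPitchedRows(const unsigned char* src, std::size_t srcStep,
                     unsigned char* dst, std::size_t rowBytes,
                     std::size_t rows) {
    assert(srcStep >= rowBytes);  // the pitch can never be smaller than a row
    for (std::size_t r = 0; r < rows; ++r) {
        std::memcpy(dst + r * rowBytes, src + r * srcStep, rowBytes);
    }
}

// Hypothetical demo: builds a padded float image on the host, fills element
// i (counted in packed order) with the value i, and returns the packed copy.
std::vector<float> packDemo(std::size_t rows, std::size_t cols,
                            std::size_t channels, std::size_t padBytes) {
    const std::size_t rowElems = cols * channels;
    const std::size_t rowBytes = rowElems * sizeof(float);
    const std::size_t step = rowBytes + padBytes;  // assumed pitch in bytes

    // Padding bytes are set to 0xFF so they would be visible if copied.
    std::vector<unsigned char> pitched(rows * step, 0xFF);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t i = 0; i < rowElems; ++i) {
            const float v = static_cast<float>(r * rowElems + i);
            std::memcpy(pitched.data() + r * step + i * sizeof(float),
                        &v, sizeof(float));
        }
    }

    std::vector<float> packed(rows * rowElems);
    packPitchedRows(pitched.data(), step,
                    reinterpret_cast<unsigned char*>(packed.data()),
                    rowBytes, rows);
    return packed;
}
```

For my 600x600 three-channel float image, rowBytes would be 600 * 3 * 4 = 7200 bytes, and the packed result would hold 600 * 600 * 3 floats. If this model of the layout is wrong, that may be where my attempts go off track.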