How to pass data already stored on the GPU in a GpuMat to a kernel
Hello,
I have an OpenCV GpuMat created as:
cv::cuda::GpuMat d_im(h_im.size().height, h_im.size().width, CV_8UC1);
I run some image-processing operations on it with cv::cuda and then, following the OpenCV documentation, I tried to pass it directly to the kernel function below:
Kernel_func<<<grid_size, block_size, 0, stream>>>( d_im.ptr<uint8_t>(), output);
but I got wrong results.
However, the results are correct if I download the processed d_im from the GPU to the CPU (into h_im_new) and then copy it back to the GPU with cudaMemcpyAsync, as in the snippet below. I know this is not a good way to do it.
CUDA_CHECK_RETURN(cudaMemcpyAsync(input, h_im_new.ptr<uint8_t>(), sizeof(uint8_t)*size, cudaMemcpyHostToDevice,stream1));
My __global__ function prototype is:
__global__ void Kernel_func(const uint8_t *input, const uint8_t *output);
I am not sure what is wrong in this case. Has anyone had a similar issue, or does anyone have suggestions? Thanks for your help.
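One possible cause worth ruling out, assuming d_im was allocated with row padding: the OpenCV documentation notes that GpuMat rows are usually aligned, so d_im.isContinuous() is often false and each row occupies d_im.step bytes rather than d_im.cols bytes. A kernel that indexes the pointer as row * cols + col would then read padding bytes. The host-only sketch below imitates that layout to show the difference; PitchedImage, make_pitched, read_dense and read_pitched are hypothetical names for illustration, not OpenCV or CUDA APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side stand-in for a pitched device allocation: each row occupies
// `step` bytes, of which only the first `cols` bytes are image data.
struct PitchedImage {
    int rows;
    int cols;
    std::size_t step;                 // bytes per row, including padding
    std::vector<std::uint8_t> data;
};

// Hypothetical helper: pixel (r, c) holds r * cols + c (mod 256),
// and every padding byte holds 0xFF.
PitchedImage make_pitched(int rows, int cols, std::size_t step) {
    assert(rows > 0 && cols > 0 && step >= static_cast<std::size_t>(cols));
    PitchedImage im{rows, cols, step,
                    std::vector<std::uint8_t>(rows * step, 0xFF)};
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            im.data[r * step + c] = static_cast<std::uint8_t>(r * cols + c);
    return im;
}

// Wrong for padded rows: treats the buffer as rows * cols contiguous bytes.
std::uint8_t read_dense(const PitchedImage& im, int r, int c) {
    return im.data[static_cast<std::size_t>(r) * im.cols + c];
}

// Pitch-aware: advance by the row stride in bytes, then by the column.
std::uint8_t read_pitched(const PitchedImage& im, int r, int c) {
    return im.data[static_cast<std::size_t>(r) * im.step + c];
}
```

If this is the cause, the equivalent change on the device side would be to pass d_im.step to the kernel as an extra argument and index the input as input[row * step + col]. That extra parameter is an assumption about how Kernel_func could be changed, not something shown in the original code.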
This is my first post here; I made an account just to comment on this! I too am observing this exact behavior. I have tried every way I could think of to pass the GpuMat to the kernel as a float* pointer, but in every case only the first third or so of the image (roughly 1500 bytes) is valid; the rest is zero.
I tried:
* Allocating a local GpuMat and using copyTo() to copy the original GpuMat data there manually.
* Allocating a float* buffer with cudaMalloc() and using cudaMemcpy(...DeviceToDevice) to store the raw data.
* Using both gpumat.ptr<float>() and gpumat.data().
The only thing that works is copying the data to host memory with gpumat.download(mat), then using cudaMemcpy(...HostToDevice) to copy it to a float*. I can only conclude that this is a bug in OpenCV!
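Before concluding it is a bug, it may be worth checking gpumat.isContinuous() and gpumat.step. If the rows are padded, a flat cudaMemcpy of rows * cols * sizeof(float) bytes copies padding along with pixels, whereas download() returns a Mat without that padding, which would explain why the round trip through the host works. The pitch-aware alternative is a 2D copy such as cudaMemcpy2D(dst, dpitch, src, spitch, width, height, kind), with pitches and width given in bytes. The sketch below is a host-only imitation of that row-by-row copy, not the CUDA call itself; copy_2d and to_dense are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Host-side sketch of a pitch-aware 2D copy. The parameter order mirrors
// cudaMemcpy2D; pitches and width are in bytes, height is in rows.
void copy_2d(void* dst, std::size_t dpitch,
             const void* src, std::size_t spitch,
             std::size_t width_bytes, std::size_t height) {
    auto* d = static_cast<std::uint8_t*>(dst);
    const auto* s = static_cast<const std::uint8_t*>(src);
    for (std::size_t r = 0; r < height; ++r)
        std::memcpy(d + r * dpitch, s + r * spitch, width_bytes);
}

// Hypothetical helper: packs a padded float image (row stride `step` bytes)
// into a dense rows * cols vector, dropping the padding at the end of each row.
std::vector<float> to_dense(const std::vector<std::uint8_t>& pitched,
                            std::size_t step, std::size_t rows, std::size_t cols) {
    const std::size_t row_bytes = cols * sizeof(float);
    assert(step >= row_bytes && pitched.size() >= rows * step);
    std::vector<float> dense(rows * cols);
    copy_2d(dense.data(), row_bytes, pitched.data(), step, row_bytes, rows);
    return dense;
}
```

On the device, the same idea would be a cudaMemcpy2D call with spitch set to gpumat.step and dpitch set to cols * sizeof(float), assuming the destination is a dense cudaMalloc() buffer.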
As the documentation says