No effect from using cuda::Stream?

I was experimenting with passing a cv::cuda::Stream to CUDA functions to see what performance impact it has, but very often the call takes just as long to return with the stream parameter as without it. I expected the call with a stream to launch an asynchronous operation and return almost immediately, but it seems to block for the full duration of the operation itself.

For example, with a GpuMat already uploaded, I tried

gpuImage.download(cpuImage);

and

cv::cuda::Stream myStream;
gpuImage.download(cpuImage, myStream);

and I've seen no time difference whatsoever.
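
To make the comparison concrete, here is a minimal sketch of one way to separate "time until the call returns" from "time until the work is finished". It is not the code used for the measurements above. The names Timing, measure, timeBlockingStandIn and timeAsyncStandIn are hypothetical, and the stand-ins use a sleeping thread in place of a real GPU transfer, so only the measurement pattern carries over.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <thread>

// Milliseconds as a double, for readable timings.
using Ms = std::chrono::duration<double, std::milli>;

struct Timing {
    double launchMs;  // time until the call itself returned
    double totalMs;   // time until the work was confirmed finished
};

// Times `launch` on its own, then keeps the clock running through `wait`.
// Both callables are placeholders for whatever is being benchmarked.
inline Timing measure(const std::function<void()>& launch,
                      const std::function<void()>& wait) {
    const auto t0 = std::chrono::steady_clock::now();
    launch();
    const auto t1 = std::chrono::steady_clock::now();
    wait();
    const auto t2 = std::chrono::steady_clock::now();
    return {Ms(t1 - t0).count(), Ms(t2 - t0).count()};
}

// Stand-in for a blocking call: the work is done when the call returns.
inline Timing timeBlockingStandIn(int workMs) {
    return measure(
        [workMs] { std::this_thread::sleep_for(std::chrono::milliseconds(workMs)); },
        [] {});
}

// Stand-in for a truly asynchronous call: the work runs on another thread,
// so the launch returns quickly and only the wait pays for it.
inline Timing timeAsyncStandIn(int workMs) {
    std::future<void> pending;
    return measure(
        [&] {
            pending = std::async(std::launch::async, [workMs] {
                std::this_thread::sleep_for(std::chrono::milliseconds(workMs));
            });
        },
        [&] { pending.wait(); });
}
```

Applied to the calls above, `launch` would wrap gpuImage.download(cpuImage, myStream) and `wait` would wrap myStream.waitForCompletion(). If launchMs comes out close to totalMs, the call is behaving as a blocking one, which matches what I am seeing.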

Is there something I'm missing in how to use these?