wronglyNeo's profile - activity

2016-03-21 05:21:09 -0600 answered a question cuda::Stream in v3 (CudaMem, Stream::enqueUpload())

I think the way to do it now is the following:

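//read the image first so that its size and type are known
cv::Mat image = cv::imread(...);

//allocate page locked memory matching the image
cv::cuda::HostMem memory(image.rows, image.cols, image.type(), cv::cuda::HostMem::PAGE_LOCKED);
//copy the image into the page locked memory through a Mat header
cv::Mat header = memory.createMatHeader();
image.copyTo(header);

//now upload to gpu mat
cv::cuda::GpuMat gpuImage;
cv::cuda::Stream stream;
//this should replace Stream::enqueueUpload
gpuImage.upload(memory, stream);
//the upload is asynchronous, so wait before relying on gpuImage from the host
stream.waitForCompletion();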
2016-03-19 11:46:41 -0600 commented question cuda::Stream in v3 (CudaMem, Stream::enqueUpload())

I'm also interested in this. It seems that the cv::cuda::CudaMem class has been removed, and instead there is the cv::cuda::HostMem class now.

2015-01-07 01:48:59 -0600 answered a question Long delay on cv::gpu::GpuMat::upload after upgrade to GTX970

Ok, I figured it out. You have to tell the nvcc compiler to create binary code for the new device generation (compute capability 5.2 for the GTX970, instead of 3.0 for the old card). When building the OpenCV project with cmake, there is a variable CUDA_ARCH_BIN in the CUDA group that is set to 1.1 1.2 1.3 2.0 2.1(2.0) 3.0 3.5 by default. Since that list contains no binary for 5.2, the kernels presumably had to be JIT-compiled for the new card at first use, which would explain the delay. I added 5.2 to the list, generated and compiled again. Now it works fine.
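To make the point about the default list concrete, here is a minimal sketch that checks whether a CUDA_ARCH_BIN style string contains a binary target for a given compute capability. The helper name hasBinaryFor is hypothetical and not part of OpenCV or CUDA, and treating an entry such as 2.1(2.0) as a binary for 2.1 is my reading of the cmake syntax, not something confirmed here.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not an OpenCV or CUDA API): returns true if the
// whitespace-separated CUDA_ARCH_BIN-style list contains a binary target
// for the given compute capability, e.g. "5.2".
// Assumption: an entry such as "2.1(2.0)" counts as a binary for 2.1.
bool hasBinaryFor(const std::string& archBin, const std::string& capability) {
    std::istringstream entries(archBin);
    std::string entry;
    while (entries >> entry) {
        // Drop a parenthesised suffix such as "(2.0)" before comparing.
        const std::string::size_type paren = entry.find('(');
        if (paren != std::string::npos) {
            entry.erase(paren);
        }
        if (entry == capability) {
            return true;
        }
    }
    return false;
}
```

With the default list quoted above, hasBinaryFor(list, "3.0") is true while hasBinaryFor(list, "5.2") is false, which matches the old card working instantly and the new one not.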

2015-01-04 07:43:54 -0600 asked a question Long delay on cv::gpu::GpuMat::upload after upgrade to GTX970


I have been using the gpu module (CUDA) of OpenCV in my program for a while and it worked fine. I have now upgraded my graphics card to a GTX970, and the first time I call cv::gpu::GpuMat::upload after launching the program I get a very long delay. With my old graphics card (GTX770) this call completed nearly instantly.

Example: I have an image which is 512x600 pixels in size. With this image the first upload takes 12s. If I execute the same code again afterwards without closing the program, it completes instantaneously. I know that the first time the CUDA code is executed after launching the program, it is compiled for the GPU, so a certain delay is normal. But to me this appears to be inexplicably long, especially because it was much faster with the old card.

Does anyone know what could cause this behaviour? Are there any known issues with the current OpenCV version in connection with GTX970 cards? The version I am using is 2.4.10, which is, apart from the 3.0 beta, the latest one. I compiled OpenCV with CUDA when I still had my old graphics card. Could compiling it again with the new one help? (I wouldn't think so.)


I have now discovered that there is a release of the CUDA Toolkit that specifically supports GTX970 and GTX980 cards:

I downloaded it and compiled OpenCV again with it. Unfortunately, this didn't solve my problem. If anything, I have the feeling it takes even longer now.

Is there no one here who has any experience with GTX900 cards and OpenCV?