cv::cuda::setDevice() takes forever, how can I speed it up?

asked 2018-06-10 23:27:39 -0500

Running

cv::cuda::setDevice(0);

takes a whopping 48.48 seconds on my machine. This seems way too slow. Any idea what's wrong, and how I can speed it up?
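
For anyone who wants to reproduce the number, here is a minimal sketch of how a call like this can be timed. The helper name `secondsTaken` is hypothetical (it is not part of OpenCV), and the sketch assumes a C++11 compiler; it is an illustration rather than the exact measurement code.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of OpenCV): runs `f` once and returns the
// elapsed wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double secondsTaken(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Intended use (assumes an OpenCV build with CUDA support and
// #include <opencv2/core/cuda.hpp>):
//   double s = secondsTaken([] { cv::cuda::setDevice(0); });
```

Timing only the first CUDA call in the process matters here, since that is presumably where any one-time initialization cost would show up.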

GPU info:

Device 0: "GeForce GTX 1060 6GB"
  CUDA Driver Version / Runtime Version          8.0 / 7.5
  CUDA Capability Major/Minor version number:    6.1

Relevant CMake flags when I compiled OpenCV:

-D WITH_CUBLAS=1 \
-D ENABLE_FAST_MATH=1 \
-D CUDA_FAST_MATH=1 \
-D CUDA_ARCH_PTX=5.2 \

While googling, I read that the delay might be due to JIT compilation, and that I should compile the binaries with specific compute flags. However, I'm not sure what this means. Are there flags I should add when compiling my program, or do I need to somehow reinstall CUDA with more architectures enabled?
