Hi,
I've got several custom sections of CUDA kernel code that I need to debug. The application also contains OpenCV GPU code, but I'm not trying to debug any of that. If I comment out all the OpenCV function calls, I can set a breakpoint in the Nsight debugger and it is hit just fine. If I put the OpenCV calls back in, my breakpoint is never hit. My guess (I'm not sure) is that all the OpenCV GPU kernel modules, of which there are about 40, are being loaded into the debugger and possibly causing a timeout or something similar in Nsight. I'm not calling the majority of those kernels in my code, but I think they get loaded as part of starting up the OpenCV GPU subsystem. I've also tried running against the release build of the OpenCV libraries, with the same result.
So, my questions are:
(1) Has anyone else experienced this?
(2) Can I either speed up or skip the loading of the unused OpenCV CUDA kernels?
(3) Are there any configuration settings in Nsight that would make this work even with all the kernels being loaded?
I'm not that concerned with debugging speed or performance; I just need my breakpoints to be hit while debugging my non-OpenCV code.
Thanks for any help!
My system: Windows 7 64-bit, Visual Studio 2010 SP1, Nsight 3.1, CUDA 5.5.12, OpenCV 2.4.6.0 built with CUDA support