Well, while trying to solve my problem, I found that it was not a CUDA context problem: applications written using both CUDA and OpenCV are traced correctly by the Profiler. Instead, it was a memory problem. The application that contains both the CUDA version and the OpenCV version of my algorithm uses twice as many streams as the applications with only one version of the algorithm, and this exceeds the memory capacity of the Profiler. I thought it was a Profiler problem because the application with the two methods runs correctly on its own, and it only stops when I run it from the Profiler with "Enable concurrent kernels execution" turned on to trace the timeline. The likely explanation is that the Profiler uses much more memory to trace the timeline in this mode, so the limit on the number of streams is lower than in the synchronous mode. However, I am a beginner, so I'd better not put forward risky hypotheses. I solved it by using fewer streams. I apologize for the misleading question.
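For anyone who hits the same limit, here is a minimal sketch of the "fewer streams" idea. It is an illustration only, not the code from the application above: the kernel `scaleKernel`, the helper `streamForChunk`, and the sizes `kNumChunks`, `kNumStreams` and `kChunkSize` are all hypothetical. Instead of creating one stream per piece of work, it creates a small fixed pool and assigns chunks to it round-robin.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical stand-in for the real algorithm's kernel.
__global__ void scaleKernel(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Round-robin mapping of a work chunk onto a pool of numStreams streams.
int streamForChunk(int chunk, int numStreams) {
    return chunk % numStreams;
}

int main() {
    const int kNumChunks  = 16;       // assumed number of independent chunks
    const int kNumStreams = 4;        // assumed cap, smaller than kNumChunks
    const int kChunkSize  = 1 << 16;  // assumed elements per chunk

    const size_t totalBytes = sizeof(float) * kNumChunks * kChunkSize;
    float* d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, totalBytes));
    CUDA_CHECK(cudaMemset(d_data, 0, totalBytes));

    // Create only kNumStreams streams, not one per chunk.
    std::vector<cudaStream_t> streams(kNumStreams);
    for (auto& s : streams) {
        CUDA_CHECK(cudaStreamCreate(&s));
    }

    const int threads = 256;
    const int blocks  = (kChunkSize + threads - 1) / threads;
    for (int c = 0; c < kNumChunks; ++c) {
        // Chunks that share a stream run in order; different streams may overlap.
        cudaStream_t s = streams[streamForChunk(c, kNumStreams)];
        float* chunk = d_data + static_cast<size_t>(c) * kChunkSize;
        scaleKernel<<<blocks, threads, 0, s>>>(chunk, kChunkSize, 2.0f);
        CUDA_CHECK(cudaGetLastError());
    }

    CUDA_CHECK(cudaDeviceSynchronize());
    for (auto& s : streams) {
        CUDA_CHECK(cudaStreamDestroy(s));
    }
    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

Lowering `kNumStreams` reduces how many streams the Profiler has to track, at the cost of less potential overlap between chunks. The value 4 here is arbitrary, so the right number for a given application and Profiler mode has to be found by experiment.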