THE PROBLEM: I have several CPU threads, and each thread has to call OpenCV GPU functions in real time. The GPU therefore needs one context per thread, and the application has to switch to the right context before each OpenCV call.

SOLUTION: Manually create a context (cuCtxCreate) for each thread. Before each GPU call, pop the previous context from the stack and push the thread's own context. Execute the function on the GPU, then pop the current context.

At the end, destroy the context. For all of these operations (create, pop, push and destroy), use the CUDA driver API routines: cuCtxCreate, cuCtxPopCurrent, cuCtxPushCurrent and cuCtxDestroy.
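
The steps above can be sketched roughly as follows. This is only a minimal illustration, not tested production code. It assumes the three-argument cuCtxCreate of CUDA 12.x and earlier (newer toolkits may take different arguments) and linking against the driver library (-lcuda). The names ok, gpu_work, worker and run_workers are hypothetical helpers made up for this example; gpu_work is a placeholder for the real OpenCV GPU call. The block under #else is not CUDA at all: it is a set of stand-ins so the sketch still compiles on a machine without the toolkit.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

#if __has_include(<cuda.h>)
#include <cuda.h>
#else
// Stand-ins (NOT the real CUDA API) so the sketch compiles without the
// toolkit. They only mimic a per-thread context stack.
typedef int CUresult;
typedef int CUdevice;
struct CUctx_st { int unused; };
typedef struct CUctx_st* CUcontext;
const CUresult CUDA_SUCCESS = 0;
static thread_local std::vector<CUcontext> g_stack;
inline CUresult cuInit(unsigned) { return CUDA_SUCCESS; }
inline CUresult cuDeviceGet(CUdevice* d, int ordinal) { *d = ordinal; return CUDA_SUCCESS; }
inline CUresult cuCtxCreate(CUcontext* c, unsigned, CUdevice) {
    *c = new CUctx_st{0}; g_stack.push_back(*c); return CUDA_SUCCESS;
}
inline CUresult cuCtxPushCurrent(CUcontext c) { g_stack.push_back(c); return CUDA_SUCCESS; }
inline CUresult cuCtxPopCurrent(CUcontext* c) {
    if (g_stack.empty()) return 1;
    *c = g_stack.back(); g_stack.pop_back(); return CUDA_SUCCESS;
}
inline CUresult cuCtxGetCurrent(CUcontext* c) {
    *c = g_stack.empty() ? nullptr : g_stack.back(); return CUDA_SUCCESS;
}
inline CUresult cuCtxDestroy(CUcontext c) { delete c; return CUDA_SUCCESS; }
#endif

// Report a failed driver-API call and return false; true on success.
static bool ok(CUresult r, const char* what) {
    if (r != CUDA_SUCCESS) {
        std::fprintf(stderr, "%s failed (%d)\n", what, (int)r);
        return false;
    }
    return true;
}

// Hypothetical placeholder for the real work (e.g. an OpenCV GPU call).
// Here it only checks that some context is current on this thread.
static bool gpu_work() {
    CUcontext cur = nullptr;
    return ok(cuCtxGetCurrent(&cur), "cuCtxGetCurrent") && cur != nullptr;
}

// One CPU thread: create a context, detach it, then push / work / pop
// for each frame, and destroy the context at the end.
static void worker(CUdevice dev, int frames, int* success) {
    CUcontext ctx = nullptr, popped = nullptr;
    *success = 0;
    // cuCtxCreate also makes the new context current on the calling thread...
    if (!ok(cuCtxCreate(&ctx, 0, dev), "cuCtxCreate")) return;
    // ...so pop it right away and push it only around the GPU calls.
    bool good = ok(cuCtxPopCurrent(&popped), "cuCtxPopCurrent");
    for (int i = 0; good && i < frames; ++i) {
        if (!ok(cuCtxPushCurrent(ctx), "cuCtxPushCurrent")) { good = false; break; }
        good = gpu_work();
        good = ok(cuCtxPopCurrent(&popped), "cuCtxPopCurrent") && good;
    }
    good = ok(cuCtxDestroy(ctx), "cuCtxDestroy") && good;
    *success = good ? 1 : 0;
}

// Start n threads on device 0 and return how many finished without error.
int run_workers(int n, int frames) {
    CUdevice dev;
    if (!ok(cuInit(0), "cuInit") || !ok(cuDeviceGet(&dev, 0), "cuDeviceGet")) return 0;
    std::vector<int> results(n, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker, dev, frames, &results[i]);
    for (std::thread& t : threads) t.join();
    int passed = 0;
    for (int r : results) passed += r;
    return passed;
}
```

To try it, call something like run_workers(4, 100) from main and compare the return value with the number of threads. Popping the context right after cuCtxCreate keeps every thread in the same state between calls, so the push and pop around each GPU call always match.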