I'm trying to run multiple threads (std::thread), each of which opens a VideoCapture on a different device, captures frames, uploads them to the GPU, and computes ORB keypoints and descriptors. If I run more than one thread, the program fails on the call to detectAndComputeAsync. The error message is not consistent: each run produces one of about nine different errors, almost all of which include "an illegal memory access was encountered".
I've posted the simplest version of the offending code here:
https://gist.github.com/jrussino/c10aece599934797f4cf9e99b6f883aa
I also pasted each variant of the error message that I observed into a comment below that code snippet.
Notes:
- I'm using a different cuda::Stream for each thread's CUDA operations.
- This runs without error if I comment out the call to detectAndComputeAsync on line 49.
- This runs without error if I run only one thread (i.e. comment out lines 66 and 70).
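
For readers who don't want to open the gist, the thread structure described above (one std::thread per capture device, each thread working only on its own state) can be sketched with the standard library alone. This is a simplified illustration, not the code from the gist: `WorkerState` and `runPerDevice` are hypothetical names, and the OpenCV objects are only mentioned in comments rather than actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for per-thread state. In the real program each
// worker would own its cv::VideoCapture, cv::cuda::Stream and GpuMat
// buffers; here it only records a device index and a frame counter.
struct WorkerState {
    int device = -1;
    int framesProcessed = 0;
};

// Launch one std::thread per device index, hand each thread its own
// WorkerState, and join all threads before returning the final states.
std::vector<WorkerState> runPerDevice(
    const std::vector<int>& devices,
    const std::function<void(WorkerState&)>& work)
{
    assert(work && "work must be callable");

    std::vector<WorkerState> states(devices.size());
    std::vector<std::thread> threads;
    threads.reserve(devices.size());

    for (std::size_t i = 0; i < devices.size(); ++i) {
        states[i].device = devices[i];
        // Each thread touches only its own element of `states`,
        // so no locking is needed for this stand-in state.
        threads.emplace_back([&states, &work, i] { work(states[i]); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return states;
}
```

For example, `runPerDevice({0, 1}, work)` starts two workers and returns once both have joined. In the actual program, `work` would be the loop that captures a frame, uploads it, and calls detectAndComputeAsync on that thread's own stream.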