Segfault in multithreaded cuda::HOG::compute calls
Hello, I'm experiencing a strange issue in the OpenCV CUDA module when calling cv::cuda::HOG::compute from a multi-threaded application.
THE ARCHITECTURE: My main() function launches 4 independent threads. Each thread analyses a set of images using HOG feature extraction followed by SVM prediction, in order to detect some specific features. In more detail, several ROIs are extracted from each image and each ROI is then analysed with the cuda::HOG::compute method.
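To make the threading layout concrete, here is a minimal sketch of it using std::thread. It is only an illustration: the real project uses Qt, and analyseRoi, WorkerResult, worker and runAnalysis are hypothetical names, with analyseRoi standing in for the HOG + SVM step so that the sketch runs without OpenCV. Each worker also catches exceptions locally, so a failure is recorded per thread instead of escaping the thread.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-ROI analysis (HOG + SVM in the real code).
// It returns a dummy score so the sketch runs without OpenCV.
float analyseRoi(int imageIndex, int roiIndex)
{
    return static_cast<float>(imageIndex + roiIndex);
}

// Outcome of one worker thread.
struct WorkerResult
{
    int roisAnalysed = 0;
    std::string error;  // stays empty when no exception was thrown
};

// One worker: analyse every ROI of every image in its own set.
void worker(int imagesPerThread, int roisPerImage, WorkerResult& result)
{
    try {
        for (int img = 0; img < imagesPerThread; ++img) {
            for (int roi = 0; roi < roisPerImage; ++roi) {
                analyseRoi(img, roi);
                ++result.roisAnalysed;
            }
        }
    } catch (const std::exception& e) {
        // Record the failure here; an exception that leaves a std::thread
        // function would otherwise end in std::terminate.
        result.error = e.what();
    }
}

// Launch numThreads independent workers and wait for all of them.
std::vector<WorkerResult> runAnalysis(int numThreads, int imagesPerThread, int roisPerImage)
{
    assert(numThreads > 0);
    // Sized up front, so each thread gets a stable reference to its own slot.
    std::vector<WorkerResult> results(numThreads);
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t)
        threads.emplace_back(worker, imagesPerThread, roisPerImage, std::ref(results[t]));
    for (auto& th : threads)
        th.join();
    return results;
}
```

Calling runAnalysis(4, imagesPerThread, roisPerImage) corresponds to the 4-thread setup described above; each entry of the returned vector tells whether that thread finished cleanly.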
I'm using opencv-3.0.0 with CUDA 7.5 on an NVIDIA GeForce GTX 970 GPU.
THE CODE: Below is the relevant GPU code of the HOG + SVM analysis, which is applied to each ROI:
// Extracting (256x256 pixel) roi square
Mat ROI = sampleImage( Rect( Point(X1, Y1), Point(X2, Y2) ) );
// HOG feature extraction
cuda::GpuMat cudaROI(ROI);
cuda::cvtColor(cudaROI, cudaROI, CV_BGR2GRAY, 1);
cuda::GpuMat descriptorsValuesGpu;
Ptr<cuda::HOG> hog = cuda::HOG::create(Size(256,256), Size(16,16), Size(8,8), Size(8,8), 9);
hog->compute( cudaROI, descriptorsValuesGpu ); // *** crash here in multi-thread ***
// svm prediction
vector<float> descriptorsValues;
descriptorsValues.resize(descriptorsValuesGpu.cols);
descriptorsValuesGpu.row(0).download(Mat(descriptorsValues).reshape(1,1));
previsionScore = svm_loc->predict(descriptorsValues, noArray(), 1);
As said, this code is repeated several times, independently, in each thread.
The following link points to a small runnable Qt project that encapsulates the core of my software (in this case it analyses the same image several times). In the first lines of the main() function you can set the number of threads to launch, the number of images analysed in each thread, and the boolean switch between GPU and CPU. Qt sample
THE ERROR: If I launch the code with a single thread there are no errors and the analysis is performed correctly. But when I launch the code with 2 or more threads I occasionally (about 2 times out of 3) get the following error:
OpenCV Error: Gpu API call (an illegal memory access was encountered) in call, file /home/figaro/opencv-3.0.0/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp, line 273
OpenCV Error: Gpu API call (an illegal memory access was encountered) in compute_hists, file /home/figaro/opencv-3.0.0/modules/cudaobjdetect/src/cuda/hog.cu, line 222
OpenCV Error: Gpu API call (an illegal memory access was encountered) in set_up_constants, file /home/figaro/opencv-3.0.0/modules/cudaobjdetect/src/cuda/hog.cu, line 93
OpenCV Error: Gpu API call (an illegal memory access was encountered) in upload, file /home/figaro/opencv-3.0.0/modules/core/src/cuda/gpu_mat.cu, line 179
terminate called after throwing an instance of 'QtConcurrent::UnhandledException'
what(): std::exception
The number of errors is equal to the number of threads launched. The functions reporting the error change every time, except for compute_hists, so I suppose that the problem is there. When the code doesn't return an error the analysis is performed correctly, regardless of the size of the image, the number of threads, or the size of the image sample that ...
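One experiment that could test the suspicion that concurrent calls are the trigger is to serialize the GPU section behind a single mutex shared by all threads. The sketch below is an assumption-laden illustration, not a confirmed fix: it has not been verified that cuda::HOG is unsafe under concurrent use, and gpuSectionPlaceholder, serializedWorker and runSerialized are hypothetical names. The placeholder stands in for the upload / cvtColor / hog->compute / download sequence and deliberately touches shared state without any protection of its own.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Shared state touched by the placeholder below. In the real code this role
// would be played by whatever state the GPU routine shares between calls
// (an assumption, not something confirmed for cuda::HOG).
static long sharedState = 0;

// Hypothetical stand-in for the GPU section of the per-ROI code.
// The read-modify-write is unsafe if two threads run it at the same time.
void gpuSectionPlaceholder()
{
    long tmp = sharedState;
    sharedState = tmp + 1;
}

// One mutex shared by every worker thread.
static std::mutex gpuMutex;

// Each worker processes its ROIs, taking the lock only around the GPU section.
void serializedWorker(int roisPerThread)
{
    for (int i = 0; i < roisPerThread; ++i) {
        // CPU-only steps (e.g. ROI extraction) could run here, outside the lock.
        std::lock_guard<std::mutex> lock(gpuMutex);
        gpuSectionPlaceholder();
    }
}

// Run numThreads workers and return the final value of the shared state.
long runSerialized(int numThreads, int roisPerThread)
{
    assert(numThreads > 0);
    sharedState = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t)
        threads.emplace_back(serializedWorker, roisPerThread);
    for (auto& th : threads)
        th.join();
    return sharedState;
}
```

If the crashes disappear once the real GPU calls are wrapped this way, that would point to a thread-safety problem in the GPU path rather than in the surrounding code. The cost is that the GPU work no longer overlaps between threads, so this is more a diagnostic than a final design.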
I face the same issue with ORB. Although I am using detectAndComputeAsync together with CUDA streams, multi-threaded detection crashes with similar errors.