2017-03-14 01:47:53 -0600 | received badge | ● Popular Question (source) |
2017-03-14 01:47:53 -0600 | received badge | ● Notable Question (source) |
2017-03-14 01:47:53 -0600 | received badge | ● Famous Question (source) |
2015-11-26 10:50:22 -0600 | received badge | ● Scholar (source) |
2015-11-26 10:49:59 -0600 | answered a question | copy-pasted opencv code slower than precompiled code Solved! The -g -G compilation flags have to be removed so that the default -O3 optimization level takes effect (-G generates debug information for device code and turns off device code optimization). The execution times are now roughly equal. |
2015-11-25 03:59:08 -0600 | received badge | ● Student (source) |
2015-11-24 07:42:34 -0600 | received badge | ● Editor (source) |
2015-11-24 07:41:26 -0600 | asked a question | copy-pasted opencv code slower than precompiled code Hello, for my project I need to modify the OpenCV class cv::cuda::HOG (I need to add support for CV_8UC3 and cellSize = (16,16)). At this stage I still don't want to modify the OpenCV source code, so I have created my own CUDA HOG descriptor (named HOGtest), starting from the OpenCV code. However, even when using the same OpenCV cv::cuda::HOG source code (just copied and pasted), I noticed that my code is noticeably slower. E.g., to perform HOG feature extraction on a 256x256 pixel image I get the following time measurements: As I said, the code of HOGtest and cv::cuda::HOG is exactly the same. Profiling the code with nvvp shows that the difference comes from the execution time of the CUDA kernels. Does anyone know the reason for that? Below I attach the part of my .pro file (I am using the Qt framework) where I compile the CUDA code with nvcc. Thanks for the help |
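A comparison like the one described above depends on measuring both implementations in the same way. The following is a minimal sketch of a timing helper, not code from the original post: `averageMillis` is a hypothetical name, and the callable passed to it is assumed to block until the work it starts has finished (for GPU code this means synchronizing before returning, otherwise only the launch overhead is measured).

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of OpenCV): returns the average wall-clock
// time of fn in milliseconds over `runs` calls.
// The first `warmup` calls are executed but not timed, so that one-time
// setup costs (context creation, allocations) do not distort the result.
// Assumption: fn returns only after its work has completed.
double averageMillis(const std::function<void()>& fn,
                     std::size_t warmup,
                     std::size_t runs)
{
    for (std::size_t i = 0; i < warmup; ++i) {
        fn();  // untimed warm-up call
    }

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < runs; ++i) {
        fn();  // timed call
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    // Avoid dividing by zero when no timed runs were requested.
    return runs == 0 ? 0.0 : elapsed.count() / static_cast<double>(runs);
}
```

Calling the helper once with a lambda that wraps the HOGtest extraction and once with a lambda that wraps cv::cuda::HOG, using the same warm-up and run counts, would give two directly comparable averages.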
2015-11-17 10:39:59 -0600 | commented question | CUDA HaarCascade stream assertion I have the same problem in cv::cuda::HOG::compute. I suspect that multi-stream functionality is not yet supported by the OpenCV libraries. |
2015-10-27 02:28:25 -0600 | received badge | ● Enthusiast |
2015-10-26 05:00:42 -0600 | asked a question | Segfault in multithreaded cuda::HOG::compute calls Hello, I'm experiencing a strange issue in the OpenCV CUDA module when using the function cv::cuda::HOG::compute in a multi-threaded application. THE ARCHITECTURE: I have a main() function which launches 4 independent threads. In each thread a set of images is analysed with HOG feature extraction + SVM prediction in order to detect some specific features. In more detail, some ROIs are extracted from each image and then analysed with the cuda::HOG::compute method. I'm using opencv-3.0.0 with CUDA 7.5, on an NVIDIA GeForce GTX 970 GPU. THE CODE: Below is the relevant GPU code of the HOG + SVM analysis that is used to analyse each ROI: As said, this code is run several times, independently, in each thread. The following link points to a small runnable Qt project that encapsulates the core of my software (in this case analysing the same image several times). In the first lines of the main() function you can set the number of threads to launch, the size of the set of images analysed in each thread, and the boolean switch between GPU and CPU. Qt sample THE ERROR: If I launch the code with a single thread there are no errors and the analysis is performed correctly. But when I launch the code with 2 or more threads I occasionally (about 2 times out of 3) get the following error: The number of errors is equal to the number of threads launched. The functions returning the error change every time, except for compute_hists, so I suppose that the problem is there. When the code doesn't return an error the analysis is performed correctly, independently of the size of the image, the number of threads, or the size of the image sample that ... |
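The architecture described above (several worker threads, each analysing a series of ROIs) can be sketched as follows. This is a diagnostic sketch under stated assumptions, not the code from the post: `RoiAnalyser`, `analyse` and `runWorkers` are hypothetical names, and the body of `analyse` is a stand-in that only counts calls where the real program would run its HOG + SVM code. The mutex serializes the GPU section so that only one thread is inside it at a time. The thread does not confirm that this avoids the error; it is only a way to check whether the failure is tied to concurrent calls.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-ROI GPU work (HOG compute + SVM predict).
// It only counts calls; a real version would call into the GPU code here.
struct RoiAnalyser {
    std::mutex gpuMutex;        // guards every entry into the GPU section
    std::size_t processed = 0;  // number of ROIs analysed so far

    void analyse(int /*roiId*/) {
        // Only one thread at a time may execute the guarded section.
        std::lock_guard<std::mutex> lock(gpuMutex);
        ++processed;
    }
};

// Launches threadCount workers, each analysing roisPerThread ROIs,
// waits for all of them, and returns the total number of ROIs processed.
std::size_t runWorkers(std::size_t threadCount, std::size_t roisPerThread)
{
    RoiAnalyser analyser;
    std::vector<std::thread> workers;
    workers.reserve(threadCount);

    for (std::size_t t = 0; t < threadCount; ++t) {
        workers.emplace_back([&analyser, roisPerThread] {
            for (std::size_t r = 0; r < roisPerThread; ++r) {
                analyser.analyse(static_cast<int>(r));
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();  // wait for every worker before reading the counter
    }
    return analyser.processed;
}
```

With 4 threads and 100 ROIs per thread, `runWorkers(4, 100)` processes 400 ROIs in total. Serializing the GPU section gives up any overlap between threads, so it is more useful for narrowing down the problem than as a final design.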
2015-10-22 04:03:30 -0600 | commented question | segfault with multithreaded gpu calls @stfn: Hello. Did you find a solution to your problem? I am facing the same occasional error with OpenCV 3.0.0 and cuda::HOG (in the compute() function) in multi-threaded software. Thanks. |