Gabriele D's profile - activity


2017-03-14 01:47:53 -0600	received badge	● Popular Question (source)
2017-03-14 01:47:53 -0600	received badge	● Notable Question (source)
2017-03-14 01:47:53 -0600	received badge	● Famous Question (source)
2015-11-26 10:50:22 -0600	received badge	● Scholar (source)
2015-11-26 10:49:59 -0600	answered a question	copy-pasted opencv code slower than precompiled code Solved! The compilation flags -g -G have to be removed so that nvcc's default -O3 optimization takes effect (-G turns off optimization of device code). With that change the timings are roughly equal.
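Applied to the .pro fragment from the question, the fix might look like the sketch below. It assumes the rest of the file stays as posted; the only change is dropping -g -G from the two nvcc invocations, and -O3 is not passed explicitly because it is taken to be nvcc's default for device code once -G is gone.

```
# Same cuda.commands as in the question, with the debug flags -g -G removed
cuda.commands = $$CUDA_DIR/bin/nvcc -m64 -arch=$$CUDA_ARCH -c $$NVCCFLAGS $$CUDA_INC $$LIBS ${QMAKE_FILE_NAME} -o ${QMAKE_FILE_OUT}
# Dependency generation (-M) does not need the debug flags either
cuda.depend_command = $$CUDA_DIR/bin/nvcc -M $$CUDA_INC $$NVCCFLAGS ${QMAKE_FILE_NAME}
```

If kernel-level debugging is still needed, -g -G can be kept in a separate debug configuration, at the cost of the slowdown reported in the question.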
2015-11-25 03:59:08 -0600	received badge	● Student (source)
2015-11-24 07:42:34 -0600	received badge	● Editor (source)
2015-11-24 07:41:26 -0600	asked a question	copy-pasted opencv code slower than precompiled code

Hello, for my project I need to modify the OpenCV class cv::cuda::HOG (I need to introduce support for CV_8UC3 and cellSize = (16,16)). At this stage I still don't want to modify the OpenCV source code, so I have created my own CUDA HOG descriptor (namely HOGtest), starting from the OpenCV code. However, even though I use the same cv::cuda::HOG source code (just copied and pasted), I noticed that my code is noticeably slower. E.g., to perform HOG feature extraction on a 256x256 pixel image I get the following time measurements:

`time with cv::cuda::HOG: about 1 ms`
`time with my HOGtest class: about 6 ms`

Like I said, the code of HOGtest and cv::cuda::HOG is exactly the same. Profiling the code with nvvp shows that the difference comes from the execution time of the CUDA kernels. Does anyone know the reason for that?

Below is the part of my .pro file (I am using the Qt framework) where I compile the CUDA code with nvcc:

    CUDA_SOURCES += cuda_test.cu
    CUDA_SDK = "/usr/local/cuda-7.5/samples/"
    CUDA_DIR = "/usr/local/cuda-7.5/"
    CUDA_ARCH = sm_52
    NVCCFLAGS = --compiler-options -fno-strict-aliasing -use_fast_math --ptxas-options=-v

    INCLUDEPATH += $$CUDA_DIR/include
    INCLUDEPATH += $$CUDA_SDK/common/inc/
    INCLUDEPATH += $$CUDA_SDK/../shared/inc/

    QMAKE_LIBDIR += $$CUDA_DIR/lib64
    QMAKE_LIBDIR += $$CUDA_SDK/lib
    QMAKE_LIBDIR += $$CUDA_SDK/common/lib

    LIBS += -L/usr/local/cuda-7.5/lib64/ \
            -lcuda \
            -lcudart

    CUDA_INC = $$join(INCLUDEPATH,' -I','-I',' ')

    cuda.input = CUDA_SOURCES
    cuda.output = ${OBJECTS_DIR}${QMAKE_FILE_BASE}_cuda.o
    cuda.commands = $$CUDA_DIR/bin/nvcc -m64 -g -G -arch=$$CUDA_ARCH -c $$NVCCFLAGS $$CUDA_INC $$LIBS ${QMAKE_FILE_NAME} -o ${QMAKE_FILE_OUT}
    cuda.dependency_type = TYPE_C
    cuda.depend_command = $$CUDA_DIR/bin/nvcc -g -G -M $$CUDA_INC $$NVCCFLAGS ${QMAKE_FILE_NAME}
    QMAKE_EXTRA_COMPILERS += cuda

Thanks for the help
2015-11-17 10:39:59 -0600	commented question	CUDA HaarCascade stream assertion I have the same problem in cv::cuda::HOG::compute. I suspect that the multi-stream functionality is still not supported by openCV libraries.
2015-10-27 02:28:25 -0600	received badge	● Enthusiast
2015-10-26 05:00:42 -0600	asked a question	Segfault in multithreaded cuda::HOG::compute calls

Hello, I'm experiencing a strange issue in the OpenCV cuda module when using the function cv::cuda::HOG::compute in a multi-threaded application.

THE ARCHITECTURE: I have a main() function which launches 4 independent threads. In each thread a set of images is analysed with HOG feature extraction + SVM prediction in order to detect some specific features. In more detail, some ROIs are extracted from each image and then analysed with the cuda::HOG::compute method. I'm using opencv-3.0.0 with CUDA 7.5, on an NVIDIA GeForce GTX 970 GPU.

THE CODE: Below is the relevant GPU code of the HOG + SVM analysis, which is used to analyse each ROI:

    // Extracting (256x256 pixel) roi square
    Mat ROI = sampleImage( Rect( Point(X1, Y1), Point(X2, Y2) ) );

    // HOG feature extraction
    cuda::GpuMat cudaROI(ROI);
    cuda::cvtColor(cudaROI, cudaROI, CV_BGR2GRAY, 1);
    cuda::GpuMat descriptorsValuesGpu;
    Ptr<cuda::HOG> hog = cuda::HOG::create(Size(256,256), Size(16,16), Size(8,8), Size(8,8), 9);
    hog->compute( cudaROI, descriptorsValuesGpu ); // * crash here in multi-thread *

    // svm prediction
    vector<float> descriptorsValues;
    descriptorsValues.resize(descriptorsValuesGpu.cols);
    descriptorsValuesGpu.row(0).download(Mat(descriptorsValues).reshape(1,1));
    previsionScore = svm_loc->predict(descriptorsValues, noArray(), 1);

As said, this code is repeated several times independently in each thread. The following link points to a small runnable Qt project that encapsulates the core of my software (in this case analysing the same image several times). In the first lines of the main() function you can set the number of threads to be launched, the size of the set of images to be analysed in each thread, and the boolean switch from GPU to CPU.

Qt sample

THE ERROR: If I launch the code with a single thread there are no errors and the analysis is performed correctly. But when I launch the code with 2 or more threads I occasionally (about 2 times out of 3) get the following error:

    OpenCV Error: Gpu API call (an illegal memory access was encountered) in call, file /home/figaro/opencv-3.0.0/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp, line 273
    OpenCV Error: Gpu API call (an illegal memory access was encountered) in compute_hists, file /home/figaro/opencv-3.0.0/modules/cudaobjdetect/src/cuda/hog.cu, line 222
    OpenCV Error: Gpu API call (an illegal memory access was encountered) in set_up_constants, file /home/figaro/opencv-3.0.0/modules/cudaobjdetect/src/cuda/hog.cu, line 93
    OpenCV Error: Gpu API call (an illegal memory access was encountered) in upload, file /home/figaro/opencv-3.0.0/modules/core/src/cuda/gpu_mat.cu, line 179
    terminate called after throwing an instance of 'QtConcurrent::UnhandledException'
    what(): std::exception

The number of errors is equal to the number of threads launched. The functions reporting the error change every time, except for compute_hists, so I suppose that the problem is there. When the code doesn't return an error the analysis is performed correctly, independently of the size of the image, the number of threads, or the size of the image sample that ... (more)
2015-10-22 04:03:30 -0600	commented question	segfault with multithreaded gpu calls @stfn: Hello. Did you find a solution to your problem? I am facing the same occasional error with opencv 3.0.0 and cuda::HOG (in the compute() function) in a multi-threaded application. Thanks.