Bug in GPU_SURF and OpenCV's OpenCL module?

asked 2013-07-02 18:31:50 -0600

CodeFinder

updated 2013-07-04 04:19:35 -0600

Hi all,

I am using the CUDA-based SURF implementation to extract key points in real time. Afterwards, the properties of the key points (the response value and some specific entries of the descriptor vector) are used as features for SVM classification. The CPU version of OpenCV's SURF implementation works as expected. However, both GPU variants (CUDA and OpenCL) do not, because the entries of their descriptor vectors differ from those of the CPU version. As a consequence, my SVM yields bad classification results, as shown by the following figures:

[Figure: CPU training results]

[Figure: GPU training results]

The first 2 figures show the training results (decision boundary/support vectors and training samples of both classes) for the CPU version, and the last 2 figures show the results of the OpenCL version (for the same input, of course).

For example, the first 10 entries of the descriptors (using the CPU version) are as follows:

[ -0.0076033915, -3.6063837e-005, 0.0099780094, 6.7446483e-005, -0.037933301, -0.0013044993, 0.069664747, 0.0016015058, 0.0415693, -0.00026058216, ... ],

whereas the ocl:: version yields (for the same input):

[ 0.011228492, 0.00025490398, 0.011255275, 0.00027764714, -0.058959745, -0.0031194154, 0.061687313, 0.0032532166, 0.0554257, 0.0012606722, ... ].

The CUDA variant (gpu::) returns:

[ 3.9510203e-005, 0.0077131023, 0.00025647611, 0.007940894, 0.0021626796, 0.056031514, 0.0069103027, 0.056422122, -0.00029563709, 0.049984295, ... ].

However, the detector responses are identical in all 3 versions.

BTW: The ctor of GPU_SURF expects the parameter _keypointsRatio, as stated in the documentation. What is this parameter for? (Maybe it's related to my issue?)

Another OpenCL-related problem is also described here. In short: all apps (incl. the official samples) that use ocl:: API calls crash with a segfault when they terminate. Unfortunately, the debugger doesn't provide any further hints, as it only leads me to a disassembly (without any high-level code):

Unhandled exception at 0x00000000 in oclocv.exe: 0xC0000005: Access violation at 0x0000000000000000.

Looks like someone accesses a 0-ptr?! Note that the error does not occur when only ocl::getDevice() is called (or when no ocl:: calls are made at all). Is this a known issue, given that the OCL module is rather "experimental"?

Any help or hints are highly appreciated! Thanks in advance! :-)


  • GTX 560 Ti, GF114 (AFAIK, still Fermi arch.), Driver v320.49 (latest)
  • Win7-64 Prof. and CUDA SDK 5.0, OpenCV 2.4.5

PS: If you need more information, just ask! ;-)

EDIT: Formatting issue fixed.

UPDATE: The OCL crash seems to be a known bug, see .


1 answer


answered 2013-07-06 09:58:58 -0600

CodeFinder

updated 2013-07-06 12:50:02 -0600

Never mind, I found a solution to the first issue: my application requires the key points to be sorted according to their response. While the CPU version does this automatically, the GPU versions don't. Unfortunately, I accidentally sorted only the key points but not the corresponding descriptors. Clearly, this caused a mismatch when indexing the collected key points and descriptors in the training phase of my SVM.
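For illustration, here is a minimal sketch of sorting key points and descriptors together so that index i still refers to the same feature in both containers. It uses only the standard library: `KeyPointLite`, `Descriptor` and `sortByResponse` are hypothetical names, not OpenCV API. In a real application the key points would be `cv::KeyPoint` objects and each descriptor a row of the descriptor matrix. Sorting in descending order of response is an assumption made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for cv::KeyPoint; only the fields needed here.
struct KeyPointLite {
    float x;
    float y;
    float response;
};

// One descriptor vector per key point (hypothetical representation).
using Descriptor = std::vector<float>;

// Reorders key points AND descriptors with the same permutation,
// strongest response first (descending order is an assumption).
void sortByResponse(std::vector<KeyPointLite>& keypoints,
                    std::vector<Descriptor>& descriptors)
{
    assert(keypoints.size() == descriptors.size());

    // Build the index permutation instead of sorting the key points directly.
    std::vector<std::size_t> order(keypoints.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
        [&keypoints](std::size_t a, std::size_t b) {
            return keypoints[a].response > keypoints[b].response;
        });

    // Apply the permutation to both containers.
    std::vector<KeyPointLite> sortedKeypoints;
    std::vector<Descriptor> sortedDescriptors;
    sortedKeypoints.reserve(order.size());
    sortedDescriptors.reserve(order.size());
    for (std::size_t i : order) {
        sortedKeypoints.push_back(keypoints[i]);
        sortedDescriptors.push_back(std::move(descriptors[i]));
    }
    keypoints.swap(sortedKeypoints);
    descriptors.swap(sortedDescriptors);
}
```

The point of sorting an index vector first is that the same permutation can then be applied to any number of parallel containers, which avoids exactly the mismatch described above.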

IIRC, the problem arose from the (OCL module) documentation, which states:

At a minimum level, it can be viewed as a set of accelerators, that can take advantage of the high compute throughput that GPU/APU devices can provide.

Even if this is true to some extent, there are some essential differences that aren't (well) documented (e.g., the CPU version performs a sorting while the GPU versions don't).

In addition, it should be noted that there are some differences between CPU and GPU when comparing the descriptors (see also here). This is due to some implementation aspects of the GPU. However, at least in my application, these differences seem to be acceptable.
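One way to judge whether such differences are acceptable is to compare a CPU descriptor and a GPU descriptor of the same key point element by element. The following is only a sketch under assumptions: `maxAbsDiff` and `descriptorsAgree` are hypothetical helpers, the descriptors are assumed to be already matched to the same key point, and a suitable tolerance depends on the application.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest absolute element-wise difference between two descriptors
// of equal length (hypothetical helper, not part of OpenCV).
float maxAbsDiff(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}

// True if no element differs by more than the given tolerance.
// The tolerance is application-specific and must be chosen by the caller.
bool descriptorsAgree(const std::vector<float>& a,
                      const std::vector<float>& b,
                      float tolerance)
{
    return maxAbsDiff(a, b) <= tolerance;
}
```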

I hope this helps someone else avoid this pitfall.

Note, however, that using the cv::ocl module still causes an application crash on termination.


I suggest you submit bug reports against CUDA and OCL SURF (a reproducer is appreciated) if you'd like to see the problem fixed.

Andrey Pavlenko (2013-10-08 10:57:31 -0600)