# ORB_GPU not as good as ORB(CPU)

Hi all,

I have some code working decently with the CPU implementation of ORB (from modules/features2d); now I am experimenting with ORB_GPU, hoping to have a 2-way implementation: users with CUDA get faster performance, users without still get good quality.

Problem is, the keypoints/descriptors returned by ORB_GPU are not yielding a sufficient number of correct matches on exactly the same data where the CPU version succeeds. I understand that synchronization issues etc. may cause the GPU results to differ from the CPU's, but I would hope that the quality of the results would be comparable. Any tips?

In particular, is there any way I can wrangle the ORB_GPU interface into computing descriptors for keypoints found by the CPU ORB? That way I could isolate the problem to either keypoint extraction or descriptor computation.

Dirty details: I just downloaded CUDA 5.0 and built OpenCV 2.4.4 myself with VS2008 (previously I was using the prebuilt OpenCV 2.4.4, but I guess that was built with HAVE_CUDA=0). For my development testing I have obtained a rather old, low-end card (Quadro FX 4800, compute capability 1.3). I am using ORB/ORB_GPU only for keypoints/descriptors; I have written my own (CPU-based) matching code which runs after the kp/dsc are extracted. I am using the same computer/compiler/data/etc. for testing each way, just recompiling with/without the GPU code path. Here's a snippet:

    vector<cv::KeyPoint> fkps, rkps;
    cv::Mat fdescs, rdescs;

    int ngpus = cv::gpu::getCudaEnabledDeviceCount();
    if (ngpus > 0) { // compile this way for GPU
    //if (0) {       // compile this way to force CPU
        cv::gpu::ORB_GPU orb(1000);
        cv::gpu::GpuMat gpumat(fmat);
        cv::gpu::GpuMat gpudsc;

        // first image: extract keypoints/descriptors, then download to host
        orb(gpumat, cv::gpu::GpuMat(), fkps, gpudsc);
        gpudsc.download(fdescs);

        // second image: reuse the GpuMats
        gpumat.upload(rmat);
        orb(gpumat, cv::gpu::GpuMat(), rkps, gpudsc);
        gpudsc.download(rdescs);

        orb.release();
        gpudsc.release();
        gpumat.release();
    } else {
        cv::ORB orb(1000); // default is 500 features
        orb(fmat, cv::Mat(), fkps, fdescs);
        orb(rmat, cv::Mat(), rkps, rdescs);
    }
    // now go through fdescs/rdescs and find matches


I am getting some results, so I must have compiled/linked OK, but the results are significantly worse with the GPU. In particular, with my baseline "easy" unit-test case my matcher detects 34 matches (out of the 1000 kp/dsc per image), while the GPU yields 5-9 (non-deterministic, which is OK if I can get it to be reliably good). After this matching I run RANSAC to find a maximal subset that fits well to a homography; the CPU kp/dsc wind up with 20 correct matches, but the GPU never yields better than 4 (i.e. a trivial homography fit, and not even all of those matches correct).
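To be concrete about that verification step: after matching, a candidate 3x3 homography is scored by counting how many matched point pairs it maps to within a pixel tolerance. A minimal sketch of that inlier test (plain C++; names are illustrative, and in practice cv::findHomography with CV_RANSAC does the actual estimation):

```cpp
#include <cmath>
#include <vector>

struct Pt { double x, y; };

// Apply a 3x3 homography H (row-major) to a point.
Pt applyHomography(const double H[9], const Pt& p) {
    double w = H[6] * p.x + H[7] * p.y + H[8];
    Pt q;
    q.x = (H[0] * p.x + H[1] * p.y + H[2]) / w;
    q.y = (H[3] * p.x + H[4] * p.y + H[5]) / w;
    return q;
}

// Count matched pairs (from[i] -> to[i]) that H maps to within
// `tol` pixels -- the score RANSAC maximizes over candidate H's.
int countInliers(const double H[9],
                 const std::vector<Pt>& from,
                 const std::vector<Pt>& to,
                 double tol) {
    int inliers = 0;
    for (size_t i = 0; i < from.size(); ++i) {
        Pt q = applyHomography(H, from[i]);
        double dx = q.x - to[i].x, dy = q.y - to[i].y;
        if (std::sqrt(dx * dx + dy * dy) <= tol) ++inliers;
    }
    return inliers;
}
```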

Any feedback would be appreciated!

UPDATE: I noticed that the CPU implementation offers an optional parameter useProvidedKeypoints=false -- so I modified my code to ignore the ORB_GPU descriptors and let ORB(CPU) compute descriptors ...



Crap, I found the solution and typed in a fantastically insightful (and deeply emotional) answer, and lost it because as a new user I can't answer my own question for 2 days. Now you're stuck with this:

cv::ORB applies a GaussianBlur (about 20 lines from the end of orb.cpp) before computing descriptors. There is no way to control this through the public interface.

cv::gpu::ORB_GPU has a public member bool blurForDescriptor, which defaults to false. When I set it to true instead, I find that the min/avg/max Hamming distance drops to 0/7.2/30 bits, which seems much more reasonable.

Follow-on questions:

(a) Shouldn't cv::gpu::ORB_GPU default to blurForDescriptor=true, to match cv::ORB's (only) behavior?

( 2013-04-05 15:04:06 -0500 )

(b) It would be nice if ORB_GPU had an interface that allowed computing descriptors only, from passed-in keypoints, like cv::ORB provides.

(c) The quantity and quality of ORB_GPU matches is still worse than ORB's; now that I've figured out the blur issue, I'd ballpark it at more like 30% worse instead of 95% worse. Is this an understood behavior? Are there known mitigations?

I'm still interested in discussing this, so please comment with any feedback. I won't be able to check again until Monday, but I'll be back to see if anybody has provided any more insight.

cheers!

( 2013-04-05 15:06:50 -0500 )


Topic solved :)


You're a genius! Don't know why I didn't figger that out myself!

:)

( 2013-04-08 15:29:54 -0500 )

Hopefully by now the waiting period has passed and I can answer my own question.

    cv::gpu::ORB_GPU orb(1000);
    orb.blurForDescriptor = true; // non-static member, set it on your instance


Also accept your answer so that the topic shows as solved :)

( 2013-04-08 13:02:58 -0500 )

OK thx, I clicked the "accept" button, but got an error: ">50 points required to accept or unaccept your own answer to your own question."

Maybe you can do me the favor of submitting a stub answer that I can accept?

( 2013-04-08 13:51:08 -0500 )