Locality Sensitivy Hashing in OpenCV for image processing

asked 2016-05-19 04:28:12 -0600

lovaj
136 ●1 ●2 ●9

This is my first image processing application, so please be kind with this filthy peasant.

THE APPLICATION:

I want to implement a fast application (performance are crucial even over accuracy) where given a photo (taken by mobile phone) containing a movie poster finds the most similar photo in a given dataset and return a similarity score. The dataset is composed by similar pictures (taken by mobile phone, containing a movie poster). The images can be of different size, resolutions and can be taken from different viewpoints (but there is no rotation, since the posters are supposed to always be right-oriented).

Any suggestion on how to implement such an application is well accepted.

FEATURE DESCRIPTIONS IN OPENCV:

I've never used OpenCV and I've read this tutorial about Feature Detection and Description by OpenCV.

From what I've understood, these algorithms are supposed to find keypoints (usually corners) and eventually define descriptors (which describe each keypoint and are used for matching two different images). I used "eventually" since some of them (eg FAST) provides only keypoints.

MOST SIMILAR IMAGE PROBLEM AND LSH:

The problems above doesn't solve the problem "given an image, how to find the most similar one in a dataset in a fast way". In order to do that, we can both use the keypoints and descriptors obtained by any of the previous algorithms. The problem stated above seems like a nearest neighbor problem and Locality Sensitive Hashing is a fast and popular solution for find an approximate solution for this problem in high-dimensionality spaces.

THE QUESTION:

What I don't understand is how to use the result of any of the previous algorithms (i.e. keypoints and descriptors) in LSH.

Is there any implementation that treat this problem?

answered 2016-05-19 09:13:47 -0600

atrius.fox
16

I came here looking for answers on performance with LSH so maybe take this with a grain of salt :P

I'm doing essentially the same thing but with trading cards. Take a picture and lookup against 20,000 images to determine which card it is.

I use ORB and LSH.

This tutorial here is a good starting point but uses SURF http://romovs.github.io/blog/2013/07/...

The key differences you need to change are SURF detector to ORB detector, and the flannIndex needs to be created with LSHIndexParameters.

The relevant values or the number of descriptors you want to find (used for creation of the ORB detector, default is 500) and LshIndexParameters takes 3 values which I forget off the top of my head what they are.

I'm using Emgu CV C# library for open cv so I did some other alterations to make this work, but most of it was things like changing Matrix<float> to Mat, which you'll probably have to do because ORB uses binary descriptors and they are NOT floats, they are binary.

The concating part is easy with Mats since you can just use Mat.PushBack(Mat) to concat them the way you need.

Performance changes based on the number of descriptors and the lshindexparameter values but I don't understand much on how the indexparameters work (what I came here today to find). The orbdescriptors # you probably want to keep relatively low otherwise you'll be using too much RAM and it'll take a long time to do all the comparisons.

A couple other code corrections from that link are in the comments, and since you are using LSH you should cut out the distance check in theFindMatches method.

Edit:

I'm currently getting matches in under 3 seconds for a rotated/skewed image of a card against 20,000+ images on a laptop. I'm guessing this can be optimized but I'm still very new at this. Just wanted to give you an idea on expectations. Don't know what ballpark you're aiming for.

edit flag offensive delete link

Comments

First of all, thanks for your answer and for the link. From what I see, this code is implemented in C#: this, in my opinion, it's already a reason of bad performance compared to C++. In second place, FLANN use the generic and classic implementation of LSH, which is not the more performant. For example with this kernalized LSH implementation http://people.bu.edu/bkulis/klsh/klsh... the problem is solved in 0.6 seconds against 80 milions (yep, you're read correctly) tiny images. I'm reading the paper, but there are plenty and more recent implementations than FLANN. I think that if you want better performance, this is the right way.

lovaj ( 2016-05-20 02:11:13 -0600 )edit

Not arguing about C# performance. That was more just a matter of 'how to do this using opencv' and this example used C# emgucv (opencv wrapper). As I said, I'm still new to this and I recognize the possibility of optimzation, but as you said, that's more about replacing the pieces here with more efficient ones rather than changing how to go about it.

atrius.fox ( 2016-05-20 12:53:28 -0600 )edit

can you please send me the ORB and LSH code on this e-mail :) [email protected]

RR7 ( 2017-11-09 05:21:46 -0600 )edit

add a comment

Locality Sensitivy Hashing in OpenCV for image processing

1 answer

Comments

Links

Question Tools

Stats

Related questions

Locality Sensitivy Hashing in OpenCV for image processing edit

1 answer

Comments

Links

Question Tools

Stats

Related questions

Locality Sensitivy Hashing in OpenCV for image processing