Find image from a database of images

flann
python

asked 2017-09-28 17:31:01 -0600

Hello everyone, I'm taking my first steps with OpenCV in Python.
Given an image, I'd like to find its "original" in a collection of reference images. To be clear, the query image is a plain photo of the whole image (a card), so this is not the "find an object inside a photo" scenario, but "just" a similarity test.
My final database will be fairly large (about 25 000 images), but I started testing on a smaller scale (only 270 images).
Recognition works perfectly, but it is slow: it takes 8 seconds to iterate over all 270 images. I sped things up by saving the descriptors to disk and loading them instead of recomputing them, but it is still slow.

So I started working with FLANN. I get some results, but my main problem is finding the matching image: I get a whole array of points, but I don't know how to get from them to the right image.
This is my code:

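import os

import cv2
import numpy as np

scanned = 'tests/temp_bw.png'
# SURF with a Hessian threshold of 400; upright mode skips orientation estimation
surf = cv2.xfeatures2d.SURF_create(400)
surf.setUpright(True)

# Query image, loaded as grayscale
img1 = cv2.imread(scanned, 0)
kp1, des1 = surf.detectAndCompute(img1, None)

FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)

# Stack the descriptors of every reference image into one big array
des_list = []
for filename in os.listdir('images'):
    img2 = cv2.imread(os.path.join('images', filename), 0)
    kp2, des2 = surf.detectAndCompute(img2, None)
    if des2 is not None:  # skip images with no detected features
        des_list.append(des2)
des_all = np.concatenate(des_list)

flann = cv2.flann.Index()
print("Training...")
flann.build(des_all, index_params)
print("Matching...")
# knnSearch returns (indices, dists): rows of des_all and their distances
indexes, dists = flann.knnSearch(des1, 10)
# and now???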

Any suggestions on how I can trace the results back to the most similar image?


Comments

BOW

berak ( 2017-09-28 21:52:10 -0600 )

I think I got what you mean. However, just to be sure: I want to know whether the query image is very similar to train image X. To clarify: I want to know whether the query image is the Lord Of The Rings book, not whether it is a book rather than a hammer (since all images will be books). Should I create a category for each entry of my train set?

tampe125 ( 2017-09-29 01:15:31 -0600 )

your flann index does "unsupervised" clustering, it does not know about categories or class labels

(if you need that, rather use KNearest, or SVM)

berak ( 2017-09-29 01:41:59 -0600 )

also, your current code matches descriptors, not images (the way you do it now, you lose the information about which image each train descriptor originally belonged to)

berak ( 2017-09-29 04:12:28 -0600 )
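
One way to keep the descriptor-to-image information is a parallel array that records, for each row of the stacked descriptor array, the position of the image it came from. The following is a minimal sketch using NumPy only. The function names `build_owner_array` and `vote_for_image` and the ratio threshold of 0.7 are illustrative assumptions, not part of the original code. `indices` and `dists` stand for the two arrays returned by `knnSearch` with 2 neighbours per query descriptor.

```python
import numpy as np


def build_owner_array(descriptor_counts):
    """Map each stacked descriptor row to the position of its source image.

    descriptor_counts[i] is the number of descriptors extracted from image i,
    in the same order in which the descriptors were concatenated.
    """
    counts = np.asarray(descriptor_counts, dtype=np.int64)
    return np.repeat(np.arange(len(counts)), counts)


def vote_for_image(indices, dists, owner, n_images, ratio=0.7):
    """Return (best image position or None, vote counts per image).

    A query descriptor votes for the image that owns its nearest neighbour,
    but only if the nearest distance is clearly smaller than the second one.
    The ratio value is a guess and needs tuning (more so if dists are squared).
    """
    indices = np.asarray(indices)
    dists = np.asarray(dists, dtype=np.float64)
    good = dists[:, 0] < ratio * dists[:, 1]
    votes = np.bincount(owner[indices[good, 0]], minlength=n_images)
    best = int(np.argmax(votes)) if votes.any() else None
    return best, votes
```

If the file names are kept in a list in the same order as the images were processed, the returned position indexes straight into that list. Thresholding the winning vote count is one possible way to reject queries that match nothing in the database.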

Given the size of the database, what would be the best approach? Try BOW with KNearest, or keep track of the descriptor-to-image relationship? I'm afraid the latter will require a huge amount of disk space.

tampe125 ( 2017-09-29 05:14:43 -0600 )
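
The disk-space concern can be estimated with some quick arithmetic. The sketch below assumes 64-dimensional float32 SURF descriptors (the non-extended default) and a hypothetical average of 500 descriptors per image. The real count depends on the images and on the Hessian threshold, so the numbers are only an order-of-magnitude guide. It also counts one extra int32 per descriptor to record the source image.

```python
def storage_estimate(n_images, desc_per_image, desc_dim=64, bytes_per_value=4):
    """Return (bytes for descriptors, bytes for per-descriptor image ids)."""
    n_desc = n_images * desc_per_image
    descriptor_bytes = n_desc * desc_dim * bytes_per_value
    owner_bytes = n_desc * 4  # one int32 image id per descriptor
    return descriptor_bytes, owner_bytes


# 500 descriptors per image is an assumed figure, not a measurement
desc_bytes, owner_bytes = storage_estimate(25000, 500)
print("descriptors: %.2f GB" % (desc_bytes / 1e9))
print("image ids:   %.2f GB" % (owner_bytes / 1e9))
```

Under these assumptions the descriptors themselves dominate (about 3.2 GB), while the image ids add roughly 0.05 GB. The cost comes from storing raw descriptors at all, not from tracking which image they belong to.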

the bow idea would at least solve one problem: you calculate 1 fixed-size bow feature vector per image (which is also needed for any kind of ml)

if you throw that at a flann index, you get image indices (not feature indices).

you will still have to make up your mind: if you need categories, then you need something that does classification

berak ( 2017-09-29 08:34:26 -0600 )
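
The BOW idea can be sketched with NumPy alone: each image becomes one normalised histogram of "visual words", so a nearest-neighbour search over the histograms returns image positions directly. This is a simplified illustration, not OpenCV's own implementation. The function names are hypothetical, and the vocabulary is assumed to be already available as an array of cluster centres, for example from k-means over a sample of descriptors (OpenCV's `cv2.BOWKMeansTrainer` is one option for that step).

```python
import numpy as np


def bow_histogram(descriptors, vocabulary):
    """Encode one image as a normalised histogram of visual words.

    descriptors: (n, d) array for one image.
    vocabulary:  (k, d) array of cluster centres (assumed precomputed).
    """
    descriptors = np.asarray(descriptors, dtype=np.float64)
    vocabulary = np.asarray(vocabulary, dtype=np.float64)
    # Squared distances via ||a||^2 - 2ab + ||b||^2, shape (n, k)
    d2 = ((descriptors ** 2).sum(axis=1)[:, None]
          - 2.0 * np.dot(descriptors, vocabulary.T)
          + (vocabulary ** 2).sum(axis=1)[None, :])
    words = d2.argmin(axis=1)  # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float64)
    return hist / max(hist.sum(), 1.0)


def nearest_image(query_hist, train_hists):
    """Return the position of the train histogram closest to the query (L2)."""
    diffs = np.asarray(train_hists, dtype=np.float64) - query_hist
    return int(np.argmin((diffs ** 2).sum(axis=1)))
```

With 25 000 images the brute-force search in `nearest_image` could be swapped for a FLANN index built over the histogram rows. Whether a single histogram is discriminative enough to tell very similar cards apart would have to be checked on real data.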

btw: https://github.com/opencv/opencv_cont...

berak ( 2017-09-30 01:31:56 -0600 )
