Create features index (image database) and search (Python)
I have almost 3 million images of different objects. Now I'd like to build an index (database), storing the features of each image (SIFT or SURF). Using this index and a single image, I'd like to find the k most similar images (showing the exact same object). I have been able to extract features and compare single images to each other. But comparing image by image for such a huge database takes forever.
How would I build an index of all images and search this index?
This may be a trivial question for all the pros here, but I am very new to OpenCV (Python bindings) and very grateful for any help :) Thank you!
"This may be a trivial question" - no, it definitely isn't ;)
Ok :) Any advice on how to build the index (FLANN for example)? I've read much about it but couldn't actually find any code example.
unfortunately, opencv's python bindings only have a FLANN matcher, not the FLANN index ;(
From what I've read, OpenCV's Python bindings certainly do have the FLANN index (flann_Index). But it is undocumented, like many other OpenCV features.
ohh, sorry, you're right. for opencv3 this would be in cv2.flann. you could try:
Doesn't help :( In Opencv2.4 it's cv2.flann_Index:
Returns
But how do I add multiple images to this index? Do I simply have to concatenate the features of each image or is there an "add" function?
"Do I simply have to concatenate the features of each image " - yes, i think so. i've only tried the c++ version, but you will have to flatten/reshape each image to a single row, and build your feature matrix as a stack of those.
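To make the "stack of features" idea concrete, here is a minimal Python sketch using NumPy only. It assumes each image has already produced an (n_i x 128) descriptor array (SIFT-sized). The helper name `stack_descriptors` and the random stand-in descriptors are made up for illustration and are not part of OpenCV. Alongside the stacked matrix it keeps an `owner` array that records which image each row came from, since the stacked matrix alone no longer carries that information.

```python
import numpy as np

def stack_descriptors(per_image_descriptors):
    """Stack per-image descriptor arrays into one float32 matrix.

    Returns (features, owner), where owner[i] is the position in the
    input list of the image that produced row i of features.
    """
    rows, owners = [], []
    for image_id, desc in enumerate(per_image_descriptors):
        if desc is None or len(desc) == 0:
            continue  # an image without keypoints contributes no rows
        desc = np.asarray(desc, dtype=np.float32)
        rows.append(desc)
        owners.append(np.full(len(desc), image_id, dtype=np.int32))
    return np.vstack(rows), np.concatenate(owners)

# Stand-in data: three "images" with 5, 3 and 7 descriptors of length 128.
rng = np.random.default_rng(0)
fake = [rng.random((n, 128), dtype=np.float32) for n in (5, 3, 7)]
features, owner = stack_descriptors(fake)
```

The `features` matrix is the kind of input one would hand to an index such as the `cv2.flann_Index` mentioned above; `owner` is what later maps a matched row back to an image name or ID.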
Can you post a snippet of your C++ code? Maybe that helps.
I've read both threads. But: when I add the descriptors of each image, train the matcher, and search, how do I know which of the images matches the query image best? How do I get some kind of ranking and the image names or IDs? Or is this just for comparing one image to a group of images, telling you how well the query image matches the whole group?
ah, sorry, those links were about flann matcher not the index.
in 3.0, index->knnSearch() returns indices and distances
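One possible way to turn those indices into an image ranking is a simple vote: every query descriptor votes for the image that owns its nearest neighbour. The sketch below is only an illustration under a few assumptions. `knn_search` is a brute-force NumPy stand-in that returns indices and distances in the same (n_query x k) layout, not the FLANN call itself. `rank_images`, `owner` and the random data are hypothetical names and values chosen for the example.

```python
import numpy as np

def knn_search(features, query, k):
    """Brute-force k-nearest-neighbour search (stand-in for an index)."""
    diff = query[:, None, :] - features[None, :, :]
    d2 = (diff ** 2).sum(axis=2)              # squared L2 distances
    indices = np.argsort(d2, axis=1)[:, :k]
    dists = np.take_along_axis(d2, indices, axis=1)
    return indices, dists

def rank_images(indices, owner, n_images):
    """Count one vote per query descriptor for its nearest row's image."""
    votes = np.bincount(owner[indices[:, 0]], minlength=n_images)
    order = np.argsort(-votes, kind="stable")  # most votes first
    return order, votes

# Stand-in database: 3 images with 5, 3 and 7 descriptors each.
rng = np.random.default_rng(0)
counts = (5, 3, 7)
features = rng.random((sum(counts), 128), dtype=np.float32)
owner = np.repeat(np.arange(len(counts)), counts)

# Query: slightly perturbed copies of image 1's descriptors (rows 5..7).
noise = 0.01 * rng.standard_normal((3, 128)).astype(np.float32)
query = features[5:8] + noise

indices, dists = knn_search(features, query, k=2)
order, votes = rank_images(indices, owner, n_images=len(counts))
best_image = int(order[0])
```

With a real index, `indices` would come from `knnSearch` instead of `knn_search`, and `order[:k]` would be looked up in a list of file names to get the k best candidates. The distances could additionally be used to discard weak matches before voting.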