Can we use Bag of Visual Words to compute similarity between images directly?
I'm implementing a Content Based Image Retrieval application (CBIR).
I've read about the Bag of Features model, and in some applications it's considered an intermediate step: for example, the generated histograms can be fed to an SVM for classification.
Since the produced vectors suffer from the curse of dimensionality, computing some kind of distance between a query histogram and all the dataset histograms can become expensive. For this reason, techniques like LSH have been used for datasets with hundreds of thousands (or millions) of images.
However, my system is based on (let's say) 50k images, so computing the distances directly should not be prohibitive.
I have two questions:
- Is this approach reasonable? Recap: classic BoF approach, then compute the distance between each dataset histogram and the query histogram. The image whose histogram has the smallest distance is returned as the most similar one.
- Which distance should I use? I've heard that the chi-squared distance is a popular choice. What about Euclidean distance, cosine similarity, etc.? Which one is best in this context?
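The linear scan described above can be sketched in a few lines. This is a minimal illustration, not taken from the question's actual code: it assumes the BoF histograms are already computed and L1-normalized, uses NumPy, and picks the (symmetric) chi-squared distance as one of the candidate metrics; the function and variable names are my own.

```python
import numpy as np

def chi2_distance(h1, h2, eps=1e-10):
    # Symmetric chi-squared distance between two L1-normalized histograms.
    # eps avoids division by zero on bins that are empty in both histograms.
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def linear_scan(query_hist, dataset_hists):
    # Compare the query against every dataset histogram and return the
    # index of the most similar image (smallest distance).
    dists = [chi2_distance(query_hist, h) for h in dataset_hists]
    return int(np.argmin(dists))

# Toy example: k = 4 visual words, 3 database images (hypothetical data).
db = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])
q = np.array([0.60, 0.20, 0.10, 0.10])
print(linear_scan(q, db))  # index 0: the first histogram is closest
```

For 50k histograms this loop is O(n·k) per query, which is typically fast enough; swapping `chi2_distance` for another metric only requires changing that one function.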
I don't quite understand what you mean by "directly". Does it mean without BoW clustering, or using the image pixels instead of derived features like SIFT?
I tried to explain it in the question; I think I was not clear enough, sorry for that. What I mean is: can we use the histograms generated by BoF to compute the similarity between two images through some distance metric (which one is actually question 2)? For example: we have 50k images. We apply the BoF algorithm. Now we have 50k histogram vectors in `k` dimensions (where `k` is the k-means parameter). My question is: given a query image `q`, can I compare `q`'s histogram with all the other 50k histograms in order to find the most similar image in the dataset? If yes, which distance metric to use (point 2)? I'm sorry if I was not clear even this time.

Ah, sorry, I wasn't quite awake yet ;)
So, more like a "manual" nearest-neighbour search (e.g. compareHist(a, b, HISTCMP_CHISQR); in a loop) instead of applying an SVM or KNearest?
I don't think there's a canonical answer to this; you have to try (and no stone must be left unturned..)
SVM? I never used it (I'm not really a machine learning guy :) ), but I know it's for classification problems, not CBIR (I think they're quite different domains). Just out of curiosity! Btw, I'll try a linear-scan implementation on the histograms and come back with results! Most important: which distance should I use to compute the similarity between histograms?
Ah, right, SVM / KNearest was a wrong guess; it's not classification. (But maybe a flann::Index would do.)
Just try, but I'd guess that CORREL, INTERSECT and KL_DIV will be pretty bad, while CHISQR and HELLINGER should be quite OK.
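Since the Hellinger distance comes up as one of the recommended choices, here is a small sketch of it on toy histograms. This is my own NumPy rendition (not OpenCV's `compareHist`), assuming L1-normalized inputs; the histogram values are made up for illustration.

```python
import numpy as np

def hellinger(h1, h2):
    # Hellinger distance for L1-normalized histograms:
    # sqrt(1 - sum(sqrt(h1 * h2))). Ranges from 0 (identical)
    # to 1 (disjoint support). max(...) guards against tiny
    # negative values caused by floating-point rounding.
    return np.sqrt(max(0.0, 1.0 - np.sum(np.sqrt(h1 * h2))))

a = np.array([0.5, 0.5, 0.0, 0.0])
b = np.array([0.5, 0.5, 0.0, 0.0])
c = np.array([0.0, 0.0, 0.5, 0.5])
print(hellinger(a, b))  # 0.0: identical histograms
print(hellinger(a, c))  # 1.0: no overlapping bins
```

Unlike the Euclidean distance, Hellinger discounts large absolute differences in heavily populated bins, which is often why it (and chi-squared) behave better than plain L2 on visual-word histograms.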