
BRISK+FREAK: Which Matcher and which "correct" filter logic to use

asked 2014-06-20 12:33:11 -0600

AGalluccio

updated 2014-06-20 12:42:36 -0600

Hi guys, I really need your help, please.

Before writing this post, I read many tutorials, ran many searches, and tried to understand all the concepts involved in computer vision. I had never heard of computer vision before, so I'm a newbie in this field.

Let me start by explaining the specific problem. I would like to know how similar two images are. I know there are a lot of posts/info/tutorials/sites about that, and I swear I've already read them all! For example, I started from these links (I think some of them could be interesting for all of you):

And some code examples:

Specifically: I have two photos of a dog. The first photo shows only the dog's face; the second shows the same dog lying on a green field. For me, the images are similar (it's the same dog).

Then I have a photo of another dog's face. It's a different dog, and I would like to filter this photo out as "not similar" (before answering "use BOW", please keep reading).

The OpenCV release is the latest (2.4.9).

I need help with SPECIFIC combinations of descriptor/extractor/matcher. I use BRISK as the detector (because it's light and free). I use FREAK as the descriptor (because it's lighter than the BRISK descriptor, because it's free, and because it's a float descriptor like BRISK).

I've read that other good combinations are ORB+ORB and BRISK+BRISK (I've never tried the first combination, but the second is slower than BRISK+FREAK).

Then I tried different kinds of matchers:
- BFMatcher (with crossCheck = true and no filter logic, and with crossCheck = false plus filter logic)
- Flann matcher

For the first one (BF) I tried both the match and knnMatch methods. For the second one I tried only the knnMatch method with different filter logic.
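
To make the cross-check option concrete, here is a minimal pure-Python sketch of what it does conceptually: a match (i, j) is kept only if j is the nearest neighbour of i and i is also the nearest neighbour of j. The helper names and the toy 2-byte descriptors are assumptions for illustration only; this is not OpenCV's implementation.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def nearest(query, train):
    """Index of the train descriptor closest to `query` (first one wins ties)."""
    return min(range(len(train)), key=lambda j: hamming(query, train[j]))


def cross_check_matches(desc_a, desc_b):
    """Keep (i, j, distance) only when i and j are each other's nearest neighbour."""
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j, hamming(d, desc_b[j])))
    return matches


# Toy 2-byte descriptors (real binary descriptors are much longer).
desc_a = [bytes([0b11110000, 0b00001111]), bytes([0b10101010, 0b10101010])]
desc_b = [bytes([0b11110001, 0b00001111]), bytes([0b00000000, 0b11111111])]

# Only the first pair is mutual; the second query is rejected by the cross-check.
print(cross_check_matches(desc_a, desc_b))  # [(0, 0, 1)]
```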

The problem is that the different filter logics "sometimes" don't work for all combinations in this photo set of two dogs. For example: one filter logic works well for finding the same dog across photos, but doesn't work if I try to match the face of the first dog against the second one (they are two different dogs!).

What I understand (or what I think I've understood) is that for each detector there is a "better" matcher (BF/Flann), and for each matcher (used with the previous combination) there is only ONE "good way" to pick the right matches from the knnMatch results.
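
One widely used filter logic for knnMatch results with k = 2 is a Lowe-style ratio test: keep the best match only if it is clearly closer than the second best. The sketch below works on plain (train_index, distance) pairs rather than OpenCV match objects; the function name, the input layout, and the 0.75 threshold are assumptions for illustration, not values prescribed anywhere in this thread.

```python
def ratio_test(knn_matches, ratio=0.75):
    """Keep the best match only when it is clearly better than the runner-up.

    `knn_matches` holds one entry per query descriptor; each entry is a list
    of (train_index, distance) pairs sorted by increasing distance.
    """
    good = []
    for query_index, candidates in enumerate(knn_matches):
        if len(candidates) < 2:
            continue  # no runner-up to compare against, so skip this query
        (best_idx, best_dist), (_, second_dist) = candidates[0], candidates[1]
        if best_dist < ratio * second_dist:
            good.append((query_index, best_idx, best_dist))
    return good


# Toy distances, made up purely to exercise the three cases.
knn = [
    [(4, 20.0), (9, 80.0)],  # distinctive: 20 < 0.75 * 80, kept
    [(2, 60.0), (7, 65.0)],  # ambiguous: 60 >= 0.75 * 65, dropped
    [(5, 10.0)],             # only one neighbour returned, dropped
]
good = ratio_test(knn)
print(good)  # [(0, 4, 20.0)]
```

The fraction of queries that survive the filter, `len(good) / len(knn)`, can then serve as a rough similarity score between the two images.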

So the question: which matcher should I use with ...




First of all - well done! You've done a lot of research by yourself and searched the web thoroughly before asking for advice.

Second, in my opinion, for the dogs problem you would need something much more powerful than matching keypoints. If you upload some example images, it will be easier to help.

By the way, FREAK and BRISK are not floating point descriptors.

GilLevi ( 2014-06-20 13:21:20 -0600 )

Hi Gil, I wrote a long answer to explain the problem better and to reply to both you and Guanta, but the forum doesn't allow me to post an answer before 48h. A very long post lost twice! So I'm posting a comment instead. You're right: one image is worth more than 1000 words. So in the next few days I will modify my code to produce the matched image (with the drawMatches method) to better explain the problem I have. I would also like to thank you for your great blog posts (I suppose they're yours), which helped me better understand some very important concepts. Your support is very important for people like me who try to learn and understand new concepts on their own. Thx again! Andrea

AGalluccio ( 2014-06-22 10:12:12 -0600 )

Those are indeed my posts, thank you for your kind words. Again, I would recommend uploading the input images, not just the drawMatches results, so we could better understand the problem.

GilLevi ( 2014-06-22 10:22:47 -0600 )

3 answers


answered 2014-06-20 13:51:09 -0600

Guanta

As GilLevi pointed out, FREAK and BRISK are binary descriptors. If you want to use BoW (and I guess this would be the easiest and most accurate way) then you'll need to make some extra effort to work with binary descriptors, see Note that for the BoW descriptor (i.e. the histograms) you will also have to change the nearest-neighbor matcher to use the Hamming distance. However, in most image classification tasks float descriptors like SIFT (typically densely sampled) are used for BoW, so is high speed really required? What is the actual application?
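
A minimal sketch of that nearest-neighbor step, assuming a vocabulary of binary visual words is already available: each descriptor is assigned to the word with the smallest Hamming distance, and the normalized counts form the BoW histogram. Descriptors are modelled as plain Python integers here, and the function names and 8-bit toy values are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


def bow_histogram(descriptors, vocabulary):
    """Normalized histogram of nearest visual words under Hamming distance."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        word = min(range(len(vocabulary)), key=lambda w: hamming(d, vocabulary[w]))
        counts[word] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)  # image with no descriptors
    return [c / total for c in counts]


# Toy 8-bit vocabulary and descriptors (illustrative values only).
vocabulary = [0b11110000, 0b00001111, 0b10101010]
descriptors = [0b11110001, 0b11100000, 0b00001110, 0b10101011]
print(bow_histogram(descriptors, vocabulary))  # [0.5, 0.25, 0.25]
```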



Hi Guanta, you're right: both are binary (and not float; it's important to specify that, in order to avoid possible misunderstandings for newbies like me :-). I would also like to thank you for your help. For this, I would like to share with you a very well done ppt tutorial about BOW (that I found during my long searches):

AGalluccio ( 2014-06-22 10:28:52 -0600 )

So I understand very well that I have a "classification" problem and the only good solution is BOW. But BOW works with float extractors (as I understand it) and ORB is binary (like BRISK & FREAK). Binary descriptors are faster than float ones (I read). So it's probably better to use a float descriptor/extractor with BOW. I'll do some more searching on that (I need algorithms that are not patented because I would like to develop commercial software). Thanks a lot again. I'll post my results as soon as I'm able to share my experience with all of you. Andrea

AGalluccio ( 2014-06-22 10:35:03 -0600 )

Well, actually only a specific part of SIFT is patented and it also depends on the country, so maybe the patent for you doesn't hold at all? Otherwise you may want to check out KAZE (another very good descriptor) of the opencv master branch (github) which doesn't have a patent and which has floating point precision.

Guanta ( 2014-06-22 13:13:49 -0600 )

Hi Guanta, thx a lot for the hint. I was thinking the same thing but with AKAZE (that should be part of OpenCV 2.4.9, am I right?). And what do you think about keeping BRISK+FREAK+FLANN (to stay fast and light), then converting the descriptors to float CV_32F and then to uchar (as described in this post: ), and then using these with BOW (to train the index and do the matching)? Before writing the implementation, I would like to know what you think of this way to proceed. I know it could be faster to implement KAZE (or AKAZE), but I read that its matching precision is not as good as BRISK or FREAK. Maybe I understood wrong?

AGalluccio ( 2014-06-22 15:50:27 -0600 )

Reading this (your) post, what I understand is that a "must-have" for building BOW is to have descriptors in uchar format and then compute the cluster centers on them manually. I read a couple of PDFs that describe how to do that, but I didn't understand much (too much mathematics behind it for me...). Could you suggest some links with pseudo-code for computing cluster centers? (Then I will use a specific language to implement it.) Thanks again, Andrea

AGalluccio ( 2014-06-22 15:57:37 -0600 )

No, (A)KAZE is not part of OpenCV 2.4.9 but of OpenCV's master branch. Also note AKAZE is the binary fast version of KAZE (which is a float descriptor). If I were you, I'd get the newest OpenCV code from github, use KAZE and then you can use OpenCV's BOWTrainer and BOWImgDescriptorExtractor without any modifications. Where did you read that KAZE's precision is worse than FREAK or BRISK? shows that KAZE performs well (however not tested in an image classification task). The link you posted is not my post and no, if you have binary descriptors you need special handling, since then each norm should be a Hamming norm, in contrast to float-based descriptors where you can use the L2 norm.
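
As a rough illustration of what such special handling could look like when computing cluster centers for binary descriptors: replace the L2 distance with the Hamming distance, and replace the mean with a per-bit majority vote (an approach sometimes called k-majority). This is only a sketch under those assumptions, not code from OpenCV; descriptors are modelled as integers, and all names and toy values are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


def majority_center(members, n_bits):
    """Per-bit majority vote: a bit is set if more than half the members set it."""
    center = 0
    for bit in range(n_bits):
        ones = sum((m >> bit) & 1 for m in members)
        if 2 * ones > len(members):
            center |= 1 << bit
    return center


def k_majority(descriptors, initial_centers, n_bits, iterations=10):
    """k-means-like loop using Hamming assignment and majority-vote updates."""
    centers = list(initial_centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for d in descriptors:
            nearest = min(range(len(centers)), key=lambda c: hamming(d, centers[c]))
            clusters[nearest].append(d)
        # An empty cluster keeps its previous center.
        new_centers = [
            majority_center(members, n_bits) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers


# Toy 8-bit descriptors forming two obvious groups (illustrative values only).
descriptors = [0b11110000, 0b11110001, 0b11100000,
               0b00001111, 0b00001110, 0b00011111]
centers = k_majority(descriptors, [0b11110001, 0b00011111], n_bits=8)
print([format(c, "08b") for c in centers])  # ['11110000', '00001111']
```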

Guanta ( 2014-06-23 05:02:50 -0600 )

Thanks a lot again, Guanta, for your kind hints and explanations. I will update this post after I've run some tests with this "bag of information". Andrea

AGalluccio ( 2014-06-23 09:55:07 -0600 )

answered 2014-06-28 15:05:00 -0600

AGalluccio

Hi guys, during this week I ran some more tests. The last step I took was a BOW implementation (but I stopped as soon as I read Guanta's answer about binary descriptors & BOW). I searched for BBOW (binary bag of words); I found some good PDFs, but no code implementation of BBOW.

So I changed my code from BRISK+FREAK+FLANN to ORB+ORB+BFMatcher(NORM_HAMMING), as described in this good Stack Overflow post, hoping (with high hopes) for better results.

But I had no success (sigh, sob). So I think the best way to proceed is to take Guanta's suggestion: download the master branch of OpenCV (rel 3.0) and use KAZE+BOW. I've uploaded an image that collects all the images I'm working on to find image similarities. I've grouped the images that are similar to me with borders of different colors (for me, similar means "same subject in different photos").

The best result I got (BEFORE the ORB implementation) was that the dog of set 1 is correctly matched with itself, but it's also matched with one photo of set 2. And if I use ONLY the three photos of set 2 and try to match them against each other, I never find a good match for any of them (but the subject is the same).

I get better results with the images in sets 3 and 4, but for set 3 not all images are matched correctly (this means that one of them is not selected as similar). So the question: is KAZE+BOW the best (& only) way to proceed in order to find similar images? Or is what I'm trying to achieve perhaps not possible?

Thanks a lot again, Andrea



It's very good that you finally uploaded some example images, now the problem is much clearer. Do you use color information or operate on the grayscale images? In any case, for such a hard problem I would recommend using a stronger approach: deep convolutional neural nets, for example.

GilLevi ( 2014-06-28 16:00:52 -0600 )

Hi Gil, all images are the same size (200x200) (I resize them before I start matching) and in grayscale format (converted using the standard OpenCV method for grayscale). I've never read about "deep convolutional neural nets". Is there an OpenCV implementation of such a technique? Or do I have to implement it on the knnMatch results matrix? Thanks, Andrea

AGalluccio ( 2014-06-29 03:22:46 -0600 )

Try incorporating color - I think the OpponentColorDescriptorExtractor can be used for that. In my experiments with BOW, color helped a lot. If that doesn't work, then perhaps you should try deep CNNs (I'll send you some links if you want). + wait for some more comments by other people
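
For reference, a minimal sketch of the opponent color transform that this kind of color descriptor is usually built on: the descriptor is computed on each of the three opponent channels and the results are concatenated. The formula below is the common textbook definition; the exact scaling OpenCV applies internally is an assumption here and may differ, and `rgb_to_opponent` is a hypothetical helper name.

```python
import math


def rgb_to_opponent(r, g, b):
    """Map an RGB pixel to opponent channels (O1, O2, O3)."""
    o1 = (r - g) / math.sqrt(2)          # red vs. green
    o2 = (r + g - 2 * b) / math.sqrt(6)  # yellow vs. blue
    o3 = (r + g + b) / math.sqrt(3)      # intensity
    return o1, o2, o3


# A gray pixel carries no color information, so only the intensity channel is non-zero.
print(rgb_to_opponent(100, 100, 100))
```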

GilLevi ( 2014-06-29 06:58:02 -0600 )

Sry GilLevi but I disagree here, deep CNN's are indeed a very powerful feature/classification setting but it very much depends on the amount of training data! If you don't have a huge training set (like Pascal or ImageNet) for training, then I doubt that a classical BoW is worse than deep CNNs are. Anyway, BoW is much easier to implement, so why not just try it? Once you have the basic system running you can then try to improve it, e.g. by using color features as GilLevi suggested.

Guanta ( 2014-06-29 16:18:53 -0600 )

Thanks for your comment @Guanta. There are some off-the-shelf networks which can be used to extract powerful features through the use of deep CNNs (for example: Decaf, Caffe, OverFeat), so he doesn't actually need to train a deep CNN (which indeed needs a lot of training data). He can just use the off-the-shelf CNNs to extract features and use those features to train an SVM, for example. That's what I did on the CIFAR benchmark and got really nice results.

GilLevi ( 2014-06-30 07:25:59 -0600 )

@GilLevi: ahh nice, good point!

Guanta ( 2014-06-30 12:19:31 -0600 )

answered 2014-07-06 06:28:03 -0600

AGalluccio

Thx a lot to both of you guys, you helped me a lot. This app will run on iPhone, so I think a CNN is too much (it's like opening a door with a bomb :-)). So my next steps will be to try to build the iOS framework with the latest version of OpenCV (3.0) (I've already read that it is not so "simple") and then try the KAZE+BOW combination. If I don't get good results with this technique either, then I will "put my hands up" and keep the "least bad" combination of descriptor/extractor/matcher.

I will update this post for sure, just to share my experience with all of you, but I'll do it in the coming months (I'm programming the app in my "free" time, which is not much).

'Til then, thx a lot again for your support. Andrea



Have you tried adding color to BOW?

GilLevi ( 2014-07-06 07:23:29 -0600 )

By the way, there's an SDK for CNNs that runs on iPhone:

GilLevi ( 2014-07-06 07:24:30 -0600 )

@GilLevi: Not yet. But I'll try when I use the KAZE descriptor. @Guanta: I'll take a look for sure in case BOW (without color) doesn't give good results. But can a CNN be used with binary descriptors/extractors?

AGalluccio ( 2014-07-06 16:35:22 -0600 )

No, CNN's are a whole different thing.

GilLevi ( 2014-07-06 16:55:04 -0600 )




Asked: 2014-06-20 12:33:11 -0600

Seen: 4,861 times

Last updated: Jul 06 '14