# Evaluation of Interest-Point Detectors and Descriptors

I'm working on a qualitative evaluation study of interest-point detectors and descriptors. I've read all of the Mikolajczyk et al. papers as well as most of the surveys by Datta et al. and others. Now I'm implementing an evaluation tool with OpenCV. It takes two images: one referred to as the source image and one as the comparison image.

1.) Detectors: Mikolajczyk uses the repeatability criterion, the correspondence count, the matching score and another metric to evaluate the performance of a detector. I would use repeatability and the correspondence count for matched regions.

2.) Descriptors: Here I would use the widely used recall and precision on the matched regions to describe the performance of the descriptor.
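
To make the detector metrics in 1.) concrete, here is a rough sketch of how I understand repeatability and correspondence count. It is a simplification under my own assumptions: it compares point positions with a pixel tolerance instead of Mikolajczyk's region-overlap error, and it does not restrict the keypoints to the part of the scene visible in both images. The names `Pt`, `Homography`, `scoreDetector` and `tolerancePx` are made up for illustration and are not OpenCV API.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };
using Homography = std::array<double, 9>;  // row-major 3x3, image 1 -> image 2

// Map a point from image 1 into image 2 (with perspective division).
Pt project(const Homography& h, const Pt& p) {
    const double w = h[6] * p.x + h[7] * p.y + h[8];
    Pt q = { (h[0] * p.x + h[1] * p.y + h[2]) / w,
             (h[3] * p.x + h[4] * p.y + h[5]) / w };
    return q;
}

struct DetectorScore { int correspondences; double repeatability; };

// Hypothetical helper: counts keypoints of image 1 that land within
// tolerancePx of a not-yet-used keypoint of image 2 after projection.
DetectorScore scoreDetector(const std::vector<Pt>& pts1,
                            const std::vector<Pt>& pts2,
                            const Homography& h12, double tolerancePx) {
    assert(tolerancePx >= 0.0);
    std::vector<bool> used(pts2.size(), false);
    int correspondences = 0;
    for (std::size_t i = 0; i < pts1.size(); ++i) {
        const Pt q = project(h12, pts1[i]);
        // Greedily pick the nearest still-unused point of image 2 within tolerance.
        double best = tolerancePx;
        std::size_t bestIdx = pts2.size();  // sentinel: nothing found
        for (std::size_t j = 0; j < pts2.size(); ++j) {
            if (used[j]) continue;
            const double d = std::hypot(pts2[j].x - q.x, pts2[j].y - q.y);
            if (d <= best) { best = d; bestIdx = j; }
        }
        if (bestIdx < pts2.size()) { used[bestIdx] = true; ++correspondences; }
    }
    // Repeatability: correspondences relative to the smaller keypoint set.
    const std::size_t smaller = std::min(pts1.size(), pts2.size());
    DetectorScore s = { correspondences,
                        smaller == 0 ? 0.0
                                     : static_cast<double>(correspondences) /
                                           static_cast<double>(smaller) };
    return s;
}
```

Because every keypoint of image 2 can be used only once, the repeatability in this sketch stays between 0 and 1.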

My question so far: Are both of these good metrics for the evaluation?
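
For the descriptor side, this is a minimal sketch of recall and precision as I read them in Mikolajczyk and Schmid's descriptor evaluation: recall is the number of correct matches divided by the number of ground-truth correspondences, and 1-precision is the number of false matches divided by all matches. The struct and function names are hypothetical, and the three counts are assumed to come from comparing the matcher output against the homography.

```cpp
#include <cassert>

struct MatchScore {
    double recall;        // correct matches / ground-truth correspondences
    double precision;     // correct matches / all matches
    double onePrecision;  // false matches / all matches
};

// Hypothetical helper: turns raw counts into recall and precision.
MatchScore scoreMatches(int correctMatches, int falseMatches, int correspondences) {
    assert(correctMatches >= 0 && falseMatches >= 0 && correspondences >= 0);
    const int allMatches = correctMatches + falseMatches;
    MatchScore s = { 0.0, 0.0, 0.0 };
    if (correspondences > 0) {
        s.recall = static_cast<double>(correctMatches) / correspondences;
    }
    if (allMatches > 0) {
        s.precision = static_cast<double>(correctMatches) / allMatches;
        s.onePrecision = static_cast<double>(falseMatches) / allMatches;
    }
    return s;
}
```

Sweeping the matching threshold and plotting recall against 1-precision would then give the usual curve for each descriptor.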

Now I'm trying to implement this in OpenCV and need a good eye from somebody to tell me whether this code will do. I use the SURF detector and descriptor here to test the metrics. Recall and precision are not implemented yet, but there is a function that calculates them.

    #include <iostream>
    #include <vector>
    #include <opencv2/opencv.hpp>             // core, features2d, calib3d, highgui
    #include <opencv2/nonfree/features2d.hpp> // SURF (OpenCV 2.4 nonfree module)

    // Both images are assumed to be defined and loaded elsewhere
    // (e.g. with cv::imread) before startSURF() is called.
    extern cv::Mat sourceImage, comparisonImage;

    int startSURF() {
        std::cout << "Starting" << std::endl;

        if (!sourceImage.data) {
            std::cout << "Source-Image empty" << std::endl;
            return -1;
        } else if (!comparisonImage.data) {
            std::cout << "Comparison-Image empty" << std::endl;
            return -1;
        }

        // Detect keypoints with SURF
        const int minHessian = 400;
        std::vector<cv::KeyPoint> sourceKeypoints, comparisonKeypoints;
        cv::SurfFeatureDetector surfDetect(minHessian);
        surfDetect.detect(sourceImage, sourceKeypoints);
        surfDetect.detect(comparisonImage, comparisonKeypoints);

        // Compute the SURF descriptors (one row per keypoint)
        cv::Mat sourceDescriptors, comparisonDescriptors;
        cv::SurfDescriptorExtractor surfExtractor;
        surfExtractor.compute(sourceImage, sourceKeypoints, sourceDescriptors);
        surfExtractor.compute(comparisonImage, comparisonKeypoints,
                              comparisonDescriptors);

        // FLANN matching: source descriptors are the query set,
        // comparison descriptors are the train set
        cv::FlannBasedMatcher flann;
        std::vector<cv::DMatch> matches;
        flann.match(sourceDescriptors, comparisonDescriptors, matches);

        // Collect the matched point pairs for the homography estimation
        std::vector<cv::Point2f> srcK;
        std::vector<cv::Point2f> refK;
        for (size_t i = 0; i < matches.size(); i++) {
            srcK.push_back(sourceKeypoints[matches[i].queryIdx].pt);
            // trainIdx (not queryIdx) indexes the comparison keypoints
            refK.push_back(comparisonKeypoints[matches[i].trainIdx].pt);
        }

        // A homography needs at least four point pairs
        if (srcK.size() < 4) {
            std::cout << "Not enough matches for a homography" << std::endl;
            return -1;
        }

        // Repeatability and correspondence count
        const double ransacReprojThreshold = 3.0; // OpenCV default
        std::cout << "< Computing homography via RANSAC. Threshold is "
                  << ransacReprojThreshold << std::endl;
        cv::Mat h12 = cv::findHomography(srcK, refK, CV_RANSAC,
                                         ransacReprojThreshold);

        float repeatability;
        int corrCounter;
        cv::evaluateFeatureDetector(sourceImage, comparisonImage, h12,
                                    &sourceKeypoints, &comparisonKeypoints,
                                    repeatability, corrCounter);

        std::cout << "repeatability = " << repeatability << std::endl;
        std::cout << "correspCount = " << corrCounter << std::endl;
        std::cout << ">" << std::endl;

        std::cout << "Done." << std::endl;
        return 0;
    }


I'm not sure whether this code works, because SURF gets a very low repeatability (e.g. 0.00471577) on my test images, which are rotated by almost 45°. Does anybody see a problem with the code?

Is there a way to evaluate the detector without RANSAC? I did not find an existing implementation for this. Is the default of 3 a good threshold? I could override it, but a good threshold can only be determined experimentally, and I need a robust default value for all detectors.

I think I definitely need the homography. But I never found a way ...
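
The only RANSAC-free alternative I can think of applies when the transformation between the two images is known in advance, for example when the comparison image is produced by rotating the source image by a known angle. The sketch below writes that homography down directly. It assumes a rotation (plus optional scale) around a given center with no extra cropping or canvas offset, and it follows the same sign convention as `cv::getRotationMatrix2D` (positive angle means counter-clockwise with the origin at the top-left). `rotationHomography` is a hypothetical helper, not an OpenCV function.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Homography = std::array<double, 9>;  // row-major 3x3

// Hypothetical helper: homography for a rotation by angleDeg (and uniform
// scale) around the point (cx, cy), e.g. the image center.
Homography rotationHomography(double angleDeg, double cx, double cy,
                              double scale = 1.0) {
    assert(scale > 0.0);
    const double kPi = 3.14159265358979323846;
    const double rad = angleDeg * kPi / 180.0;
    const double a = scale * std::cos(rad);
    const double b = scale * std::sin(rad);
    // The top two rows are the usual 2x3 affine rotation matrix;
    // the third row (0, 0, 1) turns it into a homography.
    Homography h = {{  a, b, (1.0 - a) * cx - b * cy,
                      -b, a, b * cx + (1.0 - a) * cy,
                      0.0, 0.0, 1.0 }};
    return h;
}
```

The nine values could then be copied into a 3x3 `CV_64F` `cv::Mat` and passed to `cv::evaluateFeatureDetector` in place of the RANSAC estimate. This only helps for synthetic test pairs, though; for real image pairs the ground-truth homography has to come from somewhere else.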
