(This is a cross-post from StackOverflow, but I wanted to post here also)
I’m currently building software that can match infrared images taken with a thermographic camera against non-infrared images of the same scene taken from a fixed point.
The use case is the following: pictures of a fixed point are taken from a tripod with both an infrared thermographic camera and a standard camera. After taking the pictures, the photographer wants to match the images from the two cameras. There will be some scenarios where an image is taken with only one camera, because the other image type is unnecessary. Yes, it may be possible to match the images using timestamps, but the end user demands that they be matched using computer vision.
I've looked at other image-matching posts on StackOverflow; they often focus on histogram matching and feature detectors. Histogram matching is not an option here, as we cannot match colors between the two image types. As a result, I've developed an application that does feature detection. In addition to the standard feature detection, I've added logic which says that two keypoints cannot be a match if they are not within a certain margin of each other (a keypoint on the far left of the query image cannot match a keypoint on the far right of the candidate image); this filtering happens in stage 3 of the code below, and the idea is sketched in isolation right after this paragraph.
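To make that constraint concrete before the full listing, this is the idea behind the stage-3 check on its own (just a sketch: the helper name withinMargin is mine, and MARGIN is a pixel tolerance I define elsewhere in my project):

#include <cmath>
#include <opencv2/core/core.hpp>

// a match is kept only when its two keypoints lie within `margin` pixels
// of each other on both the x and y axes
bool withinMargin(const cv::Point2f &queryPt, const cv::Point2f &trainPt, int margin)
{
    return std::fabs(queryPt.x - trainPt.x) <= margin &&
           std::fabs(queryPt.y - trainPt.y) <= margin;
}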
To give you an idea of the current output, here are a valid and an invalid match it produced (note that the thermographic image is on the left). My objective is to improve the accuracy of the matching process.
Valid match:
Invalid match:
Here is the relevant code (OpenCV 2.x; MARGIN is a pixel-tolerance constant defined elsewhere):
// for each candidate image specified on the command line, compare it against the query image
Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE); // loading query image
for(int candidateImage = 0; candidateImage < (argc - 2); candidateImage++)
{
    Mat img2 = imread(argv[candidateImage + 2], CV_LOAD_IMAGE_GRAYSCALE); // loading candidate image
    if(img1.empty() || img2.empty())
    {
        printf("Can't read one of the images\n");
        return -1;
    }

    // detecting keypoints
    SiftFeatureDetector detector;
    vector<KeyPoint> keypoints1, keypoints2;
    detector.detect(img1, keypoints1);
    detector.detect(img2, keypoints2);

    // computing descriptors
    SiftDescriptorExtractor extractor;
    Mat descriptors1, descriptors2;
    extractor.compute(img1, keypoints1, descriptors1);
    extractor.compute(img2, keypoints2, descriptors2);

    // stage 1: matching descriptors -- the two nearest candidate descriptors per query descriptor
    BFMatcher matcher(NORM_L1);
    vector< vector<DMatch> > matches_stage1;
    matcher.knnMatch(descriptors1, descriptors2, matches_stage1, 2);

    // stage 2: use NNDR (nearest-neighbour distance ratio) to eliminate weak matches
    float nndrRatio = 0.80f;
    vector<DMatch> matches_stage2;
    for(size_t i = 0; i < matches_stage1.size(); ++i)
    {
        if(matches_stage1[i].size() < 2)
            continue;
        const DMatch &m1 = matches_stage1[i][0]; // best match
        const DMatch &m2 = matches_stage1[i][1]; // second-best match
        if(m1.distance <= nndrRatio * m2.distance)
            matches_stage2.push_back(m1);
    }

    // stage 3: eliminate matches whose keypoints are too far away from each other
    // (MARGIN is the pixel-tolerance constant defined elsewhere)
    vector<DMatch> matches_stage3;
    for(size_t i = 0; i < matches_stage2.size(); i++)
    {
        Point queryPt = keypoints1.at(matches_stage2.at(i).queryIdx).pt;
        Point trainPt = keypoints2.at(matches_stage2.at(i).trainIdx).pt;

        // order the coordinates so the smaller can be compared against the larger
        int lowestXAxis, greaterXAxis;
        if(queryPt.x <= trainPt.x) { lowestXAxis = queryPt.x; greaterXAxis = trainPt.x; }
        else                       { lowestXAxis = trainPt.x; greaterXAxis = queryPt.x; }

        int lowestYAxis, greaterYAxis;
        if(queryPt.y <= trainPt.y) { lowestYAxis = queryPt.y; greaterYAxis = trainPt.y; }
        else                       { lowestYAxis = trainPt.y; greaterYAxis = queryPt.y; }

        // the match is acceptable only if the keypoints are within MARGIN pixels on both axes
        bool acceptable = true;
        if((lowestXAxis + MARGIN) < greaterXAxis) { acceptable = false; }
        if((lowestYAxis + MARGIN) < greaterYAxis) { acceptable = false; }
        if(!acceptable) { continue; }

        // it's acceptable -- keep the match
        matches_stage3.push_back(matches_stage2.at(i));
    }

    // output how many good matches were found for this candidate image
    cout << "good matches found for candidate image # " << (candidateImage + 1)
         << " = " << matches_stage3.size() << endl;
}
I used this site's code as an example. The problem I'm having is that the feature detection is not reliable, and I seem to be missing the purpose of the NNDR ratio. I understand that I am finding the K best possible matches for each keypoint in the query image, and that I have K = 2, but I don't understand the purpose of this part of the example code:
vector<DMatch> matches_stage2;
for(size_t i = 0; i < matches_stage1.size(); ++i)
{
    if(matches_stage1[i].size() < 2)
        continue;
    const DMatch &m1 = matches_stage1[i][0];
    const DMatch &m2 = matches_stage1[i][1];
    if(m1.distance <= nndrRatio * m2.distance)
        matches_stage2.push_back(m1);
}
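For what it's worth, here is my current mental model of the knnMatch output that this loop consumes, written as a small debugging fragment (it reuses matches_stage1 from the listing above; please correct me if this reading is wrong):

// my understanding: for query descriptor i, knnMatch(..., 2) returns the two
// closest candidate descriptors sorted by distance, so [0] is the best match
// and [1] is the runner-up
for(size_t i = 0; i < matches_stage1.size() && i < 5; ++i)
{
    if(matches_stage1[i].size() < 2)
        continue;
    cout << "query keypoint " << matches_stage1[i][0].queryIdx
         << ": best distance = " << matches_stage1[i][0].distance
         << ", second-best distance = " << matches_stage1[i][1].distance << endl;
}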
Any ideas on how I can improve this further? Any advice would be appreciated as always.