DescriptorMatcher reports good matches even when the object is not in the scene
I'm trying to detect a known object by comparing the current frame in a video against pre-stored feature descriptors. My thought is to match the features of the current frame against the prestored features, and report a positive detection when the number of good features is above some threshold.
This, however, doesn't seem to work because DescriptorMatcher will always report a certain fixed number of matches regardless of whether the object is actually in the scene. Even though I use a pretty conventional filtering approach to keep the top x number of good matches, it's still a metric that's relative to the current frame.
Is there something like a goodness-of-match measure from DescriptorMatcher that I can potentially use as a hard threshold? Is there a better approach to this? I have used Bag of Words before, but it seems a bit overkill for the problem at hand, and it's also too computationally expensive for my needs. Any suggestions/tips/pointers would be appreciated.
ImageDescriptor imageDescriptor = ImageDescriptor.fromJson(jsonMetadata);
Mat storedDescriptors = imageDescriptor.getFeatureDescriptors(); // prestored features
FeatureDetector featureDetector = FeatureDetector.create(FeatureDetector.ORB);
DescriptorExtractor descriptorExtractor = DescriptorExtractor.create(DescriptorExtractor.ORB);
DescriptorMatcher descriptorMatcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);
MatOfKeyPoint keyPoints = new MatOfKeyPoint();
featureDetector.detect(rgba, keyPoints); // rgba is the image from current video frame
MatOfDMatch matches = new MatOfDMatch();
Mat currDescriptors = new Mat();
descriptorExtractor.compute(rgba, keyPoints, currDescriptors);
// query = descriptors of the current frame, train = prestored descriptors
descriptorMatcher.match(currDescriptors, storedDescriptors, matches);
// filterMatches returns the matches with distance < 2.5 * min_distance
MatOfDMatch good_matches = filterMatches(matches);
return good_matches.rows() > threshold;
it's a common misunderstanding, but descriptor matching does not know anything about objects.
it just tries to find corresponding keypoints between 2 views of a scene.
in other words, using it to detect a certain kind of object might be the wrong idea.
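To make this concrete: a nearest-neighbour matcher returns the closest stored descriptor for every query descriptor, so the number of matches equals the number of query descriptors whether or not the object is present. What does change is the distance of each match (in OpenCV, the `distance` field of each `DMatch`, which is a Hamming distance for ORB with BRUTEFORCE_HAMMING). Below is a rough sketch in plain Java, without OpenCV, of counting matches against an absolute distance cutoff instead of one relative to the frame's minimum. The class `HammingMatchDemo`, its methods, the toy 16-bit descriptors and the cutoff of 4 are all made up for illustration; a usable cutoff for real 256-bit ORB descriptors would have to be found experimentally.

```java
// Hypothetical helper, not part of OpenCV: mimics brute-force Hamming
// matching on binary descriptors stored as byte arrays.
final class HammingMatchDemo {

    // Number of differing bits between two equal-length binary descriptors.
    static int hamming(byte[] a, byte[] b) {
        int d = 0;
        for (int i = 0; i < a.length; i++) {
            d += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return d;
    }

    // Distance to the nearest stored descriptor for every query descriptor.
    // Always yields exactly one entry per query, however poor the match is.
    static int[] nearestDistances(byte[][] query, byte[][] stored) {
        int[] best = new int[query.length];
        for (int q = 0; q < query.length; q++) {
            best[q] = Integer.MAX_VALUE;
            for (byte[] s : stored) {
                best[q] = Math.min(best[q], hamming(query[q], s));
            }
        }
        return best;
    }

    // Count matches whose distance is at or below an absolute cutoff.
    static int countBelow(int[] distances, int maxDistance) {
        int n = 0;
        for (int d : distances) {
            if (d <= maxDistance) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        byte[][] stored    = { {0x00, 0x00}, {(byte) 0xFF, (byte) 0xFF} };
        byte[][] similar   = { {0x01, 0x00}, {(byte) 0xFF, (byte) 0xFE} };
        byte[][] unrelated = { {0x0F, (byte) 0xF0}, {(byte) 0xF0, 0x0F} };

        int[] dSimilar = nearestDistances(similar, stored);
        int[] dUnrelated = nearestDistances(unrelated, stored);

        // Both frames produce the same number of raw matches...
        System.out.println(dSimilar.length + " vs " + dUnrelated.length);
        // ...but only the similar one survives an absolute cutoff.
        System.out.println(countBelow(dSimilar, 4) + " vs " + countBelow(dUnrelated, 4));
    }
}
```

Even with an absolute cutoff, this only says that some keypoints look alike, not that they belong to the same object, so it should be treated as a coarse filter rather than a detector.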