
Feature Matching; findObject: Concept behind MultiDetection

asked 2017-09-04 04:23:38 -0500

Franz Kaiser

updated 2017-09-04 07:02:38 -0500

Hello everyone, for the feature-matching module there is a great GUI interface from IntRoLab on GitHub.
Unfortunately I am not able to understand the idea behind the multi-detection from the source code. The concept is described here:

With the outliers of the RANSAC from findHomography, another matching is performed.

I don't get the following about this concept: with multiple objects in the scene, there are indeed a lot more keypoints from the scene than from the object, and the number of descriptors varies as well. But if I do a nearest-neighbor search I always get the same number of features to track. This seems logical, as it is the purpose of this step, but it makes the described concept of multi-detection impossible, doesn't it? Also, if I filter the descriptors (remove the ones which are already used), I can't be sure that all of them belong to one object, can I?

It would be nice if someone could explain the idea behind this concept in a bit more detail (in steps). Thanks a lot!

edit: I tried to implement the reduction: I removed the keypoints of the scene as pointed out by StevenPuttemans, but just one instance in the scene is found.

    int main()
    {
        std::string inputScene = "Scene.bmp";
        std::string inputObject = "Object.bmp";

        // LOAD IMAGES (as grayscale)
        cv::Mat objectImg = cv::imread(inputObject, cv::IMREAD_GRAYSCALE);
        cv::Mat sceneImg = cv::imread(inputScene, cv::IMREAD_GRAYSCALE);

        std::vector<cv::Point2f> objCorners(4), sceneCorners(4);
        objCorners[0] = cv::Point2f(0.f, 0.f);
        objCorners[1] = cv::Point2f((float)objectImg.cols, 0.f);
        objCorners[2] = cv::Point2f((float)objectImg.cols, (float)objectImg.rows);
        objCorners[3] = cv::Point2f(0.f, (float)objectImg.rows);
        cv::Mat showResImg;

        // DETECT KEYPOINTS
        std::vector<cv::KeyPoint> objectKeypoints;
        std::vector<cv::KeyPoint> sceneKeypoints;
        cv::Mat objectDescriptors;
        cv::Mat sceneDescriptors;
        int minHessian = 400;
        cv::Ptr<cv::FeatureDetector> detector = cv::xfeatures2d::SURF::create( minHessian );
        detector->detect(objectImg, objectKeypoints);
        detector->detect(sceneImg, sceneKeypoints);

        // COMPUTE DESCRIPTORS
        cv::Ptr<cv::DescriptorExtractor> extractor = cv::xfeatures2d::SIFT::create();
        extractor->compute(objectImg, objectKeypoints, objectDescriptors);
        extractor->compute(sceneImg, sceneKeypoints, sceneDescriptors);

        cv::Mat results, dists;
        int k = 2;

        // Create FLANN KD-tree index over the scene descriptors
        cv::flann::Index flannIndex(sceneDescriptors, cv::flann::KDTreeIndexParams(),
                                    cvflann::FLANN_DIST_EUCLIDEAN);

        // Search: k nearest neighbors for each object descriptor
        flannIndex.knnSearch(objectDescriptors, results, dists, k, cv::flann::SearchParams());

        // Find correspondences by NNDR (Nearest Neighbor Distance Ratio)
        std::vector<cv::Point2f> mpts_1, mpts_2; // used for homography
        std::vector<int> indexes_1, indexes_2;   // keypoint indices of the matches
        std::vector<uchar> outlier_mask;         // inlier/outlier mask from RANSAC

        float nndrRatio = 0.8f;
        for(int i = 0; i < objectDescriptors.rows; ++i)
        {
            // Apply NNDR: accept the match only if the best distance is
            // clearly smaller than the second-best distance
            if(results.at<int>(i,0) >= 0 && results.at<int>(i,1) >= 0 &&
               dists.at<float>(i,0) <= nndrRatio * dists.at<float>(i,1))
            {
                mpts_1.push_back(objectKeypoints.at(i).pt);
                indexes_1.push_back(i);
                mpts_2.push_back(sceneKeypoints.at(results.at<int>(i,0)).pt);
                indexes_2.push_back(results.at<int>(i,0));
            }
        }

        int nbMatches = 8;
        if((int)mpts_1.size() >= nbMatches)
        {
            cv::Mat H = cv::findHomography(mpts_1, mpts_2, cv::RANSAC, 1.0, outlier_mask);
            std::cout << "H: " << H << std::endl;

            // Do what you ...
        }
        return 0;
    }

1 answer


answered 2017-09-04 05:00:54 -0500

In principle the feature matchers in OpenCV look, using RANSAC, for the largest bundle of matches that can be grouped. This basically means you can only match one object at a time. What they do is remove those matched keypoints from the scene and then do the feature matching again, hopefully finding the second identical object. Of course you need to start from a large set of features. The number of returned features is not fixed; by default it is limited to 500, but that can be changed.
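The match-remove-repeat loop described above can be sketched with toy integer "descriptors" standing in for real SIFT/SURF vectors; the function names and the equality-based matching are made up for illustration, not OpenCV API:

```cpp
#include <cstddef>
#include <vector>

// One matching pass: for every object descriptor, find the first unused
// scene descriptor with the same value. Returns the matched scene indices.
static std::vector<std::size_t> matchOnce(const std::vector<int>& object,
                                          const std::vector<int>& scene,
                                          const std::vector<bool>& used) {
    std::vector<std::size_t> matched;
    std::vector<bool> taken = used; // don't reuse a scene entry within one pass
    for (int d : object) {
        for (std::size_t j = 0; j < scene.size(); ++j) {
            if (!taken[j] && scene[j] == d) {
                taken[j] = true;
                matched.push_back(j);
                break;
            }
        }
    }
    return matched;
}

// Multi-detection: re-run the matcher, masking out the scene descriptors
// consumed by each detection, until a pass yields too few matches.
int countInstances(const std::vector<int>& object,
                   const std::vector<int>& scene,
                   std::size_t minMatches) {
    std::vector<bool> used(scene.size(), false);
    int instances = 0;
    for (;;) {
        std::vector<std::size_t> matched = matchOnce(object, scene, used);
        if (matched.size() < minMatches) break;       // not enough support left
        ++instances;
        for (std::size_t j : matched) used[j] = true; // remove from next pass
    }
    return instances;
}
```

With an object `{1,2,3}` and a scene containing two copies of it, the loop terminates after two detections because the third pass no longer finds enough matches.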



Thank you for your help. I tried to implement your approach (see code in the original question), but either I have misunderstood your answer or I made a mistake, because even if I remove the keypoints from the scene, the same instance is found over and over again. At which point does the spatial grouping of the features take place? I always thought that just the distance of the descriptors, and not the position, is taken into account.

Franz Kaiser ( 2017-09-04 06:58:22 -0500 )

I guess you misunderstood. Your reference image has, for example, 500 features, while you detect 20,000 features in your scene. Then you perform matching and the first result is found. You now remove the, let's say, 250 matched features from your scene. Run matching again and now you have 20,000 - 250 possible matches. In this case you never return the same match again.

RANSAC applies not only a distance criterion but also looks for a uniform displacement between origin and scene.
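The "uniform displacement" idea can be illustrated with a minimal consensus search on a pure translation model; this is a drastic simplification of the homography RANSAC inside findHomography, and all names here are invented for the sketch:

```cpp
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// For each correspondence, hypothesize the translation it implies and count
// how many other correspondences agree within a tolerance. The hypothesis
// with the largest consensus set wins; its supporters are the "inliers".
std::vector<int> translationInliers(const std::vector<Pt>& object,
                                    const std::vector<Pt>& scene,
                                    double tol) {
    std::vector<int> best;
    for (std::size_t i = 0; i < object.size(); ++i) {
        double dx = scene[i].x - object[i].x; // hypothesized displacement
        double dy = scene[i].y - object[i].y;
        std::vector<int> support;
        for (std::size_t j = 0; j < object.size(); ++j) {
            double ex = scene[j].x - object[j].x - dx;
            double ey = scene[j].y - object[j].y - dy;
            if (ex * ex + ey * ey <= tol * tol) support.push_back((int)j);
        }
        if (support.size() > best.size()) best = support;
    }
    return best;
}
```

A correspondence whose descriptor distance is small but whose displacement disagrees with the dominant group is rejected here, exactly the geometric filtering that a pure nearest-neighbor search cannot provide.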

StevenPuttemans ( 2017-09-04 07:04:50 -0500 )

Thanks again. At the moment I:

  1. Extract the features as keypoints of the scene (2500) and the object (110)
  2. Compute the descriptors of the scene (2500) and the object (110)
  3. Apply the NN distance ratio test to scene and object (526) and (526)
  4. Compute the homography with RANSAC from the 526 resulting points of step 3 and save in a mask which points are outliers
  5. Remove all inliers from the scene feature points and repeat from step 2
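The removal in step 5 has a classic pitfall: the indices to drop are the scene-side indices of the RANSAC *inliers* (findHomography's mask marks inliers as nonzero), and erasing must happen in descending index order so the remaining indices stay valid. A sketch of that bookkeeping with plain values standing in for keypoints (the function name is made up; the same indices must also be removed from the scene descriptor matrix so keypoints and descriptors stay aligned):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Remove the scene entries that RANSAC marked as inliers. sceneIdx[i] is the
// scene index of match i; inlierMask[i] is nonzero for inliers. Erasing in
// descending index order keeps the not-yet-erased indices valid.
template <typename T>
void removeInliers(std::vector<T>& sceneItems,
                   const std::vector<int>& sceneIdx,
                   const std::vector<unsigned char>& inlierMask) {
    std::vector<int> toRemove;
    for (std::size_t i = 0; i < sceneIdx.size(); ++i)
        if (inlierMask[i]) toRemove.push_back(sceneIdx[i]);
    std::sort(toRemove.rbegin(), toRemove.rend());  // descending order
    toRemove.erase(std::unique(toRemove.begin(), toRemove.end()),
                   toRemove.end());                 // drop duplicate indices
    for (int idx : toRemove)
        sceneItems.erase(sceneItems.begin() + idx);
}
```

If only one instance is ever found, a likely cause is that the keypoints are removed but the FLANN index is rebuilt from an unchanged descriptor matrix, so the next pass matches the same rows again.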

With your approach, how should I identify the keypoints I have to remove?

Franz Kaiser ( 2017-09-04 07:33:07 -0500 )
