TrackWithOpticalFlow Alternatives

asked 2016-11-29 03:19:44 -0500

Vintez

updated 2016-11-30 02:53:56 -0500

EDIT: Added berak's suggestion of downscaling/upscaling. EDIT: Added further changes, inspired by Tetragram.

Inspired by the MathWorks tutorial for augmented reality, I wanted to create a similar application for Android that implements both recognition and tracking.

After some research I saw that the pointTracker from MathWorks uses the KLT algorithm, which is also implemented in OpenCV as calcOpticalFlowPyrLK. I've implemented the algorithm so that it takes the last frame in which I recognized my points and tries to estimate their new positions in the current frame with this method:

int BruteForceMatcher::trackWithOpticalFlow(std::vector<cv::Mat> prevPyr, std::vector<cv::Mat> nextPyr, std::vector<cv::Point2f> &srcPoints, std::vector<cv::Point2f> &srcCorners){

    std::vector<cv::Point2f> estPoints;
    std::vector<cv::Point2f> estCorners;
    std::vector<cv::Point2f> goodPoints;
    std::vector<cv::Point2f> leftsrc;
    std::vector<uchar> status;
    std::vector<float> error;

    if(srcPoints.size() > 0) {

        cv::calcOpticalFlowPyrLK(prevPyr, nextPyr, srcPoints, estPoints, status, error);

        for (size_t i = 0; i < estPoints.size(); i++) {
            // keep only points whose flow was found (status) and whose error is small
            if (status[i] && error[i] < 20.f) {
                //LOGW("ERROR : %f\n", error[i]);
                //upscaling of the Points (on copies, so srcPoints keeps its scale)
                goodPoints.push_back(estPoints[i] * 4.f);
                leftsrc.push_back(srcPoints[i] * 4.f);
            }
        }

        //LOGD("Left Points (est/src): %i, %i", goodPoints.size(), leftsrc.size());

        // findHomography needs at least 4 point pairs
        if(goodPoints.size() < 4){
            //LOGD("No good Points calculated");
            return 0;
        }

        cv::Mat f = cv::findHomography(leftsrc, goodPoints);

        if(f.empty() || cv::countNonZero(f) < 1){
            //LOGD("Homography Matrix is empty!");
            return 0;
        }

        cv::perspectiveTransform(srcCorners, estCorners, f);

        return static_cast<int>(srcPoints.size());
    }

    return 0;
}

And the method that is called through a JNICALL:

std::vector<cv::Point2f> findBruteForceMatches(cv::Mat img){

    int matches = 0;
    std::vector<cv::Point2f> ransacs;
    BruteForceMatcher bruteForceMatcher;   // stack object, so no manual delete is needed
    double tf = cv::getTickFrequency();

    if(trackKLT){
        // tracking phase: build a pyramid for the downscaled current frame only
        std::vector<cv::Mat> nextPyr;
        cv::resize(img, img, cv::Size(img.cols/4, img.rows/4));
        cv::buildOpticalFlowPyramid(img, nextPyr, cv::Size(8,8), 3);

        double kltTime = (double) cv::getTickCount();

        matches = bruteForceMatcher.trackWithOpticalFlow(prevPyr, nextPyr, srcPoints, scene_corners);

        kltTime = (double) cv::getTickCount() - kltTime;
        LOGD("KLT Track Time: %f\n", kltTime*1000./tf);

        // keep tracking only while enough points survive
        trackKLT = (matches > 10);
        return scene_corners;

    } else{
        // recognition phase: full brute-force matching
        double bfMatchTime = (double) cv::getTickCount();

        matches = bruteForceMatcher.findMatchesBF(img, features2d, descriptors, scene_corners, ransacs);

        bfMatchTime = (double) cv::getTickCount() - bfMatchTime;
        LOGD("BruteForceMatch Time: %f\n", bfMatchTime*1000./tf);
        if(matches > 3){

            trackKLT = true;
            cv::resize(img, img, cv::Size(img.cols/4, img.rows/4));
            cv::buildOpticalFlowPyramid(img, prevPyr, cv::Size(8,8), 3);

            // bring the matched points down to the quarter-size tracking scale
            for(size_t i = 0; i < ransacs.size(); i++){
                ransacs[i] *= 0.25;
            }
        }
        return scene_corners;
    }
}

Unfortunately this method takes about 200 ms per frame (~5 fps), which is too slow for my application. Is there any similar algorithm that could track a couple of points in an image? Or is there a way to speed up my algorithm?

In a paper I read, that they ... (more)




is there a way, to speed up my algorithm?

downscaling by a factor of 2 in each dimension leaves a quarter of the pixels, so it should run roughly 4 times faster

how do you get your src points?

berak ( 2016-11-29 03:49:37 -0500 )

In the recognition phase. There I detect keypoints with FAST on the current blurred frame and compute the descriptors of ~100 keypoints. These are then matched with the OpenCV BruteForceMatcher, and outliers are removed by distance and RANSAC. The remaining points are stored in a global variable, srcPoints. Regarding your comment: by resizing, do you mean both the current frame and the previous frame? And after the calculation I would need to transform the coordinates back, wouldn't I?

Vintez ( 2016-11-29 03:54:34 -0500 )

in theory, calcOpticalFlowPyrLK can also take image pyramids as input (so you only have to calculate a new pyramid for the current frame, not for both), but no idea if that works in java

yes, you have to downscale both frames and input points, and later upscale the output points again.

berak ( 2016-11-29 03:58:12 -0500 )
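The downscale/upscale step for the points can be sketched as follows. This is only an illustration under assumptions: the `Point2f` struct is a hypothetical stand-in for `cv::Point2f` so that the snippet compiles without OpenCV, and `scalePoints` is a made-up helper name. It returns scaled copies, so the original points keep their scale.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Point2f so the sketch compiles without OpenCV.
struct Point2f {
    float x;
    float y;
};

// Returns a copy of `pts` with every coordinate multiplied by `factor`.
// The input vector is left untouched.
std::vector<Point2f> scalePoints(const std::vector<Point2f>& pts, float factor) {
    assert(factor > 0.f);
    std::vector<Point2f> out;
    out.reserve(pts.size());
    for (const Point2f& p : pts) {
        out.push_back({p.x * factor, p.y * factor});
    }
    return out;
}
```

With frames resized to a quarter of their size, the input points would be scaled by 0.25 before tracking and the tracked points by 4 afterwards.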

Actually I don't build an image pyramid. As I understood the method, it can take either an 8-bit image or an image pyramid, so I don't build a pyramid for either image. Also, I use native C++ on Android; Java is only involved to pass the frame and receive the calculated scene_corners. I also wasn't quite sure how to downscale the input points and upscale the output again, but I've already found that.

Vintez ( 2016-11-29 04:06:00 -0500 )

With your suggestion I get 100 ms per frame (~10 fps). That is far better, but not what I have to aim for: my supervisor wants at least ~25-30 fps. I've put your suggestions into the code in the question.

Vintez ( 2016-11-29 04:51:29 -0500 )
  • needs more profiling ;)
  • maybe already find keypoints/descriptors on half size ?
  • do you really need to track every frame ? (you just could skip some)
berak ( 2016-11-29 06:00:53 -0500 )
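The idea of skipping frames could look roughly like this. It is a minimal sketch, not code from the thread: `FrameSkipper` and `shouldProcess` are hypothetical names, and the interval is something to tune per device.

```cpp
#include <cassert>

// Hypothetical helper: lets only every `interval`-th frame through to the tracker.
class FrameSkipper {
public:
    explicit FrameSkipper(int interval) : interval_(interval) {
        assert(interval >= 1);
    }

    // Call once per incoming frame; returns true when this frame should be tracked.
    bool shouldProcess() {
        const bool process = (count_ % interval_ == 0);
        ++count_;
        return process;
    }

private:
    int interval_;
    long count_ = 0;
};
```

Frames for which `shouldProcess()` returns false would simply reuse the last known corners.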
  • profiling...? Seems my English is too bad here ;D
  • In the tracking phase I don't detect keypoints/descriptors; I just take them from the recognition phase. I'll add the recognition part to my question so you can see what I mean.
  • As I understand the ImageReader, I already do that: my ImageReader is bound to the preview surface and gets an Image for every available frame. On the first one, it starts an AsyncTask that does the processing. While that is running, the ImageReader still receives Images, which are closed immediately until my AsyncTask has finished.
Vintez ( 2016-11-29 06:15:29 -0500 )

profiling: taking times for some steps in your algo (see, how much the kp finding costs, vs the flow)

berak ( 2016-11-29 06:29:26 -0500 )
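One way to take times for individual steps is a small accumulator around `std::chrono::steady_clock`. The sketch below is an assumption about how this could be organised; `StageProfiler`, `measure`, `calls` and `meanMs` are hypothetical names, not part of OpenCV. The question's own code uses `cv::getTickCount()` for the same purpose.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper that accumulates wall-clock time per named stage.
class StageProfiler {
public:
    // Runs `fn` once and adds its duration in milliseconds to the total for `stage`.
    template <typename Fn>
    void measure(const std::string& stage, Fn&& fn) {
        assert(!stage.empty());
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        totalsMs_[stage] += std::chrono::duration<double, std::milli>(stop - start).count();
        ++calls_[stage];
    }

    // Number of times `stage` was measured.
    int calls(const std::string& stage) const {
        const auto it = calls_.find(stage);
        return it == calls_.end() ? 0 : it->second;
    }

    // Mean time per call in milliseconds; 0 if the stage was never measured.
    double meanMs(const std::string& stage) const {
        const int n = calls(stage);
        return n == 0 ? 0.0 : totalsMs_.at(stage) / n;
    }

private:
    std::map<std::string, double> totalsMs_;
    std::map<std::string, int> calls_;
};
```

Each step (keypoint detection, optical flow, homography) would be wrapped in its own `measure` call, and the means compared after a few dozen frames.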

Ah ok. As said, there is no detection while the optical flow tracker is active. Besides that, I tried to separate the measurements into logical parts and got the following results, with the code part marked beside each:

  • Packing YUV buffers into 1 byte[]: 22.91 ms
  • Reading the byte[] and converting to grayscale (similar to above but other enum) (c++): 23.94 ms
  • Use of calcOpticalFlowPyrLK (c++): 55.01 ms
  • findHomography (c++): 0.4 ms
  • perspectiveTransform (c++): 0.01 ms
  • Returning the result to java (c++): ~3-4 ms

The overall time for that run was 107 ms. The results are reproducible, with small variations, over multiple runs. I'm not quite sure, but it seems that the conversion and the optical flow calculation are the bottlenecks of the algorithm.

Vintez ( 2016-11-29 07:08:25 -0500 )
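To put the measured stage times next to the frame-rate target, a quick budget check helps. The sketch below uses the numbers from the list above; the `Stage` struct and function names are hypothetical, and 3.5 ms is an assumed midpoint for the "~3-4 ms" entry.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One measured pipeline stage (hypothetical type for this sketch).
struct Stage {
    std::string name;
    double ms;
};

// Stage timings from the measurement above.
// 3.5 ms is an assumed midpoint for the "~3-4 ms" entry.
std::vector<Stage> measuredStages() {
    return {
        {"pack YUV buffers", 22.91},
        {"convert to grayscale", 23.94},
        {"calcOpticalFlowPyrLK", 55.01},
        {"findHomography", 0.4},
        {"perspectiveTransform", 0.01},
        {"return result to Java", 3.5},
    };
}

// Sum of all stage times in milliseconds.
double totalMs(const std::vector<Stage>& stages) {
    double sum = 0.0;
    for (const Stage& s : stages) {
        sum += s.ms;
    }
    return sum;
}

// Time available per frame, in milliseconds, at the given frame rate.
double frameBudgetMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

At 25 fps the budget is 40 ms per frame. The two conversion steps alone add up to 46.85 ms, so with these numbers the target cannot be reached by speeding up the optical flow call only.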

I tried again to reduce the time needed by the tracking algorithm. Now I'm close to ~20 fps (in good cases!). I modified the parameters of calcOpticalFlowPyrLK and set the window size to cv::Size(8,8), which still gives good results and speeds up the algorithm a little. It seems that only my conversion could be optimized now, but I don't know how to achieve that. Could I resize the Mat before conversion, or would I lose too much information?

Vintez ( 2016-11-29 09:58:21 -0500 )
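One possible way to shrink before converting, assuming the camera frame arrives as YUV 4:2:0 with a separate Y plane: the Y plane already holds one 8-bit luminance value per pixel, so it can serve as the grayscale image, and it can be subsampled while it is copied. The sketch below is not from the thread; `GrayImage` and `subsampleLuma` are hypothetical names, a Y pixel stride of 1 is assumed, and plain nearest-neighbour decimation is used, which is fast but can alias compared with an area-averaging resize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for a tightly packed 8-bit grayscale image.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Copies every `step`-th pixel of every `step`-th row of the Y (luma) plane.
// Assumes a pixel stride of 1; `rowStride` is the number of bytes per row.
GrayImage subsampleLuma(const std::uint8_t* yPlane, int width, int height,
                        int rowStride, int step) {
    assert(yPlane != nullptr);
    assert(width > 0 && height > 0 && rowStride >= width && step >= 1);

    GrayImage out;
    out.width = (width + step - 1) / step;    // ceil(width / step)
    out.height = (height + step - 1) / step;  // ceil(height / step)
    out.pixels.reserve(static_cast<std::size_t>(out.width) * out.height);

    for (int y = 0; y < height; y += step) {
        const std::uint8_t* row = yPlane + static_cast<std::size_t>(y) * rowStride;
        for (int x = 0; x < width; x += step) {
            out.pixels.push_back(row[x]);
        }
    }
    return out;
}
```

With `step` set to 4 this yields the quarter-size grayscale frame in a single pass, without a full-resolution conversion followed by a resize. Whether too much information is lost depends on the scene, so tracking quality would need to be checked on real frames.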