TrackWithOpticalFlow Alternatives
EDIT: Added berak's suggestion of downscaling/upscaling. EDIT: Added further changes, inspired by Tetragram.
Inspired by the MathWorks tutorial for augmented reality, I wanted to create a similar application for Android in which both recognition and tracking are implemented.
After some research I found that the pointTracker mentioned by MathWorks uses the KLT algorithm, which is also implemented in OpenCV as calcOpticalFlowPyrLK.
I implemented the algorithm so that it takes the last frame, in which I recognized my points, and tries to estimate their new positions in the current frame with this method:
int BruteForceMatcher::trackWithOpticalFlow(std::vector<cv::Mat> prevPyr, std::vector<cv::Mat> nextPyr,
                                            std::vector<cv::Point2f> &srcPoints, std::vector<cv::Point2f> &srcCorners)
{
    std::vector<cv::Point2f> estPoints;
    std::vector<cv::Point2f> estCorners;
    std::vector<cv::Point2f> goodPoints;
    std::vector<cv::Point2f> leftsrc;
    std::vector<uchar> status;
    std::vector<float> error;

    if (srcPoints.empty())
        return 0;

    cv::calcOpticalFlowPyrLK(prevPyr, nextPyr, srcPoints, estPoints, status, error);

    // keep only well-tracked points and upscale them back to full resolution
    for (size_t i = 0; i < estPoints.size(); i++) {
        if (status[i] && error[i] < 20.f) {
            goodPoints.push_back(estPoints[i] * 4);
            leftsrc.push_back(srcPoints[i] * 4);
        }
    }

    //LOGD("Left points (est/src): %zu, %zu", goodPoints.size(), leftsrc.size());
    if (goodPoints.size() < 4) {
        //LOGD("Not enough good points for a homography");
        return 0;
    }

    cv::Mat f = cv::findHomography(leftsrc, goodPoints);
    if (f.empty()) {
        //LOGD("Homography matrix is empty!");
        return 0;
    }

    cv::perspectiveTransform(srcCorners, estCorners, f);
    srcCorners.swap(estCorners);
    srcPoints.swap(goodPoints);
    return (int) srcPoints.size();
}
And the method that is called through JNI:
std::vector<cv::Point2f> findBruteForceMatches(cv::Mat img)
{
    int matches = 0;
    std::vector<cv::Point2f> ransacs;
    BruteForceMatcher bruteForceMatcher;
    double tf = cv::getTickFrequency();

    if (trackKLT) {
        LOGD("TRACK WITH KLT");
        std::vector<cv::Mat> nextPyr;
        // downscale the frame by 4 before building the pyramid
        cv::resize(img, img, cv::Size(img.cols / 4, img.rows / 4));
        cv::buildOpticalFlowPyramid(img, nextPyr, cv::Size(8, 8), 3);

        double kltTime = (double) cv::getTickCount();
        matches = bruteForceMatcher.trackWithOpticalFlow(prevPyr, nextPyr, srcPoints, scene_corners);
        kltTime = (double) cv::getTickCount() - kltTime;
        LOGD("KLT track time: %f\n", kltTime * 1000. / tf);

        if (matches > 10) {
            trackKLT = true;
            prevPyr.swap(nextPyr);  // keep the current pyramid as "previous" for the next frame
        } else {
            trackKLT = false;
            prevPyr.clear();
            srcPoints.clear();
            scene_corners.clear();
        }
        return scene_corners;
    } else {
        LOGD("RECOGNIZE OBJECT");
        double bfMatchTime = (double) cv::getTickCount();
        matches = bruteForceMatcher.findMatchesBF(img, features2d, descriptors, scene_corners, ransacs);
        bfMatchTime = (double) cv::getTickCount() - bfMatchTime;
        LOGD("BruteForceMatch time: %f\n", bfMatchTime * 1000. / tf);

        if (matches > 3) {
            trackKLT = true;
            cv::resize(img, img, cv::Size(img.cols / 4, img.rows / 4));
            cv::buildOpticalFlowPyramid(img, prevPyr, cv::Size(8, 8), 3);
            // downscale the inlier points to match the quarter-size frame
            for (size_t i = 0; i < ransacs.size(); i++)
                ransacs[i] *= 0.25;
            srcPoints.swap(ransacs);
        } else {
            scene_corners.clear();
            ransacs.clear();
        }
        return scene_corners;
    }
}
Unfortunately, this method takes about 200 ms per frame (~5 fps), which is too slow for my application. Is there another, similar algorithm that could track a couple of points in an image? Or is there a way to speed up my algorithm?
In a paper I read that they ...
resizing by 2 will make it 4 times faster
how do you get your src points?
In the recognition phase. There I detect keypoints with FAST from the current blurred frame and compute the descriptors of ~100 keypoints. After that, it matches them with the OpenCV BruteForceMatcher and removes outliers through distance and RANSAC. The remaining points are passed to a global variable, srcPoints. Regarding your comment: with resizing, do you mean the frame and the prevFrame? After the calculation I would need to transform the coordinates back, or not?
In theory, there is an overload of calcOpticalFlowPyrLK that accepts image pyramids (so you only have to calculate a new pyramid for the current frame, not for both), but no idea if that works in Java.
yes, you have to downscale both frames and input points, and later upscale the output points again.
Actually, I don't build an image pyramid; I understood the method as taking either an 8-bit image or an image pyramid, so I don't build a pyramid for both images. Also, I use native C++ on Android; Java is only involved to pass the frame and receive the calculated scene_corners.
I'm also not quite sure how to downscale the input points and upscale the output again. Already found that. With your suggestion I get a result of 100 ms (~10 fps), far better, but not what I have to aim for; my supervisor wants at least ~25-30 fps. I put your suggestions into the code of the question.
profiling: take times for some steps in your algorithm (see how much the keypoint finding costs vs. the flow)
Ah ok, as said, there is no detection while the optical flow tracker is active. Besides that, I tried to separate the measurements into logical parts and got the following results, with the code part marked beside each:
calcOpticalFlowPyrLK (C++): 55.01 ms
findHomography (C++): 0.4 ms
perspectiveTransform (C++): 0.01 ms
The overall time for that case was 107 ms. The results reproduce, with small variations, over multiple runs. I'm not quite sure, but it seems that the conversion and the calculation are the bottlenecks of the algorithm.
Well, again I tried to reduce the time needed by the tracking algorithm; now I'm close to ~20 fps (in good cases!). I modified the parameters of calcOpticalFlowPyrLK and set the window size to cv::Size(8,8), which still gives good results and accelerates the algorithm a little. It seems that only my conversion could be optimized now, but I don't know how I could achieve that. Could I resize the Mat before conversion? Or would I lose too much information?