
Results of object detection using SURF+homography

asked 2014-01-19 23:51:11 -0500

Steve7

updated 2014-01-19 23:51:52 -0500

Hi, I was trying to detect an object in a video feed. So far I have explored template matching, but the results weren't satisfactory because the video feed may or may not contain the object, and the object is subject to rotation and varying scale. Furthermore, I have no time to train a cascade classifier or the like, so I came across the SURF+homography approach from this link:

The results were good, but all I want from my program is an answer to "is the object currently present in the video?" (Present/Absent). How can I get this answer using the SURF+homography approach?



Could you explain why training a cascade isn't possible? A basic cascade takes only an hour or so to train if you use LBP features, and it could already provide some good results.

StevenPuttemans ( 2014-01-20 04:21:39 -0500 )

@StevenPuttemans Oh, I didn't realize it would take so little time; from what I read, it takes hours, even days, to train a classifier. Would you mind pointing me to a tutorial on training an LBP cascade in a short time? Thanks a lot.

Steve7 ( 2014-01-20 04:36:44 -0500 )

That is the case if your data contains 5000 positives and 5000 negatives and you use, for example, Haar wavelets. Just start with 500 positives and 500 negatives, don't set your required precision too high, and it will train quite fast. I always use LBP for prototyping.

As for examples, just search the forum; it is filled with topics concerning cascade classifier training. The official one is found here. Just select the correct features!

StevenPuttemans ( 2014-01-20 04:43:24 -0500 )

In fact, this is one of the reasons I chose not to use a classifier: the effort of locating "good" positive and negative samples. Anyway, thanks for the suggestion; I'll try it if I have more time.

Steve7 ( 2014-01-20 05:02:24 -0500 )

Yeah, that is indeed the bottleneck of algorithms that rely on these positive and negative training sets...

StevenPuttemans ( 2014-01-20 05:59:05 -0500 )

3 answers


answered 2014-01-20 00:30:08 -0500

Nghia

updated 2014-01-20 03:26:25 -0500

Updated based on Moster's answer. I totally forgot about the output mask.

Change this line

Mat H = findHomography( obj, scene, CV_RANSAC );

to

vector<uchar> mask;
Mat H = findHomography( obj, scene, CV_RANSAC, 3, mask );
int good_matches = accumulate(mask.begin(), mask.end(), 0);

Threshold on good_matches to determine if you have an object in the scene. You might need to add #include <numeric> for the accumulate function.
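
For readers new to OpenCV, the thresholding step can be sketched in plain C++ with no OpenCV calls. This is only an illustration: it assumes the mask is the vector of uchar values filled by findHomography, with nonzero entries marking inliers. The helper names countInliers and isObjectPresent are hypothetical, and the minimum inlier count has to be tuned on your own footage.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Count the matches that the robust estimator kept as inliers.
// Assumption: nonzero mask entries mark inliers.
std::size_t countInliers(const std::vector<unsigned char>& mask) {
    return static_cast<std::size_t>(
        std::count_if(mask.begin(), mask.end(),
                      [](unsigned char m) { return m != 0; }));
}

// Hypothetical decision helper: report "present" when enough matches
// agree with the homography. minInliers is a tuning parameter.
bool isObjectPresent(const std::vector<unsigned char>& mask,
                     std::size_t minInliers) {
    return countInliers(mask) >= minInliers;
}
```

In a video loop you would call isObjectPresent(mask, N) once per frame, right after findHomography, and print Present or Absent accordingly.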



Hi, thanks for the reply. I should've mentioned that I am really new to OpenCV and I don't really understand your approach. Would you mind giving me some code examples?

Steve7 ( 2014-01-20 01:45:15 -0500 )

Hi, I tried out your code, but this line Mat H = findHomography( obj, scene, CV_RANSAC, 3, mask ); gives the error "OpenCV Error: Assertion failed (mtype == type0 || (CV_MAT_CN(mtype) == CV_MAT_CN(type0) && ((1 << type0) & fixedDepthMask) != 0)) in create, file C:/slave/builds/WinInstallerMegaPack/src/opencv/modules/core/src/matrix.cpp, line 1486".

So I swapped char for uchar and the error went away. Am I supposed to do this?

Steve7 ( 2014-01-20 02:52:57 -0500 )

uchar is correct

Nghia ( 2014-01-20 03:25:48 -0500 )

answered 2014-01-20 02:00:34 -0500

Moster

updated 2014-01-20 02:00:59 -0500

You could also use the output mask that findHomography creates when you use RANSAC or LMedS inside it. This is what the OpenCV documentation says about it: The best subset is then used to produce the initial estimate of the homography matrix and the mask of inliers/outliers.

So you could take this mask and count the inliers and outliers. Then you would check, for example: if inliers > 50 and outliers/inliers < 0.5, then the object is found. Those numbers are arbitrary; you need to find suitable values on your own through testing.
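
The combined check described above could look roughly like the following plain C++ sketch. It assumes the mask is a vector of uchar values from findHomography with nonzero entries marking inliers; the function name objectFound and the default thresholds (50 and 0.5, taken from the example numbers above) are placeholders rather than recommended values.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical decision rule combining an absolute inlier count with an
// outlier-to-inlier ratio. Both thresholds are placeholders to tune.
bool objectFound(const std::vector<unsigned char>& mask,
                 std::size_t minInliers = 50,
                 double maxOutlierRatio = 0.5) {
    std::size_t inliers = 0;
    for (unsigned char m : mask) {
        if (m != 0) ++inliers;  // assumption: nonzero means inlier
    }
    const std::size_t outliers = mask.size() - inliers;

    // Too few consistent matches; this also rules out dividing by zero below.
    if (inliers <= minInliers) return false;

    const double ratio =
        static_cast<double>(outliers) / static_cast<double>(inliers);
    return ratio < maxOutlierRatio;
}
```

The count guards against a homography fitted to a handful of accidental matches, while the ratio guards against scenes where most matches disagree with the fitted model.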



Can you provide some code examples showing how to retrieve the number of outliers/inliers from the output mask? What should the data type of the output mask be?

Steve7 ( 2014-01-20 03:03:29 -0500 )

The code that Nghia posted is good, just with uchar as you mentioned.

Moster ( 2014-01-20 03:13:50 -0500 )

answered 2014-01-20 04:50:23 -0500

JohannesZ

You could also try the following strategy:

  1. Extract the features
  2. Match the features (via FLANN, for most cases the fastest choice)
  3. Compute a homography via RANSAC
  4. Use the homography to warp the detected object from your scene image into the frame of your pattern image. If you do it right, both images will have the same dimensions.
  5. Use a similarity measure to compare the warped image with the pattern image. For example, a normalized cross-correlation will do fine, or a normalized correlation coefficient.
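
The similarity measure in step 5 can be sketched in plain C++ as follows. In OpenCV the warp would typically be done with warpPerspective and the comparison with matchTemplate; the version below is only an illustration of the normalized correlation coefficient itself. It assumes both images are grayscale, already warped to the same size, and stored as flat pixel arrays; the function name normalizedCorrelation is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Normalized correlation coefficient between two equally sized grayscale
// images stored as flat pixel arrays. The result lies in [-1, 1];
// 0 is returned when either image has no intensity variation.
double normalizedCorrelation(const std::vector<unsigned char>& a,
                             const std::vector<unsigned char>& b) {
    if (a.size() != b.size() || a.empty()) {
        throw std::invalid_argument("images must be non-empty and equal in size");
    }

    // Mean intensity of each image.
    const double n = static_cast<double>(a.size());
    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        meanA += a[i];
        meanB += b[i];
    }
    meanA /= n;
    meanB /= n;

    // Cross term and variances of the mean-subtracted images.
    double cross = 0.0, varA = 0.0, varB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - meanA;
        const double db = b[i] - meanB;
        cross += da * db;
        varA += da * da;
        varB += db * db;
    }

    const double denom = std::sqrt(varA * varB);
    return denom > 0.0 ? cross / denom : 0.0;
}
```

A score close to 1 suggests the warped region really shows the pattern; the cutoff for "present" is again something to determine by testing on your own data.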




Asked: 2014-01-19 23:51:11 -0500

Seen: 1,199 times

Last updated: Jan 20 '14