Which object recognition algorithm should I use?

answered 2014-01-02 14:37:44 -0600

Guanta

I am not quite sure what an RC plane is or which objects you want to recognize. Anyway, some hints:

You can pass just a smaller region of interest (ROI) to any detector/descriptor method in which you want to find correspondences; that is no problem.
SIFT and SURF are both patented; furthermore, SIFT is one of the slowest keypoint-based methods.
I recommend using a binary descriptor, which allows fast matching (you can then use the Hamming distance) at a (typically) low computational cost, e.g. the detector/descriptor pairs ORB/ORB or BRISK/FREAK. Then you also don't need to compute an ROI, since these methods are typically fast enough to process the whole image.

However, note that all these keypoint-based methods work only for images or ROIs that have some structure. Plain objects with no edges won't be detectable, since no keypoints will be generated there, but I guess birds are no problem.
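A minimal sketch of the Hamming-distance matching mentioned above, in plain Python so it does not depend on any particular OpenCV version. The descriptors are toy 2-byte values (real ORB descriptors are 32 bytes), and the names `hamming`, `match_cross_check` and `max_distance` are made up for this illustration. In OpenCV, roughly the same idea is available as `cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)`.

```python
from typing import List, Tuple


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equally long binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_cross_check(query: List[bytes], train: List[bytes],
                      max_distance: int = 64) -> List[Tuple[int, int, int]]:
    """Brute-force matching that keeps (query_idx, train_idx, distance)
    only when both descriptors are each other's nearest neighbour."""
    if not query or not train:
        return []
    best_for_query = [min(range(len(train)), key=lambda j: hamming(q, train[j]))
                      for q in query]
    best_for_train = [min(range(len(query)), key=lambda i: hamming(query[i], t))
                      for t in train]
    matches = []
    for i, j in enumerate(best_for_query):
        d = hamming(query[i], train[j])
        if best_for_train[j] == i and d <= max_distance:
            matches.append((i, j, d))
    return matches


# Toy descriptors from a "previous" and a "current" frame (hypothetical data).
prev_desc = [bytes([0b11110000, 0b00001111]), bytes([0b10101010, 0b01010101])]
curr_desc = [bytes([0b10101010, 0b01010111]), bytes([0b11110001, 0b00001111])]

matches = match_cross_check(prev_desc, curr_desc, max_distance=4)
print(matches)  # [(0, 1, 1), (1, 0, 1)]
```

The cross-check is a design choice for this sketch: it discards one-sided matches, which tends to help when only a handful of keypoints are available.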


Comments

Those are RC planes: http://www.hobbytron.com/RCAirplanes.html But in my case it's a bit more "professional" :D! Thanks for the answer! If SURF is patented, how come it is implemented in an open-source library? I am not planning on commercial use of the software.

JAyThaRevo ( 2014-01-02 20:49:56 -0600 )

Well, with patents it's always somewhat complicated, and I just wanted to make sure that you are aware of it. Using them for research purposes is typically no problem, which is why SIFT and SURF are part of OpenCV, although they are placed in the nonfree module.

By the way, also have a look at OpenCV's cascade classifier, which could work for your problem as well; cascade classifiers are typically used for tasks such as pedestrian detection. Whether you go for a keypoint-based method (e.g. to track a certain object) or for a recognition algorithm that includes a classifier (and its a priori training) depends on whether you have a tracking or a recognition problem (the two could of course be combined) and on the type of objects you have (multiple objects of one class versus one object, etc.).

Guanta ( 2014-01-03 07:09:40 -0600 )

Thanks for your answer. After a short break I am back at this problem. I think that a classifier does not suit my problem. I have video footage, so you will be able to see what I poorly tried to describe: https://www.dropbox.com/s/gmjlqcnwq3tezos/sample.mp4 Since the plane will be seen from many different angles, it will be a huge task to train a classifier. What do you think now that you have seen the video? Thanks again for your help!

JAyThaRevo ( 2014-01-07 15:29:12 -0600 )

After watching the video I now understand your task. You basically have two problems here: 1. after finding the moving object, you want to classify whether it is an RC plane or not; 2. after that, you want to track it. Your question mainly concerned task 1. This looks to me like a standard classification problem, which you can solve by training a classifier with suitable feature vectors. If you are time-constrained and want a fast decision, I'd go for HOG as features and either a linear SVM or a boosting method as classifier. If you want to analyze the video later, so that time is available, you can also try other feature descriptors, e.g. an ensemble of local features (bag of (visual) words), or try out combinations of shape and color features.

Guanta ( 2014-01-08 05:01:30 -0600 )
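To make the "HOG features plus linear classifier" suggestion more concrete, here is a heavily simplified sketch in plain Python. It computes a single gradient-orientation histogram for a whole patch, which is only the core idea of HOG; a real HOG descriptor (for example OpenCV's `cv2.HOGDescriptor`) additionally uses cells, overlapping blocks and interpolation. The weights and bias are hypothetical placeholders, not a trained SVM.

```python
import math
from typing import List, Sequence


def orientation_histogram(patch: Sequence[Sequence[float]], bins: int = 9) -> List[float]:
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0..180 degrees), L2-normalised. Border pixels are skipped."""
    height, width = len(patch), len(patch[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def linear_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Decision value of a linear classifier: w . x + b (positive means 'object')."""
    return sum(f * w for f, w in zip(features, weights)) + bias


vertical_edge = [[0, 0, 1, 1]] * 4                              # intensity changes along x
horizontal_edge = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]          # intensity changes along y

# Hypothetical weights that respond only to the first orientation bin.
weights = [1.0] + [0.0] * 8
bias = -0.5

for name, patch in (("vertical", vertical_edge), ("horizontal", horizontal_edge)):
    score = linear_score(orientation_histogram(patch), weights, bias)
    print(name, "positive" if score > 0 else "negative")
```

In practice the weights would come from training on many positive and negative patches, as discussed in the thread.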

Thanks so much! So, do you suggest that I compare the HOG features from frame to frame, or that I use them to train the classifier? I think the latter will be rather difficult, as the size and direction of the plane constantly change. Then I still have my tracking problem. I want to build something like a trace, and the trace needs to reach a certain length before I want another event to be triggered (not CV related). Thanks again!

JAyThaRevo ( 2014-01-09 18:20:44 -0600 )
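The "trace" idea from this comment could be sketched as follows. This is only one possible reading: the trace is a list of per-frame object centroids, its length is the travelled distance in pixels, and it is dropped after a few frames without a detection. The class name `Trace` and the parameters `min_length_px` and `max_missed` are assumptions made for this example.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


class Trace:
    """Accumulates object centroids and reports when the path is long enough."""

    def __init__(self, min_length_px: float, max_missed: int = 3) -> None:
        self.min_length_px = min_length_px
        self.max_missed = max_missed
        self.points: List[Point] = []
        self.missed = 0

    def length(self) -> float:
        """Total travelled distance along the stored centroids, in pixels."""
        return sum((math.hypot(b[0] - a[0], b[1] - a[1])
                    for a, b in zip(self.points, self.points[1:])), 0.0)

    def update(self, centroid: Optional[Point]) -> bool:
        """Feed one frame's detection (None if the object was not found).
        Returns True once the trace has reached the required length."""
        if centroid is None:
            self.missed += 1
            if self.missed > self.max_missed:
                self.points.clear()   # object lost: start over
                self.missed = 0
            return False
        self.missed = 0
        self.points.append(centroid)
        return self.length() >= self.min_length_px


trace = Trace(min_length_px=10.0, max_missed=1)
detections = [(0, 0), (3, 4), None, (6, 8)]   # hypothetical centroids per frame
results = [trace.update(p) for p in detections]
print(results)  # [False, False, False, True]
```

The non-CV event would then be triggered the first time `update` returns True.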

Well, you could train a cascade classifier and then run it on every frame to see whether you find something. Alternatively, as soon as you have detected a moving object, you compute features from just the object and classify it. Of course, you need to train the classifier beforehand with several positives (take many variations) and negatives (you would have to do this with OpenCV's cascade-classifier framework as well).

Guanta ( 2014-01-10 03:48:11 -0600 )
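The second option (find the moving object first, then compute features only from that region) could look roughly like the sketch below. The thread does not say how the moving object is detected, so simple frame differencing is an assumption here, and frames are plain nested lists of grey values to keep the example self-contained. `moving_bbox`, `crop` and `threshold` are illustrative names.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]
BBox = Tuple[int, int, int, int]  # x, y, width, height


def moving_bbox(prev: Frame, curr: Frame, threshold: int = 25) -> Optional[BBox]:
    """Bounding box of all pixels whose grey value changed by more than
    `threshold` between two frames, or None if nothing moved."""
    xs, ys = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_prev, row_curr)):
            if abs(c - p) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


def crop(frame: Frame, bbox: BBox) -> Frame:
    """Cut the region of interest out of a frame."""
    x, y, w, h = bbox
    return [row[x:x + w] for row in frame[y:y + h]]


# Hypothetical 6x5 frames: an empty sky, then a bright 3x2 object appears.
prev_frame = [[0] * 6 for _ in range(5)]
curr_frame = [[0] * 6 for _ in range(5)]
for yy in (1, 2):
    for xx in (2, 3, 4):
        curr_frame[yy][xx] = 200

bbox = moving_bbox(prev_frame, curr_frame)
print(bbox)  # (2, 1, 3, 2)
roi = crop(curr_frame, bbox)   # only this patch is passed on to feature extraction
```

The cropped ROI is what the feature extractor and classifier would see, so the classifier never has to scan the whole image.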

Hey man! A little heads-up: https://www.youtube.com/watch?v=8Z2Ba4p83h8 I am not done yet, but that's the direction I am going. I am a little disappointed about the performance. This will never be usable for real-time applications :(. What I am trying to do here: 1) Extract features for every bounding rectangle using ORB (keypoints & descriptors). 2) Try to match every descriptor with the descriptors from the previous frame. 3) Once more than 2 consecutive matches have been found, I use a Cascade Classifier (Haar features) for object classification (Plane Yes/No?).

So, what do you think? Does that make sense?

JAyThaRevo ( 2014-01-20 18:49:55 -0600 )
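Step 3 of this pipeline, running the classifier only after more than two consecutive successful matches, can be sketched as a small gate. The class name and the `min_good_matches` parameter are assumptions; the per-frame match counts below are made-up numbers.

```python
class ConsecutiveMatchGate:
    """Opens once a candidate was matched in more than `required` consecutive frames."""

    def __init__(self, required: int = 2, min_good_matches: int = 1) -> None:
        self.required = required
        self.min_good_matches = min_good_matches
        self.streak = 0

    def update(self, good_matches: int) -> bool:
        """Feed the number of good descriptor matches for the current frame."""
        if good_matches >= self.min_good_matches:
            self.streak += 1
        else:
            self.streak = 0   # a frame without matches breaks the streak
        return self.streak > self.required


gate = ConsecutiveMatchGate(required=2)
good_matches_per_frame = [3, 2, 0, 4, 5, 3]   # hypothetical counts
decisions = [gate.update(n) for n in good_matches_per_frame]
print(decisions)  # [False, False, False, False, False, True]
```

Only in frames where the gate returns True would the (comparatively expensive) classifier be called.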

Hm, step 3 could be replaced by a single classifier call, i.e. training a complete cascade is not necessary. A cascade should be trained if you want to reject many features quickly, but since you have already found the object in question beforehand, this is IMHO not necessary. Thus, you can either classify using the contour (see e.g. Shotton et al., "Multiscale categorical object recognition using contour fragments") or compute features from the object and match them. For example, you could keep using the ORB descriptors, encode them in a bag-of-words (BoW) manner, and decide based on the BoW descriptor whether it is the object or not (or compute a HOG descriptor, or any other descriptor; many roads lead to Rome ;) ).

Guanta ( 2014-01-22 02:44:42 -0600 )
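A rough sketch of the bag-of-words encoding for binary descriptors: every descriptor is assigned to its nearest "visual word" (by Hamming distance), and the normalised word counts form the BoW descriptor that a classifier would then see. The 1-byte descriptors and the three-word vocabulary are toy values; a real vocabulary would be learned by clustering descriptors from training images.

```python
from typing import List


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equally long binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def bow_histogram(descriptors: List[bytes], vocabulary: List[bytes]) -> List[float]:
    """Assign each descriptor to its nearest visual word and return the
    L1-normalised word counts (all zeros if there are no descriptors)."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)), key=lambda k: hamming(d, vocabulary[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)


vocabulary = [b"\x00", b"\x0f", b"\xff"]              # hypothetical visual words
descriptors = [b"\x01", b"\x0e", b"\x1f", b"\xfe"]    # hypothetical patch descriptors

bow = bow_histogram(descriptors, vocabulary)
print(bow)  # [0.25, 0.5, 0.25]
```

The resulting fixed-length vector can be fed to any classifier, regardless of how many keypoints the patch produced.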

Hey! Me again... So I tried a lot in the past week and found out that the keypoints extracted from these small patches cannot be matched reliably. I would get 6 keypoints from one image and 5 keypoints from the same section only one frame later, so there is only a minor change. But I would only get 2 or 3 good matches. On the other hand, comparing two completely different patches might give me 6/6 good matches. So I think I have to discard this approach. I will now try a BoW, shape detection or template matching approach. What do you think?

JAyThaRevo ( 2014-02-01 03:03:59 -0600 )

BoW is only useful if you have more than just 6 features; maybe you could sample features densely to get more. Template matching is easy to apply, but difficult to make rotation and scale invariant. Shape matching could be worth trying.

Guanta ( 2014-02-01 07:23:21 -0600 )
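Dense sampling simply means placing keypoints on a regular grid instead of relying on a detector. A minimal sketch, where `step` and `margin` are illustrative parameters; in OpenCV the grid positions could presumably be wrapped in `cv2.KeyPoint` objects and handed to a descriptor extractor's `compute` method.

```python
from typing import List, Tuple


def dense_grid(width: int, height: int, step: int = 4, margin: int = 0) -> List[Tuple[int, int]]:
    """Regular grid of (x, y) sampling positions inside a patch,
    keeping `margin` pixels away from the border."""
    return [(x, y)
            for y in range(margin, height - margin, step)
            for x in range(margin, width - margin, step)]


# A hypothetical 20x12 pixel patch.
coarse = dense_grid(20, 12, step=4, margin=2)
fine = dense_grid(20, 12, step=2, margin=2)
print(len(coarse), len(fine))  # 8 32
```

A smaller step yields more descriptors per patch, at the cost of more computation and more redundancy between neighbouring descriptors.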
