key-point based detection vs HoG/Haar training

asked 2014-03-14 05:00:41 -0600

sy456

Hi,

I need to decide between keypoint-based detection and HoG/Haar training. Sorry in advance for the long question, but I am really stuck!

So far, I have been trying to use SIFT, SURF and other keypoint-based feature extraction methods to detect and track vehicles, pedestrians, traffic signs and lanes. I have to detect all of these objects at the same time, with a moving camera. I apply the following approach to two consecutive frames to analyse the movement of the keypoints:

detect features --> describe features --> match features between frames --> filter matches
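
The last two steps of this pipeline (match, then filter) can be sketched without any library. This is only an illustration: the 3-D tuples stand in for real SIFT/SURF descriptors, the function name `match_with_ratio_test` and the `ratio=0.75` threshold are assumptions, and the filter shown is the common nearest-neighbour ratio test, which may or may not be the filter used in the pipeline above.

```python
import math

def match_with_ratio_test(desc_a, desc_b, ratio=0.75):
    """Return (index_a, index_b) pairs that pass the nearest-neighbour ratio test."""
    matches = []
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every descriptor in the other frame,
        # sorted so that the two nearest neighbours come first.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is clearly closer than the runner-up.
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy descriptors (hypothetical values) for two consecutive frames.
frame1 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.0)]
frame2 = [(0.0, 0.9, 0.1), (0.9, 0.1, 0.0), (0.0, 0.0, 1.0)]

good = match_with_ratio_test(frame1, frame2)
print(good)
```

The third descriptor in `frame1` sits almost equally close to two candidates in `frame2`, so the ratio test drops it as ambiguous and only the two unambiguous matches remain.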

After that, I want to group the features onto the cars, pedestrians, traffic signs and lanes. I think there should be some way to achieve this. I need to do the data reduction inside the camera, because HD cameras produce large data streams. I thought that with this approach I could create a cheap vision pipeline without using any trained data.
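
One naive way to picture the grouping step is to cluster matched keypoints by how similarly they move between the two frames and then take a bounding box per cluster. The sketch below is only that: the names `group_by_motion` and `bounding_box`, the tolerance `tol`, and the coordinates are all hypothetical. It groups by motion alone, so it says nothing about what class an object belongs to, and with a moving camera the background forms a group of its own.

```python
import math

def group_by_motion(matches, tol=2.0):
    """Greedily group matched keypoints whose displacement vectors are similar."""
    groups = []  # each group holds a reference flow vector and its member points
    for (x1, y1), (x2, y2) in matches:
        flow = (x2 - x1, y2 - y1)
        for g in groups:
            # Join the first group whose reference flow is within tol pixels.
            if math.dist(flow, g["flow"]) <= tol:
                g["points"].append((x2, y2))
                break
        else:
            # No similar group found: start a new one with this flow as reference.
            groups.append({"flow": flow, "points": [(x2, y2)]})
    return groups

def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around a set of points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical matches: ((x, y) in frame 1, (x, y) in frame 2).
matches = [
    ((10, 10), (15, 10)), ((12, 14), (17, 14)), ((20, 11), (25, 12)),  # moving right
    ((100, 50), (98, 50)), ((105, 55), (103, 56)),                     # moving left
]

boxes = [bounding_box(g["points"]) for g in group_by_motion(matches)]
print(boxes)
```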

However, when I read research papers and talk to experts, I see that if you want to draw a bounding box around an object, you mostly need trained data. Most people compute HoG or Haar features and feed them to a classifier (SVM/cascade) trained for specific objects. Why are HoG and Haar preferred by the community over SIFT or SURF? I cannot convince myself to switch to HoG or Haar!

Also, people use different detectors for each type of object, for instance HoG + SVM for pedestrian detection, Haar + cascade for vehicle detection, and edge detection + Hough transform + line fitting for lanes. What I want to do is find the commonalities and variabilities of these different pipelines and (if possible) come up with a pipeline that is as generic as possible.

Any advice or pointers to resources?
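
To make the HoG side of these pipelines more concrete, here is a minimal sketch of the core idea: a histogram of gradient orientations over one cell. It is a simplification under stated assumptions, not the full descriptor. Real HoG implementations also interpolate votes between bins and normalise over blocks of cells, and the function name `orientation_histogram` and the 8x8 toy cell are made up for illustration.

```python
import math

def orientation_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram (0 to 180 degrees) for one cell."""
    hist = [0.0] * bins
    h, w = len(cell), len(cell[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradients.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its bin, weighted by gradient magnitude.
            hist[int(angle // (180.0 / bins)) % bins] += mag
    return hist

# 8x8 cell with a single vertical edge: dark on the left, bright on the right.
cell = [[0] * 4 + [255] * 4 for _ in range(8)]

hist = orientation_histogram(cell)
print(hist)
```

All of the gradient energy lands in the first bin (horizontal gradient, i.e. a vertical edge), so the cell is described by its outline even though it contains no texture.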

Regards


answered 2014-03-14 15:58:15 -0600

yes123

updated 2014-03-14 19:56:26 -0600

SURF, SIFT, etc. have a big issue: texture-less objects can't be detected with them, because no keypoints will be found on them. Pedestrians are, most of the time, best described by their shape and, depending on the camera quality/speed, you will find only very low-quality keypoints on them.

But if you need to detect textured objects (especially if you have good-quality video sequences), keypoint-based techniques are generally better. There is a lot of research on the topic (including keypoint-based methods with online learning); for example, you can watch a tracking algorithm here: Matrioska: Tracking By detection using Keypoints
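
The difference between textured and texture-less regions can be illustrated with a small corner-response sketch. Note the swap: this uses a Harris-style response rather than the actual SIFT or SURF detectors, and the function name `count_corners`, the threshold, and the toy images are assumptions chosen for illustration.

```python
def count_corners(img, k=0.04, window=1, threshold=1e6):
    """Count pixels whose Harris-style corner response exceeds a threshold."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    corners = 0
    for y in range(window, h - window):
        for x in range(window, w - window):
            # Structure tensor summed over a small square window.
            sxx = syy = sxy = 0.0
            for dy in range(-window, window + 1):
                for dx in range(-window, window + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            # Harris response: det(M) - k * trace(M)^2.
            response = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
            if response > threshold:
                corners += 1
    return corners

flat = [[128] * 16 for _ in range(16)]                    # uniform surface
edge = [[0] * 8 + [255] * 8 for _ in range(16)]           # one straight outline
checker = [[255 * (((x // 4) + (y // 4)) % 2) for x in range(16)]
           for y in range(16)]                            # textured pattern

print(count_corners(flat), count_corners(edge), count_corners(checker))
```

The uniform patch and the single straight edge produce no corner responses at all, while the checkerboard produces many, which is roughly why a detector of this kind has little to hold on to when an object is defined mostly by its outline.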

Comments

Thank you for the answer and the link. What do you mean by texture-less objects?

sy456 (2014-03-16 14:37:18 -0600)

A textureless object is an object without strong edges inside it. Think of a sheet of white paper on a desk. Google for more information: https://www.google.com/search?q=textureless+objec#q=textureless+object&spell=1

yes123 (2014-03-16 15:19:11 -0600)
