Let me combine my remark and the other answers into a possible solution.

Rotation invariance can be handled quickly by running the detector over multiple rotations of the image. I did this with an LBP classifier and got rotation-invariant car detection on a 4000x8000 window in about 2 minutes, provided you know the scale range (the height of the camera, on a plane in my case). If it can be done in postprocessing, then Haar/LBP classifiers could get you to a solution. The trick lies in finding enough decent training data and enough negatives to train a robust classifier. Using bootstrapping can improve the performance of your classifier a lot.
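
Each rotated copy of the image gives detections in its own coordinate frame, so they have to be mapped back to the original image before they can be compared. A minimal sketch of that mapping, assuming the image is rotated about its center on a same-size canvas. `rotate_point` and `to_original` are illustrative names, not OpenCV functions:

```python
import math


def rotate_point(x, y, angle_deg, cx, cy):
    """Rotate (x, y) about the center (cx, cy) by angle_deg degrees."""
    t = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(t) - dy * math.sin(t),
            cy + dx * math.sin(t) + dy * math.cos(t))


def to_original(x, y, angle_deg, cx, cy):
    """Undo the rotation applied to the image (assumed convention:
    the image was rotated with rotate_point's sign of angle_deg)."""
    return rotate_point(x, y, -angle_deg, cx, cy)


# Center of a detection found in the copy rotated by 30 degrees,
# for a 200x100 image whose center is (100, 50).
x_rot, y_rot = rotate_point(140, 60, 30, 100, 50)
x_orig, y_orig = to_original(x_rot, y_rot, 30, 100, 50)
```

If your rotation routine uses the opposite sign convention or enlarges the canvas, the mapping needs to be adjusted accordingly.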

As to the other questions:

In my experience LBP/Haar models tolerate about ±10 degrees of rotation while still detecting an object robustly and still overlapping enough with the next angle.
Therefore I separated the 360 degrees into 20-degree bins.
Once an object is found over multiple angles I always choose the most central angle, which of course had the best score on that object. You could average the angles instead, but that gave worse results for me. Picking the median was the best approach.
So basically rotating 36 times, limiting your scale range and merging detections afterwards can get you pretty far.
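
The sweep and merge steps can be sketched roughly as follows. This is only an illustration under a few assumptions: detections are `(cx, cy, angle)` tuples already expressed in original-image coordinates, a 10-degree step is what gives the 36 rotations (360 / 36), and hits whose centers lie within `max_dist` pixels belong to the same object. All function names and the `max_dist` value are hypothetical.

```python
import math


def sweep_angles(step_deg=10):
    """Rotation angles for the sweep; a 10-degree step gives 36 rotations."""
    return list(range(0, 360, step_deg))


def median_angle(angles):
    """Most central angle of a run of hits, handling wrap-around at 360."""
    a = sorted(x % 360 for x in angles)
    if len(a) == 1:
        return a[0]
    # The run of hits starts right after the largest circular gap.
    gaps = [((a[(i + 1) % len(a)] - a[i]) % 360, i) for i in range(len(a))]
    _, i = max(gaps)
    run = a[i + 1:] + a[:i + 1]
    return run[len(run) // 2]  # upper median for even-length runs


def merge_detections(detections, max_dist=20.0):
    """Greedy grouping by center distance, then keep the median-angle hit."""
    groups = []
    for cx, cy, angle in detections:
        for g in groups:
            gx, gy, _ = g[0]
            if math.hypot(cx - gx, cy - gy) <= max_dist:
                g.append((cx, cy, angle))
                break
        else:
            groups.append([(cx, cy, angle)])
    merged = []
    for g in groups:
        best = median_angle([ang for _, _, ang in g])
        merged.append(next(d for d in g if d[2] % 360 == best))
    return merged


hits = [(100, 100, 350), (102, 101, 0), (101, 99, 10), (400, 300, 90)]
merged = merge_detections(hits)
```

In OpenCV the scale limit would typically go into the `minSize` and `maxSize` arguments of `CascadeClassifier.detectMultiScale`, so the detector does not waste time on sizes a car cannot have.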

And skip color info; there is too much variance across different car types. Also, an object model looks for general features like windows, wheels, front, back, ... and those can only be learned if you remove car-specific info like color and move to a histogram-equalized grayscale image.
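
To show what that preprocessing does, here is a small pure-Python sketch. In practice you would more likely call `cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)` followed by `cv2.equalizeHist(gray)`. The helper names below are made up for illustration, and the image is assumed to be a list of rows of 8-bit BGR pixels.

```python
def to_gray(pixel):
    """Luma of one BGR pixel using the usual 0.299/0.587/0.114 weights."""
    b, g, r = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def equalize(gray):
    """Histogram-equalize a grayscale image given as rows of values 0..255."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one gray level present: nothing to stretch.
        return [row[:] for row in gray]
    # Map each level so the cumulative distribution becomes roughly linear.
    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lut[v] for v in row] for row in gray]


bgr = [[(10, 10, 10), (20, 20, 20)],
       [(30, 30, 30), (40, 40, 40)]]
gray = [[to_gray(p) for p in row] for row in bgr]
equalized = equalize(gray)
```

The low-contrast input ends up spread over the full 0..255 range, which is the point: differences in paint color and lighting are flattened so the classifier sees structure instead.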
