
HOG optimal training images

asked 2017-03-21 08:51:23 -0600 by zelade

Hello,

I'm going to train a HOG descriptor on traffic signs, and I wonder which pictures are best suited. How many pictures do I need for good results? What size should the images have, and what influence does the size have on later detection? Is it good to leave a narrow border around the positive images, so that some background remains visible? Is anyone familiar with this? I would like to estimate this before I take the photos.

Thanks in advance.


Comments


please clarify, do you want to:

  • train a HOGDescriptor to detect a single class of traffic signs? (like stop or speed-limit)
  • or do you want to build a classifier to distinguish between several signs? (e.g. use HOG features with an SVM or ANN)
berak ( 2017-03-21 11:14:28 -0600 )

I want to distinguish between several signs. I'm going to train with SVMlight and pass the resulting vector to detectMultiScale. In my implementation I would use one detectMultiScale call for each sign, but I wonder whether that still runs in real time. What's the difference between the two options? Or, put another way, how does the second differ from mine?

I just made a test with one street sign and it worked quite well. I used images of size 48 x 48 with a narrow border. I used 180 positive and 4000 negative images. I have found that close signs are not recognized; is this due to the image size?

zelade ( 2017-03-22 07:11:18 -0600 )

detectMultiScale can only be used to detect a single object class

if you use an SVM as a multi-class classifier, you will need some other tool for detection / segmentation, e.g. findContours()

berak ( 2017-03-22 07:23:11 -0600 )

Okay, I'll look at this. But if I use it as a multi-class classifier, the training process differs from my variant, right?

So if I understand you correctly, the first step would be to find the contours of the signs in the frame. Then, in the second step, I calculate the HOG features of those contours (bounding rectangles) and use predict() to classify the sign?

zelade ( 2017-03-22 07:55:12 -0600 )

2 answers


answered 2017-03-22 08:04:27 -0600 by berak, updated 2017-03-22 08:19:08 -0600

ok, let's summarize the 2 approaches.

if you want to detect a single object class (detectMultiScale):

  • train:
  • crop all positive and negative images to the same window size (e.g. 24x24). this is the minimum size that can be detected later
  • use train_HOG.cpp (from the samples) to train an SVM (regression), and save it
  • detect:
  • load the single, pretrained SVM support vector into the HOGDescriptor
  • detectMultiScale on an (arbitrarily sized) grayscale image
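A minimal Python sketch of this first path could look like the following; the HOG parameters, the SVM settings and the image lists are assumptions for illustration, not part of the answer:

    import cv2
    import numpy as np

    WIN_SIZE = (24, 24)   # assumed training window = minimum detectable size
    hog = cv2.HOGDescriptor(WIN_SIZE, (8, 8), (4, 4), (4, 4), 9)

    def hog_row(img):
        # HOGDescriptor.compute() returns a column vector; flatten to one row per sample
        return hog.compute(cv2.resize(img, WIN_SIZE)).ravel()

    def train_detector(pos_imgs, neg_imgs):
        # pos_imgs / neg_imgs: lists of grayscale crops
        samples = np.float32([hog_row(i) for i in pos_imgs + neg_imgs])
        labels = np.float32([1] * len(pos_imgs) + [-1] * len(neg_imgs))
        svm = cv2.ml.SVM_create()
        svm.setType(cv2.ml.SVM_EPS_SVR)   # regression, as in train_HOG.cpp
        svm.setKernel(cv2.ml.SVM_LINEAR)
        svm.setC(0.01)
        svm.setP(0.1)
        svm.train(samples, cv2.ml.ROW_SAMPLE, labels)
        return svm

    def svm_to_detector(svm):
        # pack the compressed support vector plus -rho into the single vector
        # that setSVMDetector() expects (same trick as in train_HOG.cpp)
        sv = svm.getSupportVectors()
        rho, _, _ = svm.getDecisionFunction(0)
        return np.append(sv.ravel(), -rho).astype(np.float32)

    # detection on an arbitrarily sized grayscale frame:
    # hog.setSVMDetector(svm_to_detector(svm))
    # rects, weights = hog.detectMultiScale(frame, winStride=(4, 4), scale=1.05)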

if you want to classify multiple traffic signs:

  • train:
  • crop all images to the same window size (e.g. 24x24) (same as above)
  • get HOG descriptors for each, reshape them to a single row, and push_back all of them into a single large Mat. you also need a "labels" Mat, containing the class id for each descriptor
  • train an SVM (or ANN, or KNN) with this data and the labels (classification)

  • test:

  • find contours in the large image
  • get the boundingRect() of each contour
  • get an image ROI (crop) of that rect, and resize it to the HOG window size you trained your SVM on
  • get the HOG feature from that ROI
  • predict() with that HOG feature
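And a minimal Python sketch of this second path; again the HOG parameters, the Canny thresholds and the contour filtering are assumptions, only meant to illustrate the steps above:

    import cv2
    import numpy as np

    WIN_SIZE = (24, 24)                     # assumed training window size
    hog = cv2.HOGDescriptor(WIN_SIZE, (8, 8), (4, 4), (4, 4), 9)

    def hog_row(img):
        # one HOG descriptor per sample, reshaped to a single row
        return hog.compute(cv2.resize(img, WIN_SIZE)).reshape(1, -1)

    def train_classifier(crops, class_ids):
        # crops: grayscale sign images, class_ids: one integer label per crop
        data = np.float32(np.vstack([hog_row(c) for c in crops]))
        labels = np.int32(class_ids)
        svm = cv2.ml.SVM_create()
        svm.setType(cv2.ml.SVM_C_SVC)       # classification (multi-class)
        svm.setKernel(cv2.ml.SVM_LINEAR)
        svm.train(data, cv2.ml.ROW_SAMPLE, labels)
        return svm

    def classify_signs(svm, gray):
        # crude segmentation: contours -> bounding boxes -> classify each box
        edges = cv2.Canny(gray, 80, 160)
        contours = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)[-2]  # works for 3.x / 4.x
        results = []
        for c in contours:
            x, y, w, h = cv2.boundingRect(c)
            if w < 20 or h < 20:            # skip tiny contours (assumed threshold)
                continue
            roi = gray[y:y + h, x:x + w]
            _, pred = svm.predict(hog_row(roi))
            results.append(((x, y, w, h), int(pred[0, 0])))
        return results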

Comments

Thanks for the detailed answer!

zelade ( 2017-03-22 08:32:06 -0600 )

How do you label the negative images?

zelade ( 2017-03-22 10:43:40 -0600 )

Hi, is there any function like "predict" for the GPU (CUDA C++)? If not, what are the alternatives? How can I implement "predict" on the GPU? Thank you!

olarvik ( 2018-04-24 05:14:54 -0600 )

@olarvik, unfortunately, opencv does not have any CUDA ml classes/methods

(there is a cuda based HOG in cudaobjdetect, though)

berak ( 2018-04-24 05:51:03 -0600 )

@berak, thank you for the answer! Unfortunately that is the case. I think I can use a one-against-all scheme for the multi-class classification, doing the training on the CPU and the testing on the GPU.

olarvik ( 2018-04-24 06:30:30 -0600 )

answered 2017-03-23 08:51:24 -0600 by zelade

Okay, I just followed your steps. First I extracted the features and trained a linear SVM classifier.

To classify, I wrote a Python script that uses sliding windows and predicts each window. First I load the classifier I created, then I load the test image. I downscale the image and iterate, and in each iteration I run the sliding window. For each window I calculate the HOG features and call predict. The detections are stored in a list.
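A rough sketch of such a sliding-window loop (window size, step, pyramid scale and file names are placeholders, joblib persistence is an assumption, and the reshape already anticipates the fix discussed in the comments below):

    import cv2
    import joblib
    from skimage.feature import hog
    from skimage.transform import pyramid_gaussian

    WIN = (48, 48)       # assumed window size, matching the 48x48 training crops
    STEP = 8             # assumed sliding-window step in pixels

    clf = joblib.load("sign_classifier.pkl")               # placeholder file name
    image = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder test image

    def sliding_windows(img, win, step):
        for y in range(0, img.shape[0] - win[1] + 1, step):
            for x in range(0, img.shape[1] - win[0] + 1, step):
                yield x, y, img[y:y + win[1], x:x + win[0]]

    detections = []
    for level, scaled in enumerate(pyramid_gaussian(image, downscale=1.25)):
        if scaled.shape[0] < WIN[1] or scaled.shape[1] < WIN[0]:
            break
        for x, y, window in sliding_windows(scaled, WIN, STEP):
            fd = hog(window, orientations=9, pixels_per_cell=(8, 8),
                     cells_per_block=(2, 2))
            fd = fd.reshape(1, -1)       # 2d: one sample per row
            label = int(clf.predict(fd)[0])
            if label != 0:               # assuming class 0 means "no sign"
                detections.append((x, y, level, label))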

The detector is working, but I have two problems. The first is that it is very slow. Is there an alternative to sliding windows, since they are so slow? Some kind of contour detection to find the signs? The second problem is that I get the following DeprecationWarning:

    Traceback (most recent call last):
      File "classify.py", line 79, in <module>
        pred = clf.predict(fd)
      File "/home/pi/.virtualenvs/py2cv3/local/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 336, in predict
        scores = self.decision_function(X)
      File "/home/pi/.virtualenvs/py2cv3/local/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 312, in decision_function
        X = check_array(X, accept_sparse='csr')
      File "/home/pi/.virtualenvs/py2cv3/local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 395, in check_array
        DeprecationWarning)
    DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.

I don't know why it appears, but it is annoying because it shows up on every iteration (>100 times per image).


Comments

  1. yeah, sliding windows are slow. you'd probably need to do this on different scales, too, which is even more work.

  2. opencv's HOGDescriptor returns a single column. i've got no idea about the sklearn SVM (sorry, can't help with it), but for opencv you'd have to reshape it to a single-row feature.

berak ( 2017-03-23 09:37:25 -0600 )

I think real time won't work well then. Okay, thank you, I'll try to extract the HOG features with OpenCV then.

zelade ( 2017-03-23 09:53:32 -0600 )

I just reshaped the features:

fd = hog(image, orientations, pixels_per_cell, ... )
fd = fd.reshape(1, -1)
pred = clf.predict(fd)

This solved the problem for me, I think. Thank you for the hint.

zelade ( 2017-03-24 04:57:32 -0600 )

Stats

Asked: 2017-03-21 08:51:23 -0600

Seen: 2,226 times

Last updated: Mar 23 '17