
HOG optimal training images

asked 2017-03-21 08:51:23 -0600 by zelade

Hello,

I'm going to train a HOG descriptor on traffic signs and wonder which pictures are best suited. How many pictures should I expect to need for good results? What size should the images have, and what influence does this have on the later detection? Is it good to leave a narrow border around the positive images, so that some background stays visible? Is anyone familiar with this? I would like to estimate this before I take the photos.

Thanks in advance.


Comments


please clarify, do you want to:

  • train a HOGDescriptor to detect a single class of traffic signs (like stop or speed-limit)?
  • or build a classifier to distinguish between several signs (e.g. using HOG features with an SVM or ANN)?
berak (2017-03-21 11:14:28 -0600)

I want to distinguish between several signs. I'm going to train with SVM Light and pass the resulting vector to detectMultiScale. In my implementation I would use one detectMultiScale call for each sign, but I wonder whether that would still run in real time. What is the difference between the two options? Put differently, how does the second one differ from mine?

I just made a test with one street sign and it worked quite well. I used 48 x 48 images with a narrow border, 180 positive and 4000 negative images. I have found that signs that are close (large in the frame) are not recognized; is this due to the image size?

zelade (2017-03-22 07:11:18 -0600)

detectMultiScale can only be used to detect a single object class

if you use an SVM as a multi-class classifier, you will need some other tool for detection / segmentation, e.g. findContours()

berak (2017-03-22 07:23:11 -0600)

Okay, I'll look at this. But if I use it as a multi-class classifier, the training process differs from my variant, right?

So if I understand you correctly, the first step would be to find the contours of the signs in the frame. In the second step I would calculate the HOG features of the contour regions (rectangles) and then use predict() to classify the sign?

zelade (2017-03-22 07:55:12 -0600)

2 answers


answered 2017-03-22 08:04:27 -0600 by berak, updated 2017-03-22 08:19:08 -0600

OK, let's summarize the two approaches.

if you want to detect a single object class (detectMultiScale):

  • train:
    • crop all positive and negative images to the same window size (e.g. 24x24); this is the minimum size that can be detected later
    • use train_HOG.cpp (from the samples) to train an SVM (regression), and save it
  • detect:
    • load the single, pretrained SVM support vector into the HOGDescriptor
    • call detectMultiScale on an (arbitrarily sized) grayscale image (see the sketch after this list)
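For illustration, here is a minimal Python sketch of the detection side, assuming a linear SVM has already been trained and exported as a detector vector (the file names and HOG parameters below are made up for the example, not fixed values):

    import cv2
    import numpy as np

    # HOG window must match the training window size (24x24 here, purely illustrative)
    hog = cv2.HOGDescriptor((24, 24), (8, 8), (4, 4), (4, 4), 9)

    # the detector vector = the SVM's single support vector with -rho appended,
    # stored however you like; "sign_detector.npy" is a hypothetical file
    hog.setSVMDetector(np.load("sign_detector.npy"))

    gray = cv2.imread("street_scene.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical test image
    rects, weights = hog.detectMultiScale(gray, winStride=(8, 8), scale=1.05)
    for (x, y, w, h) in rects:
        cv2.rectangle(gray, (x, y), (x + w, y + h), 255, 2)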

if you want to classify multiple traffic signs:

  • train:
    • crop all images to the same window size (e.g. 24x24), same as above
    • get a HOG descriptor for each, reshape it to a single row, and push_back all of them into one large Mat; you also need a "labels" Mat containing the class id for each descriptor
    • train an SVM (or ANN, or KNN) with this data and labels (classification)

  • test:
    • find contours in the large image
    • get the boundingRect() of each contour
    • take the image ROI (crop) of that rect and resize it to the HOG window size you trained your SVM on
    • get the HOG feature from that ROI
    • predict() with that HOG feature (see the sketch after this list)
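A rough Python sketch of this train/test pipeline, assuming OpenCV's ml module and a simple Otsu threshold for the contour step (the random placeholder data, window size, and thresholding choice are just for illustration):

    import cv2
    import numpy as np

    WIN = (24, 24)                                   # HOG window size (illustrative)
    hog = cv2.HOGDescriptor(WIN, (8, 8), (4, 4), (4, 4), 9)

    # --- train (classification) ---
    # placeholder data: random patches stand in for cropped sign images
    rng = np.random.RandomState(0)
    crops = [rng.randint(0, 256, WIN, dtype=np.uint8) for _ in range(40)]
    class_ids = [i % 4 for i in range(40)]           # 4 fake sign classes

    data = np.array([hog.compute(c).reshape(-1) for c in crops], dtype=np.float32)
    labels = np.array(class_ids, dtype=np.int32)

    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_LINEAR)
    svm.train(data, cv2.ml.ROW_SAMPLE, labels)

    # --- test ---
    gray = rng.randint(0, 256, (480, 640), dtype=np.uint8)   # stand-in for a real frame
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        roi = cv2.resize(gray[y:y + h, x:x + w], WIN)         # crop and resize to the training window
        feature = hog.compute(roi).reshape(1, -1)
        _, pred = svm.predict(feature)
        print("contour at", (x, y), "-> class", int(pred[0, 0]))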

Comments

Thanks for the detailed answer!

zelade (2017-03-22 08:32:06 -0600)

How do you label the negative images?

zelade (2017-03-22 10:43:40 -0600)

Hi, is there any function like predict() for the GPU (CUDA C++)? If not, what are the alternatives? How can I implement prediction on the GPU? Thank you!

olarvik (2018-04-24 05:14:54 -0600)

@olarvik, unfortunately, opencv does not have any CUDA ml classes/methods

(there is a cuda based HOG in cudaobjdetect, though)

berak (2018-04-24 05:51:03 -0600)

@berak, thank you for the answer! Unfortunately, that seems to be the case. I think I can use the one-against-all method for multi-class classification, implementing training on the CPU and testing on the GPU.

olarvik (2018-04-24 06:30:30 -0600)

answered 2017-03-23 08:51:24 -0600 by zelade

Okay, I just followed your steps. First I extracted the features and trained a linear SVM classifier.

To classify, I made a Python script that uses sliding windows and predicts each window: I load the classifier I created and the test image, downscale the image, and iterate. In each iteration I slide the window, calculate the HOG features for it, and call predict. The detections are stored in a list. The detector is working, but I have two problems.
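Roughly, the loop looks like the following sketch (simplified, not the exact script; the window size, step, pyramid scale, class labels, and file names are illustrative):

    import cv2
    import joblib
    from skimage.feature import hog
    from skimage.transform import pyramid_gaussian

    clf = joblib.load("svm_signs.pkl")              # the trained linear SVM (hypothetical file)
    image = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)

    win_w, win_h, step = 48, 48, 8                  # illustrative values
    detections = []

    for level, scaled in enumerate(pyramid_gaussian(image, downscale=1.25)):
        if scaled.shape[0] < win_h or scaled.shape[1] < win_w:
            break
        for y in range(0, scaled.shape[0] - win_h, step):
            for x in range(0, scaled.shape[1] - win_w, step):
                window = scaled[y:y + win_h, x:x + win_w]
                # note the reshape to 2-D: this is exactly what the warning below is about
                fd = hog(window, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)).reshape(1, -1)
                pred = clf.predict(fd)[0]
                if pred != 0:                       # assumes label 0 was used for background windows
                    detections.append((x, y, level, pred))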

The first problem is that it's very slow. Is there an alternative to sliding windows, since they are so slow? Some kind of contour detection to find the signs? The second problem is that I get the following DeprecationWarning:

    Traceback (most recent call last):
      File "classify.py", line 79, in <module>
        pred = clf.predict(fd)
      File "/home/pi/.virtualenvs/py2cv3/local/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 336, in predict
        scores = self.decision_function(X)
      File "/home/pi/.virtualenvs/py2cv3/local/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 312, in decision_function
        X = check_array(X, accept_sparse='csr')
      File "/home/pi/.virtualenvs/py2cv3/local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 395, in check_array
        DeprecationWarning)
    DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.

I don't know why it appears, but it's quite annoying because it shows up in every iteration (>100 times per image).


Comments

  1. Yes, sliding windows are slow. You'd probably need to do this at different scales, too, which is even more work.

  2. OpenCV's HOGDescriptor returns a single column. I have no idea about the sklearn SVM (sorry, can't help with it), but for OpenCV you'd have to reshape it to a single-row feature (see the sketch below).
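For example, with OpenCV's Python bindings (the HOG parameters here are illustrative), the reshape might look like:

    import cv2
    import numpy as np

    hog = cv2.HOGDescriptor((48, 48), (16, 16), (8, 8), (8, 8), 9)  # illustrative parameters
    patch = np.zeros((48, 48), dtype=np.uint8)                      # a cropped window

    desc = hog.compute(patch)    # a single column (shape (N, 1); flat (N,) in newer versions)
    row = desc.reshape(1, -1)    # one sample per row, as ml modules and sklearn's predict() expect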

berak (2017-03-23 09:37:25 -0600)

I think real time won't work well then. Okay, thank you, I'll try extracting the HOG features with OpenCV then.

zelade (2017-03-23 09:53:32 -0600)

I just reshaped the features:

fd = hog(image, orientations, pixels_per_cell, ... )
fd = fd.reshape(1, -1)
pred = clf.predict(fd)

This solved the problem for me, I think. Thank you for the hint.

zelade (2017-03-24 04:57:32 -0600)
