I would recommend you read this, this and this to truly understand how the HOG descriptor works.
EDIT I:
Sorry, I did a poor job of explaining a few things clearly, so let us take a step back.
HOG: The main idea of the algorithm is simply to create a feature descriptor for any given image, regardless of whether it belongs to a class. How? It uses image gradients and their directions: given an image, it finds the dominant direction of the edges in each local region.
The algorithm itself is tied to no particular machine learning technique, i.e. you can build your own classifier or object detector from the computed HOGs.
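To make the "gradients and directions" idea concrete, here is a minimal NumPy sketch of a per-cell orientation histogram. This is illustrative only; the function name and parameters are made up, and real implementations (e.g. OpenCV's HOGDescriptor) add block normalization and bin interpolation on top of this:

```python
import numpy as np

def hog_sketch(img, cell=8, bins=9):
    """Toy HOG: per-cell orientation histograms weighted by gradient
    magnitude. Illustrative only, not a full HOG implementation."""
    img = img.astype(np.float64)
    # central-difference image gradients along x and y
    gx = np.zeros_like(img); gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180  # unsigned orientation

    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            m = mag[r:r + cell, c:c + cell].ravel()
            a = ang[r:r + cell, c:c + cell].ravel()
            # histogram of orientations, weighted by magnitude
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# image with one vertical edge: the dominant gradient points along x
img = np.zeros((16, 16))
img[:, 8:] = 255
desc = hog_sketch(img)
print(desc.shape)  # (4 cells * 9 bins,) = (36,)
```

The concatenated histograms form the feature vector that any downstream classifier consumes.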
In both methods, you need training data regardless.
Classifier Your dataset consists of different classes; compute the HOG of each image, then use your classifier of choice to categorize them. Given a new image, the model computes its HOG and tells you which label it most closely matches. This is where the confusion matrix comes into play, because you have multiple classes.
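As a sketch of that classifier route, the toy example below treats each row of a matrix as a precomputed HOG feature vector, fits a nearest-centroid classifier (a stand-in for your classifier of choice), and builds the multi-class confusion matrix. All data here is synthetic; nothing is specific to HOG or OpenCV:

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, dim = 3, 36            # pretend dim-long HOG vectors
centers = rng.normal(size=(n_classes, dim)) * 5
X = np.vstack([c + rng.normal(size=(20, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), 20)

# "training": one centroid per class
centroids = np.vstack([X[y == k].mean(axis=0) for k in range(n_classes)])

# "prediction": label of the nearest centroid
dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)

# confusion matrix: rows = true class, columns = predicted class
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (y, pred), 1)
print(cm)
```

Off-diagonal entries of `cm` show you which classes get confused with which, which is exactly the information a single-class detector cannot give you.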
Object Detector Generally folks train an SVM (a classifier) on the HOGs, which I believe is the case for OpenCV as well. Then detectMultiScale applies the sliding-window technique: it grabs a portion of the image and passes it to the trained model, which spits out a score. If this value passes a set threshold, the bounding box for that area is kept. At the end, non-maximum suppression is applied.
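The non-maximum suppression step at the end can be sketched in plain NumPy. The function names and the 0.5 overlap threshold are illustrative choices, not OpenCV's internals, but the greedy keep-the-best-then-suppress logic is the standard technique:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the best-scoring box,
    drop boxes that overlap it too much, repeat on the rest."""
    order = np.argsort(scores)[::-1]   # highest score first
    keep = []
    while len(order):
        i = order[0]
        keep.append(int(i))
        # keep only the remaining boxes that overlap box i below thresh
        order = order[1:][[iou(boxes[i], boxes[j]) < thresh
                           for j in order[1:]]]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]])
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # -> [0, 2]: the two overlapping boxes collapse to one
```

The two heavily overlapping detections are merged into the higher-scoring one, while the distant box survives.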
In the case of OpenCV, HOGs are used as an object detector. Its classification model was trained to recognize just a single class (I stand to be corrected), which is why there is no confusion matrix.
Now, you can go ahead and train your own multi-class classifier on HOGs should you choose, since, as noted above, the descriptor is not tied to any particular machine learning technique.
I highly recommend you read this paper. Along with it, a tutorial and forum will do you good too!
Again, I am really sorry for not being more precise in my earlier responses. Hope this clarifies things.
Cheers :)
Instead of HOGDescriptor, I suggest you use mobilenet_ssd_python.py.