
Basically, what HOG.detectMultiScale does is take your original image and build an image pyramid from it using your scale factor. The image is downscaled repeatedly until it becomes smaller than the detection window (at which point running the model would be impossible), and it can also be upscaled up to a level that you define.
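To make the pyramid idea concrete, here is a minimal standalone sketch of how the set of downscale factors could be computed. This is illustrative only, not OpenCV's actual implementation; `detectMultiScale` does something like this internally with its `scale` parameter (the function and window sizes below are assumptions for the example):

```python
def pyramid_scales(image_size, window_size, scale_factor=1.05):
    """Return the downscale factors applied to the original image.

    Stops once the downscaled image would be smaller than the
    detection window, since the model cannot be evaluated there.
    """
    w, h = image_size
    win_w, win_h = window_size
    scales = []
    factor = 1.0
    # keep shrinking until the image no longer fits the window
    while w / factor >= win_w and h / factor >= win_h:
        scales.append(factor)
        factor *= scale_factor
    return scales

# e.g. a 640x480 image with the common 64x128 HOG person window
levels = pyramid_scales((640, 480), (64, 128))
```

Each returned factor corresponds to one pyramid layer; a larger scale factor gives fewer layers and faster (but coarser) detection.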

This gives you the ability to detect people with a single model scale across different image scales: if a detection occurs at a specific pyramid layer, its bounding box is rescaled by the same factor that was applied to the original image to reach that layer.
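The rescaling step above can be sketched as follows. The function name and box format are hypothetical, not OpenCV API; the point is simply that coordinates found on a downscaled layer are multiplied by that layer's downscale factor:

```python
def rescale_box(box, factor):
    """Map an (x, y, w, h) box detected on a downscaled pyramid
    layer back to original-image coordinates."""
    x, y, w, h = box
    return (int(round(x * factor)), int(round(y * factor)),
            int(round(w * factor)), int(round(h * factor)))

# A 64x128 window hit at (50, 40) on a layer downscaled by 2x
# corresponds to a 128x256 box at (100, 80) in the original image.
box = rescale_box((50, 40, 64, 128), 2.0)
```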

Using this technique you can detect people at multiple scales with only a single model scale, which is computationally cheaper than training a model for each possible scale and running all of them over the image.