Actually, it does exactly the same as in any other multiscale detection approach out there: it is used to build the image scale pyramid, so that a single-scale model can detect objects at multiple scales.
It takes the following steps:
- Start from a new test image, slide a window the size of the model over the image, and store all detections.
- Then downscale the image according to the scale factor, where 1.10 means you divide both dimensions (rows and cols) by 1.10, so each shrinks by roughly 10%.
- Again perform detections.
- Map the detections back to the coordinates of the original image using the accumulated downscale factor for that scale.
- Repeat the downscale and detect steps until one of the image dimensions becomes smaller than the corresponding model dimension (because then the sliding window no longer fits).
- Merge detections on similar locations over different scales.
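
The pyramid part of the steps above can be sketched in a few lines of plain Python. This is only an illustration, not the library's actual implementation: the names `pyramid_scales` and `to_original` are hypothetical, the rounding and the exact stop condition are assumptions, and the detector itself and the final merging step are left out.

```python
def pyramid_scales(image_size, model_size, scale_factor=1.10):
    """Return a list of (scale, (width, height)) pairs, one per pyramid level.

    Hypothetical helper: the image is shrunk by `scale_factor` at each level
    until the model window no longer fits in one of the dimensions.
    """
    img_w, img_h = image_size
    model_w, model_h = model_size
    levels = []
    scale = 1.0
    while True:
        w = int(round(img_w / scale))
        h = int(round(img_h / scale))
        # Stop as soon as a sliding window of model size no longer fits.
        if w < model_w or h < model_h:
            break
        levels.append((scale, (w, h)))
        scale *= scale_factor
    return levels


def to_original(box, scale):
    """Map a detection (x, y, w, h) found at a pyramid level back to
    coordinates in the original image by multiplying with that level's scale."""
    return tuple(int(round(v * scale)) for v in box)


if __name__ == "__main__":
    # Example values are made up: a 640x480 image and a 24x24 model.
    for scale, size in pyramid_scales((640, 480), (24, 24), 1.10):
        print(f"scale {scale:.3f} -> image size {size}")
```

A smaller scale factor gives more pyramid levels, so objects are matched at finer size steps at the cost of more sliding-window passes.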