What is the difference between layers and octaves in SIFT/SURF?

asked 2016-04-08 05:52:05 -0500

nyd

I have read both of Lowe's papers ('99 and '04), and I would say I understood most of them. I have watched all the SIFT-related lectures on YouTube, but none explicitly says why we would use both octaves and layers.

I understand that you get more layers in the same octave by ~calculating the ~Laplacian for different sigmas, and that you then resample to half the resolution to get the next octave, where you again ~calculate the ~Laplacian for the same sigmas as in the first octave. You repeat this as many times as you like.
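To make the octave/layer bookkeeping concrete, here is a minimal sketch of the sigma schedule only (no image processing). The function name `gaussian_stack_sigmas` is made up for illustration. The defaults (sigma0 = 1.6, 3 layers per octave, s + 3 blurred images per octave) are the values suggested in Lowe 2004. Real implementations may differ in detail.

```python
import math


def gaussian_stack_sigmas(sigma0=1.6, layers_per_octave=3, n_octaves=4):
    """Hypothetical helper: list the blur levels of a SIFT-style pyramid.

    Returns one list per octave. Each entry is a pair
    (sigma measured in that octave's own pixels,
     sigma measured in pixels of the original image).
    """
    k = 2.0 ** (1.0 / layers_per_octave)       # sigma ratio between adjacent layers
    images_per_octave = layers_per_octave + 3  # s + 3 blurred images, as in Lowe 2004
    octaves = []
    for o in range(n_octaves):
        pixel_size = 2 ** o  # one pixel of octave o spans 2**o original pixels
        octave = []
        for i in range(images_per_octave):
            local_sigma = sigma0 * k ** i  # the same values in every octave
            octave.append((local_sigma, local_sigma * pixel_size))
        octaves.append(octave)
    return octaves


if __name__ == "__main__":
    for o, octave in enumerate(gaussian_stack_sigmas()):
        row = ", ".join(f"{loc:.2f} ({absolute:.2f})" for loc, absolute in octave)
        print(f"octave {o}: {row}")
```

The per-octave sigmas are identical in every octave. Only the pixel size changes, so in original-image units each octave covers the next doubling of scale.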

Initially, I thought that you use the layers (multiple sigmas) to find features of different sizes in one image, and that you then resample so that you can calculate descriptors on every octave (resampling level) for every feature. That way you would get descriptors at different scales that might be better matches for descriptors in the other image at a similar scale. Apparently I was wrong: only one descriptor is calculated for every feature, as it is calculated out of gradient orientations, so it is ~invariant to scale anyway.

But this leaves me wondering: why do we need to resample? Why can't or shouldn't we just use a high number of layers and only one octave (no resampling)? Is it just because it is cheaper to resample? If so, why don't we just resample?

Note: the ~ sign means "sort of". I use it when I know it is not the exact explanation, but the exact one would be longer and wouldn't add any value to the question.



Quote from Lowe2004:

Once a complete octave has been processed, we resample the Gaussian image that has twice the initial value of σ (it will be 2 images from the top of the stack) by taking every second pixel in each row and column. The accuracy of sampling relative to σ is no different than for the start of the previous octave, while computation is greatly reduced.

As far as I know, SIFT keypoints are detected by searching for local extrema in the difference-of-Gaussians images. I think resampling avoids convolving an image of the same size with ever larger sigma values; instead, it reuses the same sigma values on a smaller image.
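A rough back-of-the-envelope sketch of that cost argument, counting multiply-adds only. All function names here are made up for illustration. It assumes separable Gaussian kernels truncated at 3 sigma, and that each layer is blurred directly from its base image rather than incrementally, which real implementations usually avoid. The numbers are indicative only.

```python
import math


def kernel_taps(sigma):
    """Number of taps in a 1-D Gaussian kernel (assumption: truncated at 3 sigma)."""
    radius = math.ceil(3.0 * sigma)
    return 2 * radius + 1


def blur_cost(width, height, sigma):
    """Multiply-adds for one separable blur: a horizontal and a vertical pass."""
    return width * height * 2 * kernel_taps(sigma)


def single_octave_cost(width, height, sigma0, layers_per_octave, n_octaves):
    """Cover all scales at full resolution, letting sigma keep growing."""
    k = 2.0 ** (1.0 / layers_per_octave)
    total = 0
    for i in range(layers_per_octave * n_octaves):
        total += blur_cost(width, height, sigma0 * k ** i)
    return total


def pyramid_cost(width, height, sigma0, layers_per_octave, n_octaves):
    """Cover the same scales by halving the image and reusing the same sigmas."""
    k = 2.0 ** (1.0 / layers_per_octave)
    total = 0
    for o in range(n_octaves):
        w, h = width >> o, height >> o  # half the resolution per octave
        for i in range(layers_per_octave):
            total += blur_cost(w, h, sigma0 * k ** i)
    return total


if __name__ == "__main__":
    args = (640, 480, 1.6, 3, 4)
    flat = single_octave_cost(*args)
    pyramid = pyramid_cost(*args)
    print(f"single octave: {flat:,} multiply-adds")
    print(f"pyramid:       {pyramid:,} multiply-adds")
    print(f"ratio:         {flat / pyramid:.1f}x")
```

Both variants visit the same set of scales in original-image units. In the pyramid, each octave has a quarter of the pixels of the previous one and the kernels never grow, which is where the saving comes from.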

Eduardo (2016-04-10 10:31:13 -0500)

When I started reading the quote, I was like... I know this, I read it like a thousand times (about 2 times really). Then you said the same thing in your own words, and I just got it. Then I realized that my question is... unsmart at least, and I now see I was very confused last Friday. I suppose the weekend helped. Thank you!

nyd (2016-04-11 02:25:29 -0500)