I suspect BRISK is already doing what you are trying to do.
Here is the algorithmic principle to build a BRISK descriptor:
- take an image patch (say, 32x32 for example)
- choose locations inside that patch
- for each location, compute the sum of the image values (i.e., integrals) in a small neighborhood around it
- for some pairs of these locations, compute the difference integral_1 - integral_2 and binarize the result by retaining its sign.
This mechanism was first presented in the BRIEF paper, and I also describe it in a blog post.
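To make the steps above concrete, here is a minimal sketch in plain Python. The random locations, the random pairs, the neighborhood radius `r` and the function names are all placeholders I chose for illustration; they are not BRISK's actual sampling pattern or API.

```python
import random

def box_sum(img, cx, cy, r):
    """Sum of pixel values in the (2r+1)x(2r+1) window centred on (cx, cy)."""
    return sum(img[y][x]
               for y in range(cy - r, cy + r + 1)
               for x in range(cx - r, cx + r + 1))

def binary_descriptor(patch, locations, pairs, r=2):
    """One bit per pair: 1 if the first neighbourhood sum is larger, else 0."""
    sums = [box_sum(patch, x, y, r) for (x, y) in locations]
    # Keeping only the sign of (sums[i] - sums[j]) gives the binary descriptor.
    return [1 if sums[i] > sums[j] else 0 for (i, j) in pairs]

# Hypothetical setup: a 32x32 patch, 16 random locations, 32 random pairs.
rng = random.Random(0)
size, r = 32, 2
patch = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
locations = [(rng.randrange(r, size - r), rng.randrange(r, size - r))
             for _ in range(16)]
pairs = [tuple(rng.sample(range(16), 2)) for _ in range(32)]
descriptor = binary_descriptor(patch, locations, pairs, r)
```

Two such descriptors would then typically be compared with the Hamming distance, i.e. by counting the bits that differ.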
Now, the answers to your questions:
- BRISK performs some smoothing so that the last step above (the differences) is robust to noise: taking differences of integrals is indeed more robust than taking differences of single pixels.
- This smoothing is implemented as an integral. Since all neighborhoods have the same size, taking the mean is not required and keeping the integral avoids the need to divide each value.
- This naturally explains why a) the smoothed value is higher than a mean would be (there is no division by the area) and b) integral images are used (for speed).
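As an illustration of both points, here is a small sketch of an integral image (summed-area table) in plain Python. The function names and the 3x3 example image are my own assumptions, not taken from the BRISK code. Once the table is built, any box sum costs four lookups, and with equal areas the comparison of sums gives the same bit as the comparison of means.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, x0, y0, x1, y1):
    """Sum over rows y0..y1 and columns x0..x1 (inclusive), four lookups."""
    return ii[y1 + 1][x1 + 1] - ii[y0][x1 + 1] - ii[y1 + 1][x0] + ii[y0][x0]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
a = box_sum(ii, 0, 0, 1, 1)  # 1 + 2 + 4 + 5
b = box_sum(ii, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
area = 4                     # both neighbourhoods are 2x2
# Dividing both sides by the same positive area cannot change the sign.
same_bit = (a > b) == (a / area > b / area)
```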
If you look at FREAK, you will see an example of a descriptor where the size of the neighborhoods varies and where a normalization by each area is performed.