Why are the values returned by BRISK's smoothedIntensity much larger than intensity values?

asked 2013-08-22 13:44:24 -0600

Hi,

I have a question regarding BRISK's function "smoothedIntensity".

Why are the values it returns very large, much larger than intensity values?

Shouldn't they be in the range of intensity values (since they are smoothed intensities)? And why does BRISK use an integral image?

I replaced the implementation with the following simple one, which returns the sum of the 3x3 box around the pixel. Could you please tell me if it's correct?

inline int
BRISK4::smoothedIntensity(const cv::Mat& image, const cv::Mat& integral, const float key_x,
                          const float key_y, const unsigned int scale, const unsigned int rot,
                          const unsigned int point) const
{
    // get the float position
    const BriskPatternPoint& briskPoint = patternPoints_[scale * n_rot_ * points_ + rot * points_ + point];
    const float xf = briskPoint.x + key_x;
    const float yf = briskPoint.y + key_y;
    const int x = int(xf);
    const int y = int(yf);
    const int& imagecols = image.cols;

    // get the sigma:
    const float sigma_half = briskPoint.sigma;
    const float area = 4.0f * sigma_half * sigma_half;

    // calculate output:
    // Gil: changed here; the returned value is the sum of the 3x3 patch around (x, y)
    int ret_val = image.at<uchar>(y-1, x-1) + image.at<uchar>(y-1, x) + image.at<uchar>(y-1, x+1) +
                  image.at<uchar>(y, x-1)   + image.at<uchar>(y, x)   + image.at<uchar>(y, x+1) +
                  image.at<uchar>(y+1, x-1) + image.at<uchar>(y+1, x) + image.at<uchar>(y+1, x+1);

    return ret_val;
}

The current smoothedIntensity implementation confused me, so I'm really not sure anymore.

Thanks,

Gil.
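
For readers who want to try the 3x3 sum outside of OpenCV, here is a minimal sketch on a plain row-major 8-bit buffer. The name patchSum3x3 and the choice to return -1 at the border are assumptions made for illustration; they are not part of BRISK or OpenCV. The snippet in the question reads y-1, y+1, x-1 and x+1 without checking them, so a guard like this one may be needed for points near the image edge.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of BRISK or OpenCV): sum of the 3x3 patch
// centred at (x, y) in a row-major 8-bit grayscale image.
// Returns -1 when the patch would cross the image border.
int patchSum3x3(const std::vector<std::uint8_t>& pixels, int cols, int rows, int x, int y)
{
    assert(cols > 0 && rows > 0);
    assert(pixels.size() == static_cast<std::size_t>(cols) * rows);

    if (x < 1 || y < 1 || x >= cols - 1 || y >= rows - 1) {
        return -1;  // patch does not fit inside the image
    }

    int sum = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            sum += pixels[static_cast<std::size_t>(y + dy) * cols + (x + dx)];
        }
    }
    return sum;  // at most 9 * 255 = 2295 for 8-bit input
}
```

Note that even this tiny patch can return values far above 255, simply because nine pixel values are added and never divided by the patch size.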


Comments

Several questions: why do you want to modify it? This function is only used internally in the descriptor computation. I also doubt very much that your implementation is correct in any sense: why should the sum of 9 pixels be a smoothed version? At the very least I'd take the average... The integral image can be, and is, used for fast calculation of sums over areas. So, again, I wouldn't touch this function; the author(s) apparently put a lot of thought into it, and it probably has its purpose ;) .

Guanta ( 2013-08-22 15:31:37 -0600 )

First, thank you very much for your response!!

I'm changing the function for research purposes. Basically, instead of comparing pairs of pixels, I want to compare small regions of size 3X3. I know it doesn't mean a smoothed version of the region, but the region itself.

So, if I do want to compare regions of size 3X3 instead of single pixels, is my implementation correct?

Thank you very much again!!

GilLevi ( 2013-08-22 15:39:39 -0600 )

Well, the sum is correct (by the way, imagecols, sigma_half, and area are then no longer needed), but I doubt it is correct in the sense of the function's name. I guess that is not what you want anyway, so I would also give the function a different name...

Guanta ( 2013-08-22 15:59:04 -0600 )

Thanks for your answer. You are indeed right, I should rename the function.

There is something that isn't clear to me: the original implementation should return the smoothed intensity at a certain pixel. So how come it returns values that are much larger than the maximum intensity value (255)?

Thanks!

GilLevi ( 2013-08-22 16:11:01 -0600 )

Hm, not so sure. Maybe to widen the value range for the whole interpolation stuff? Sorry, I am really not a BRISK expert; maybe someone else has a better idea...

Guanta ( 2013-08-22 16:39:09 -0600 )

1 answer

answered 2013-08-23 04:10:54 -0600

Emmanuel

I think BRISK may already be doing what you are trying to do.

Here is the algorithmic principle to build a BRISK descriptor:

  1. take an image patch (say, 32x32 for example)
  2. choose locations inside that patch
  3. for each location, compute the sum of the image values (i.e., integrals) in a small neighborhood around it
  4. for some pairs of the said locations, compute the difference integral_1 - integral_2 and binarize the result by retaining its sign.
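
Steps 3 and 4 can be sketched in a few lines. This is only an illustration under stated assumptions, not BRISK's actual code: the function name binarizePairs is hypothetical, the neighborhood sums are assumed to be computed already, and mapping a positive difference to a 1 bit is a convention chosen here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: turns per-location neighborhood sums into descriptor bits.
//   sums[i] : sum of image values around sampling location i (step 3)
//   pairs   : index pairs (a, b) of locations to compare (step 4)
// Bit k is true when sums[a] - sums[b] is positive for the k-th pair.
std::vector<bool> binarizePairs(const std::vector<int>& sums,
                                const std::vector<std::pair<std::size_t, std::size_t>>& pairs)
{
    std::vector<bool> bits;
    bits.reserve(pairs.size());
    for (const auto& p : pairs) {
        assert(p.first < sums.size() && p.second < sums.size());
        // Only the sign of the difference is kept, not its magnitude.
        bits.push_back(sums[p.first] - sums[p.second] > 0);
    }
    return bits;
}
```

Because only the sign survives, any common positive scale factor applied to both sums leaves the bit unchanged.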

This mechanism was first presented in the BRIEF paper, and I also describe it in a blog post.

Now, the answers to your questions:

  1. BRISK performs some smoothing in order to be robust to noise in step 4 above: taking differences of integrals is indeed more robust than differences of pixels.
  2. This smoothing is implemented as an integral. Since all neighborhoods have the same size, taking the mean is not required and keeping the integral avoids the need to divide each value.
  3. This naturally explains why a) the smoothed value is higher (no division by the area) and b) integral images are used (for speed).
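
To make points 2 and 3 concrete, here is a small sketch of an integral image (summed-area table) and a box sum read from it with four lookups. It uses plain standard C++ rather than OpenCV, and the names buildIntegral and boxSum are hypothetical; it illustrates the general technique and is not taken from BRISK's source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Summed-area table with one extra row and column of zeros:
// table[y * (cols + 1) + x] = sum of pixels in rows [0, y) and columns [0, x).
std::vector<long long> buildIntegral(const std::vector<std::uint8_t>& pixels, int cols, int rows)
{
    assert(cols > 0 && rows > 0);
    assert(pixels.size() == static_cast<std::size_t>(cols) * rows);
    const std::size_t stride = static_cast<std::size_t>(cols) + 1;
    std::vector<long long> table(stride * (rows + 1), 0);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            table[(y + 1) * stride + (x + 1)] =
                pixels[static_cast<std::size_t>(y) * cols + x]
                + table[y * stride + (x + 1)]
                + table[(y + 1) * stride + x]
                - table[y * stride + x];
        }
    }
    return table;
}

// Sum over the half-open box [x0, x1) x [y0, y1): four lookups, whatever the box size.
long long boxSum(const std::vector<long long>& table, int cols, int x0, int y0, int x1, int y1)
{
    const std::size_t stride = static_cast<std::size_t>(cols) + 1;
    assert(x0 >= 0 && y0 >= 0 && x0 <= x1 && y0 <= y1 && x1 <= cols);
    assert(y1 * stride + x1 < table.size());
    return table[y1 * stride + x1] - table[y0 * stride + x1]
         - table[y1 * stride + x0] + table[y0 * stride + x0];
}
```

For a 3x3 box over pixels that are all 200, boxSum returns 1800, far above 255; dividing by the box area (9) gives back 200, the mean intensity.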

If you look at FREAK, you will see an example of a descriptor where the size of the neighborhoods varies and where a normalization by each area is performed.


Comments

Thank you for your answer. If the smoothing is implemented as an integral with neighborhoods of the same size, what is the purpose of the sigmas? I thought there was a Gaussian smoothing.

Thanks again.

GilLevi ( 2013-08-23 05:51:20 -0600 )

The sigma sets the width of the neighborhood. For such small sizes, integral images are a decent approximation to Gaussian smoothing (and you will learn by experience that many CV research papers implement the same trick and name it Gaussian...). You really should have a look at FREAK's source that is more geared towards what you plan to do.

Emmanuel ( 2013-08-23 06:45:22 -0600 )

@Emmanuel, thanks for the answer! One thing still remains unclear to me: if the sigma sets the width of the neighborhood, then the neighborhoods of the different pattern points are not all the same size.

How can one compare a pattern point from the outer rings that has a large neighborhood (thus a large integral image sum) with a pattern point from the inner rings that has a small neighborhood (thus a small integral image sum)?

Am I missing something here?

Thanks!

GilLevi ( 2013-08-23 07:52:04 -0600 )
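
The concern raised in this comment can be illustrated with a short sketch. It assumes square boxes whose area is 4 * sigma_half * sigma_half, as in the question's code, and divides each sum by that area before comparing. The BoxSample struct and the function names are hypothetical, and the sketch does not claim to reproduce what BRISK or FREAK actually do.

```cpp
#include <cassert>

// Hypothetical sample: the sum over a square box and the box's half-width.
struct BoxSample {
    long long sum;    // sum of pixel values inside the box
    float sigmaHalf;  // half of the box side, named after sigma_half in the question
};

// Box area, following `area = 4.0f * sigma_half * sigma_half` from the question.
inline float boxArea(const BoxSample& s)
{
    return 4.0f * s.sigmaHalf * s.sigmaHalf;
}

// Mean intensity: dividing by the area puts boxes of different size on one scale.
inline float boxMean(const BoxSample& s)
{
    assert(s.sigmaHalf > 0.0f);
    return static_cast<float>(s.sum) / boxArea(s);
}

// True when box a is brighter than box b on average.
inline bool brighterOnAverage(const BoxSample& a, const BoxSample& b)
{
    return boxMean(a) > boxMean(b);
}
```

As an example, a 2x2 box of pixels at 200 has sum 800, while a 4x4 box of pixels at 100 has sum 1600. Comparing raw sums would call the larger, darker box "brighter"; comparing means (200 versus 100) gives the expected order.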

Sorry, I'm not sure of my memory here... but as far as I remember, BRISK does not do any cross-ring differences (unlike FREAK, but FREAK normalizes by the area).

Emmanuel ( 2013-08-23 09:17:19 -0600 )

BRISK does take cross-ring differences. It takes all pairs whose points are less than 5.75 apart, and that also includes cross-ring pairs when the rings are no more than one ring apart.

If I had any way to upload an image, I would upload a visualization of all of BRISK's pairs.

Thanks for the answer!

GilLevi ( 2013-08-23 09:40:22 -0600 )
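
The distance-based pair selection described in this last comment can be sketched as follows. PatternPoint and shortPairs are hypothetical names, the threshold is passed in as a parameter (5.75 is the value quoted in the comment, not verified against the BRISK source), and the pattern coordinates are assumed to be given.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sampling-pattern point (coordinates relative to the keypoint).
struct PatternPoint {
    float x;
    float y;
};

// Returns all index pairs (i, j), i < j, whose Euclidean distance is below maxDist.
// Ring membership is not checked: any two points that are close enough are paired.
std::vector<std::pair<std::size_t, std::size_t>> shortPairs(const std::vector<PatternPoint>& pts,
                                                            float maxDist)
{
    assert(maxDist > 0.0f);
    std::vector<std::pair<std::size_t, std::size_t>> out;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            const float dx = pts[i].x - pts[j].x;
            const float dy = pts[i].y - pts[j].y;
            if (std::sqrt(dx * dx + dy * dy) < maxDist) {
                out.emplace_back(i, j);
            }
        }
    }
    return out;
}
```

With a purely distance-based rule like this, points on neighboring rings end up paired whenever they are close enough, which is how cross-ring pairs can arise.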
