Dense SIFT in VLFeat and OpenCV integration

asked 2017-01-09 17:21:13 -0600

lovaj
136 ●1 ●2 ●9

updated 2017-01-10 19:56:57 -0600

I'm reading this paper where Dense SIFT is used, in particular (quoting the paper):

We extract SIFT [29] descriptors at 4 scales corresponding to region widths of 16, 24, 32 and 40 pixels. The descriptors are extracted on a regular densely sampled grid with a stride of 2 pixels.

So far so good. However, now I'm trying to understand DSIFT from VLFeat and its C API in order to reproduce the strategy above.

From my understanding from this question and this picture (taken from the link above):

Each SIFT descriptor is computed using 4x4 bins. Now, each bin can have a different size. So, supposing that we have the image img (read using OpenCV), we could do this:

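Mat img = imread("img.jpg", CV_LOAD_IMAGE_GRAYSCALE);
// Flatten the 8-bit image into a row-major float vector with values in [0, 1]
std::vector<float> imgvec;
imgvec.reserve(img.rows * img.cols);
for (int i = 0; i < img.rows; ++i){
  for (int j = 0; j < img.cols; ++j){
    imgvec.push_back(img.at<unsigned char>(i,j) / 255.0f);
  }
}

cv::Mat1f descriptors;
// Bin sizes 4, 6, 8, 10 give region widths 16, 24, 32, 40 (4 bins per side)
for(int i=4; i<=10; i+=2){
  // vl_dsift_new_basic takes (width, height, step, binSize)
  VlDsiftFilter *dsift = vl_dsift_new_basic (img.cols, img.rows, 2, i);
  vl_dsift_process (dsift, imgvec.data());
  // Wrap VLFeat's descriptor buffer without copying; push_back copies the rows
  cv::Mat1f scaleDescs(vl_dsift_get_keypoint_num(dsift), 128,
                       const_cast<float*>(vl_dsift_get_descriptors(dsift)));
  descriptors.push_back(scaleDescs);
  vl_dsift_delete(dsift);  // release with VLFeat's own destructor, not free()
}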

Now, I know that I could just "try" this, but if I'm doing something wrong, the error could be very hard to find (not because of the language, but because of the logic and the correct API usage). Besides, this also involves several operations going from VLFeat to OpenCV.

What do you think about this solution?
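A minimal sketch of how the paper's region widths map to DSIFT bin sizes, assuming (as described above) that each descriptor spans 4 bins per side, so the region width is 4 times the bin size. The helper name binSizesForRegionWidths is hypothetical and not part of VLFeat or OpenCV.

```cpp
#include <vector>

// Hypothetical helper: a SIFT descriptor spans 4 spatial bins per side,
// so the bin size is the region width divided by 4.
std::vector<int> binSizesForRegionWidths(const std::vector<int>& widths) {
    const int kBinsPerSide = 4;
    std::vector<int> binSizes;
    binSizes.reserve(widths.size());
    for (int w : widths) {
        binSizes.push_back(w / kBinsPerSide);  // e.g. 16 px wide -> bin size 4
    }
    return binSizes;
}
```

Under that assumption, covering all four widths from the paper (16, 24, 32, 40) would need four bin sizes, one per scale.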

Let's suppose I have a grey-scale image read with OpenCV:

cv::Mat img = cv::imread("img.jpg",cv::IMREAD_GRAYSCALE);

Now let's suppose that I want to use it for VLFeat SIFT or Dense SIFT. It's not clear how to convert cv::Mat into a float* to use as input in this library.

In this question, this answer proposes simply:

if(img.type() == CV_32F)
  float* matData = (float*)img.data;

In this other question:

Mat imgFloat; 
img.convertTo(imgFloat, CV_32F, 1.0/255.0);
float* matData = imgFloat.ptr<float>();

And in these slides:

Mat toFloat;
img.convertTo(toFloat, CV_32F);
float *vlimage = (float*) toFloat.data;

Which of these is (or are) correct?


Comments

like stated below, you really should use convertTo(), not a for loop. in your example,

img.at<unsigned char>(i,j) / 255.0f will be 0 for any value < 255 !

berak ( 2017-01-10 00:44:11 -0600 )
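For what it's worth, truncation to 0 would only happen with an integer divisor. Here is a small sketch of the difference, using two hypothetical helper names that are not part of OpenCV. With the 255.0f literal, the pixel value is promoted to float before the division, so the result keeps its fractional part:

```cpp
// Hypothetical helpers isolating the two kinds of division.
float scaleWithFloatDivisor(unsigned char v) {
    return v / 255.0f;  // v is promoted to float: floating-point division
}

int scaleWithIntDivisor(unsigned char v) {
    return v / 255;     // both operands are integers: truncating division
}
```

Either way, convertTo() avoids having to think about this at all.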

Second solution is best one because ptr method is used. 1.0/255 is a constant to normalize data between 0 and 1. It is better to check if mat isContinuous too...

LBerger ( 2017-01-10 15:03:12 -0600 )

Mmmh I understand, but why should it be important? :D

lovaj ( 2017-01-10 15:06:18 -0600 )

If your data are not continuous in memory, I don't think the results given by SIFT will be good. You have to use this method if you don't want to use the OpenCV method for pixel access.

LBerger ( 2017-01-10 15:33:21 -0600 )

@Tetragramm it's a duplicate question

LBerger ( 2017-01-10 15:38:48 -0600 )

I was talking about the normalization ;)

lovaj ( 2017-01-10 16:39:53 -0600 )

As for normalization, you can normalize it in the convertTo method (that's the 1.0/255.0 in the code I posted). If the image values don't need to be between 0 and 1, you can just leave that parameter off and it defaults to 1.0. Whether they do depends on what you're calling, which is VLFeat, and we have no idea.

Tetragramm ( 2017-01-10 20:00:36 -0600 )


2 answers


answered 2017-01-09 18:28:01 -0600

Tetragramm

7376 ●13 ●37

Your opencv code is okay, but it would be faster and more effective to do

Mat img = imread("img.jpg", CV_LOAD_IMAGE_GRAYSCALE);
Mat imgFloat;
img.convertTo(imgFloat, CV_32F, 1.0/255.0);

Then you check that it's continuous by using imgFloat.isContinuous() (it should be) and then get a pointer to the data by doing imgFloat.ptr<float>(). If it's not continuous, you can just keep doing what you are. It's just faster this way and you don't have to worry about the loops yourself.

Lastly, remember that OpenCV is Row Major. I couldn't find if VLFeat is Row or Column major.


Comments

According to this, "If not otherwise specified, matrices in VLFeat are stored in memory in column major order". However, later it's written that "Images I(x,y) are stored instead in row-major order, i.e. one row after the other". Does it mean that I'm doing something wrong?

lovaj ( 2017-01-10 08:05:05 -0600 )
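To make the two layouts in that quote concrete, here is a minimal sketch of the offset of pixel (x, y) under each convention. Both function names are hypothetical and belong to neither library. If VLFeat images are row-major as the quoted documentation says, a continuous OpenCV buffer should already be in the expected order.

```cpp
#include <cstddef>

// Hypothetical helper: offset of pixel (x, y) when rows are stored one
// after the other (row-major, as in a continuous cv::Mat).
std::size_t rowMajorOffset(int x, int y, int width) {
    return static_cast<std::size_t>(y) * width + x;
}

// Hypothetical helper: offset of the same pixel when columns are stored
// one after the other (column-major, as in MATLAB-style matrices).
std::size_t columnMajorOffset(int x, int y, int height) {
    return static_cast<std::size_t>(x) * height + y;
}
```

For example, in a 5-pixel-wide image the pixel at x = 2, y = 1 sits at offset 1 * 5 + 2 = 7 in row-major order.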

That's a question for VLFeat. I know nothing about how it works. Just based on that quote, I think you're ok. that is an "other specification" that images are row major.

Tetragramm ( 2017-01-10 09:11:36 -0600 )

BTW, shouldn't I pass the pointer through imgFloat.data?

lovaj ( 2017-01-10 10:26:28 -0600 )

imgFloat.data is the old way. It still works, but is not preferred. imgFloat.ptr<float>() is the typesafe way of accessing the start of the image.

If you are using ROIs and submatrices, the memory is non-continuous, and you access each row by using imgFloat.ptr<float>(rowNumber);

Tetragramm ( 2017-01-10 19:58:51 -0600 )
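A minimal sketch of what row-by-row access means for a non-continuous image, written against a raw buffer so that it does not depend on OpenCV. The helper packToFloat and its parameters are hypothetical. Here stepBytes plays the role of the distance in bytes between the starts of consecutive rows (what cv::Mat exposes as step), and it is assumed to be at least cols for 8-bit data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies an 8-bit, row-major image whose rows start
// stepBytes apart into a tightly packed float buffer, multiplying each
// value by scale (1.0f keeps the [0, 255] range).
std::vector<float> packToFloat(const unsigned char* data, int rows, int cols,
                               std::size_t stepBytes, float scale = 1.0f) {
    std::vector<float> out;
    out.reserve(static_cast<std::size_t>(rows) * cols);
    for (int r = 0; r < rows; ++r) {
        const unsigned char* row = data + r * stepBytes;  // start of row r
        for (int c = 0; c < cols; ++c) {
            out.push_back(row[c] * scale);  // padding past cols is skipped
        }
    }
    return out;
}
```

The packed result is the kind of contiguous float buffer that a float* based API expects. In OpenCV itself, calling clone() on a submatrix should give a continuous copy without writing this loop by hand.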


answered 2020-07-02 10:20:15 -0600

yudai
1

After three years, to anyone who visits this thread: do NOT use the solution in the recommended answer!

Normalizing the image intensity with 1.0/255.0 will cause VLFeat SIFT detection to fail for most of the keypoints, giving far fewer than OpenCV SIFT. [0,1] is obviously too small a range; it might cause some machine precision problem. Just use:

img.convertTo(toFloat,CV_32F);

Use the image intensity range [0,255.0] for feature detection. After falling into the 1.0/255.0 trap, I struggled for a long time to find this bug. Now my code can detect even more keypoints than OpenCV.

Hope this helps people who have the same problem.
