
see the dictionary content from Bag of Words

asked 2017-11-28 08:03:36 -0600

sama_z

updated 2018-01-11 07:50:39 -0600

Hi,

Is it possible to view the contents of a bag-of-words (BoW) dictionary as visual words? I found interest points and descriptors for a group of images and clustered them. Now I have a dictionary of visual words, which is just zeros and ones. Is there any way to see each visual word as a piece of an image?

Any help would be appreciated.


Comments

what exactly did you cluster there ? SIFT descriptors ? did you expect something like this ?

berak ( 2017-11-28 08:17:46 -0600 )

I used SIFT to find interest points and descriptors, and k-means to cluster them. Yes, exactly like that picture. Is it possible?

sama_z ( 2017-11-28 08:20:00 -0600 )

But I think the picture shows the state before clustering. I also want to see how they were clustered!

sama_z ( 2017-11-28 08:22:39 -0600 )

unfortunately, descriptors are not images, but a kind of histogram.

so maybe you could draw a curve, or a barchart (like in the link above) for the 128 values

(each row in your dictionary Mat corresponds to a SIFT descriptor)

this looks interesting, too:

taken from here

berak ( 2017-11-28 08:51:58 -0600 )

Thanks a lot @berak for the comment

I have an XML file of keypoints and descriptors for 93 images. The file looks like this:

<keypoints>
  3.3213031005859375e+02 8.6436431884765625e+01 1.9616262912750244e+00
  1.0643557739257813e+02 1.3890451751649380e-02 6357503 -1
....
<descriptor type_id="opencv-matrix">
  <rows>734</rows>
  <cols>128</cols>
  <dt>f</dt>
  <data>
    55. 18. 1. 0. 0. 1. 4. 36. 88. 14. 0. 0. 0. 14. 50. 54. 43. 0. 0. 0.
    0. 11. 110. 95. 90. 0. 0. 0. 0. 7. 25. 89. 62. 4. 0. 0. 0. 28. 55.
...

I could also have 93 pictures like this one: http://vgg.fiit.stuba.sk/wp-uploads/2... Each circle is an unclustered word at this point (correct?). But how could I create such a histogram from those numbers? It seems complex to me! Any ideas?

sama_z ( 2017-11-29 03:03:40 -0600 )
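One way to get a rough histogram view from those numbers, without any plotting library, is to print each value as a text bar. This is only a minimal sketch: `bar_chart` is a hypothetical helper, not an OpenCV function, and the row below is just the first 20 values from the XML excerpt above (a full SIFT row has 128 values).

```python
def bar_chart(descriptor, width=40):
    """Return one text line per descriptor element; bar length is proportional to the value."""
    peak = max(descriptor) or 1.0  # fall back to 1.0 so an all-zero row does not divide by zero
    lines = []
    for i, value in enumerate(descriptor):
        bar = "#" * int(round(width * value / peak))
        lines.append(f"{i:3d} {value:6.1f} {bar}")
    return lines

# First 20 values of the first descriptor row in the XML excerpt (not a complete row).
row = [55., 18., 1., 0., 0., 1., 4., 36., 88., 14.,
       0., 0., 0., 14., 50., 54., 43., 0., 0., 0.]
chart = bar_chart(row)
print("\n".join(chart))
```

The same function could be applied to a row of the dictionary, since each dictionary row has the same 128-value layout as a descriptor.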

the circles are a visualization of the keypoints (position, orientation and response, i guess), not of the descriptors at those places. iirc, there's a drawKeypoints() function for that already builtin. but again, you won't have any kp for your clusters (they're interpolated, right ?)

no real idea for the descriptors, though i personally like that 4x4 square with 8 orientation bins inside. i'm just afraid that it's some work to achieve it, and that it will get quite cluttered if you have many.

berak ( 2017-11-29 03:29:58 -0600 )
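Since the cluster centers have no keypoints of their own, one possible workaround (an assumption on my side, not something confirmed in this thread) is to assign every original descriptor to its nearest dictionary row and collect the keypoint indices per word. The patches around those keypoints could then be inspected as examples of that word. A minimal sketch in plain Python, with hypothetical helper names and toy 3-dimensional data instead of real 128-value SIFT rows:

```python
def nearest_word(descriptor, dictionary):
    """Index of the dictionary row closest to `descriptor` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(dictionary)), key=lambda k: sq_dist(descriptor, dictionary[k]))

def group_by_word(descriptors, dictionary):
    """Map each word index to the indices of the descriptors (and so keypoints) assigned to it."""
    groups = {k: [] for k in range(len(dictionary))}
    for i, d in enumerate(descriptors):
        groups[nearest_word(d, dictionary)].append(i)
    return groups

# Toy data with made-up values; real SIFT rows would have 128 values each.
dictionary = [[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]]
descriptors = [[1.0, 0.0, 2.0], [9.0, 11.0, 10.0], [0.0, 1.0, 0.0]]
groups = group_by_word(descriptors, dictionary)
print(groups)
```

Because descriptor row i belongs to keypoint i, the index lists in `groups` point straight back to keypoint positions in the original images.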

Thanks @berak for your time. Exactly, drawKeypoints() can draw a circle for each keypoint. Each circle is a visualization of a keypoint, and in BoW terms each circle is a word for the dictionary we are going to build later. And of course they are still unclustered at that stage. Am I right or totally wrong? :-|

I think if I could figure out how the data are stored in the descriptors, it might be possible and helpful to draw a histogram at this level (before clustering) for each image. Do you have any idea how the data are stored in a descriptor? We already know that each descriptor is a histogram over a 4*4 window with 8 directions. But how are those 128 elements laid out? Next I should find a way to see how the data were clustered.

sama_z ( 2017-11-29 05:44:23 -0600 )
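On the layout of the 128 elements: 4*4 cells times 8 orientation bins gives 128 values. A hedged sketch of how one row could be split back into that grid is below. It assumes the values are stored cell by cell (cell row, then cell column, then orientation bin), which as far as I know matches OpenCV's SIFT but should be checked against your version. `split_descriptor` is a hypothetical helper and the input row is a placeholder.

```python
def split_descriptor(descriptor, cells=4, bins=8):
    """Split a flat row into grid[r][c], each entry a list of `bins` orientation values.

    Assumes the flat index is (r * cells + c) * bins + orientation_bin.
    """
    if len(descriptor) != cells * cells * bins:
        raise ValueError("unexpected descriptor length")
    return [[descriptor[(r * cells + c) * bins:(r * cells + c + 1) * bins]
             for c in range(cells)]
            for r in range(cells)]

# Placeholder row 0..127 so the position of each slice is easy to follow.
row = list(range(128))
grid = split_descriptor(row)
print(grid[0][0])  # the 8 orientation bins of the top-left cell under the assumed layout
```

Each `grid[r][c]` could then be drawn as a small 8-bin histogram or as 8 arrows, which is the 4x4 picture mentioned earlier in the comments.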

1 answer


answered 2018-01-06 23:02:02 -0600

sama_z

updated 2018-01-08 03:53:29 -0600

To answer my own question: for each keypoint, 6 or 7 different parameters are stored. In my case there are 7 parameters:

KeyPoint (float x, float y, float _size, float _angle=-1, float _response=0, int _octave=0, int _class_id=-1)

x x-coordinate of the keypoint

y y-coordinate of the keypoint

_size keypoint diameter

_angle keypoint orientation

_response keypoint detector response on the keypoint (that is, strength of the keypoint)

_octave pyramid octave in which the keypoint has been detected

_class_id object id

For more info see: keypoints

The descriptor consists of one 128-dimensional feature vector per keypoint, so the matrix holds 128 * (number of keypoints) values, and every run of 128 values belongs to one keypoint. Someone else might have the same question. Hope this helps.
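As a small illustration of the above, here is a sketch that reads the flat keypoint numbers in groups of 7 and cuts a flat descriptor list into rows of 128. It assumes the keypoint values appear in the same order as the KeyPoint constructor arguments, as in my file. `KeyPointRecord`, `parse_keypoints` and `descriptor_rows` are hypothetical names, not part of OpenCV.

```python
from collections import namedtuple

# Field order follows the KeyPoint constructor quoted above.
KeyPointRecord = namedtuple("KeyPointRecord", "x y size angle response octave class_id")

def parse_keypoints(text):
    """Turn whitespace-separated keypoint numbers into records of 7 fields each."""
    values = text.split()
    if len(values) % 7 != 0:
        raise ValueError("expected a multiple of 7 values")
    records = []
    for i in range(0, len(values), 7):
        x, y, size, angle, response = map(float, values[i:i + 5])
        records.append(KeyPointRecord(x, y, size, angle, response,
                                      int(values[i + 5]), int(values[i + 6])))
    return records

def descriptor_rows(flat, cols=128):
    """Cut a flat list of descriptor values into rows of `cols`; row i belongs to keypoint i."""
    return [flat[i:i + cols] for i in range(0, len(flat), cols)]

# The first keypoint from the XML excerpt in the comments.
kp_text = """3.3213031005859375e+02 8.6436431884765625e+01 1.9616262912750244e+00
  1.0643557739257813e+02 1.3890451751649380e-02 6357503 -1"""
kps = parse_keypoints(kp_text)
print(kps[0])
```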

