Kmeans and Bag of Generic Words

asked 2017-04-13 05:44:50 -0600


updated 2017-04-13 06:08:22 -0600

Hi guys,

I'm trying to implement a bag-of-words model with my own descriptor. I already did the k-means part and it works perfectly. My question is: which OpenCV function should I use to build the histograms and complete the training part of the code?

Consider that I used this code to compute the vocabulary:

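#include <opencv2/core.hpp>
using namespace cv;

// Cluster the descriptors (one per row, CV_32F) into K visual words.
Mat ComputeDictionary(const Mat& features, const int K) {
    int retries = 3;                  // number of k-means attempts
    int flags = KMEANS_PP_CENTERS;    // k-means++ center initialization
    Mat bestlabels, centers;
    // Stop after 100 iterations or once the centers move by less than 0.001
    kmeans(features, K, bestlabels,
           TermCriteria(TermCriteria::EPS + TermCriteria::COUNT, 100, 0.001),
           retries, flags, centers);
    return centers;                   // K rows, one cluster center per row
}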

Consider that my "features" matrix is a matrix with "number of samples * number of descriptors per sample " rows and "descriptor dimension" columns.

Now I don't know how to proceed (I know the theory, but I don't know how to implement it in C++). Could you please help me?

answered 2017-04-13 06:55:28 -0600

berak

updated 2017-04-13 06:58:17 -0600

Hmm, OpenCV uses feature matching to compute the histograms.

For your custom descriptor, I guess you have to implement something similar on your own.

As an alternative to histograms as the final features, you could store the distances to the centers, or even the residuals.
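One possible reading of the residual idea, as a rough sketch only: for every descriptor, find its nearest center and accumulate the difference (descriptor minus center) in that center's slot, which is similar in spirit to VLAD encoding. The sketch uses plain std::vector instead of cv::Mat so that it compiles without OpenCV; the names Descriptor, squaredDistance and residualFeature are made up for illustration and are not OpenCV functions.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical alias: one descriptor is a plain vector of floats.
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Accumulates (descriptor - nearest center) per center.
// The result has centers.size() * dim entries, laid out center by center.
std::vector<float> residualFeature(const std::vector<Descriptor>& descriptors,
                                   const std::vector<Descriptor>& centers) {
    assert(!centers.empty());
    const std::size_t dim = centers[0].size();
    std::vector<float> feature(centers.size() * dim, 0.0f);
    for (const Descriptor& d : descriptors) {
        // Find the nearest center for this descriptor.
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t k = 0; k < centers.size(); ++k) {
            const float dist = squaredDistance(d, centers[k]);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        // Add the residual to the slot that belongs to that center.
        for (std::size_t i = 0; i < dim; ++i) {
            feature[best * dim + i] += d[i] - centers[best][i];
        }
    }
    return feature;
}
```

Compared with a histogram of K bins, this feature is K times the descriptor dimension long, so it keeps more information per sample at the cost of a larger training matrix.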


Comments

So, basically, I have to do this:

1. Build my own function that computes the distance between each descriptor sample and the words of the vocabulary, and finds the closest word.
2. Compute the histogram.
3. Create the new feature matrix (samples x histogram dimension).
4. Train an SVM with a chi-squared kernel.

Then the same for a test sample.

Right?

alexMm1 (2017-04-13 07:33:16 -0600)

Good plan ;)

1. There's norm(a, b) for the distance.
2. Make a 1-D histogram, Mat hist(1, nclusters, CV_32F, Scalar(0)), then just increment the bin: hist.at<float>(0, closest_cluster_id) += 1; it's also a good idea to normalize() it at the end.
3. Start with an empty Mat and push_back() those histograms one by one (each as a single row).
4. I'd still prefer LINEAR, but that's up to experimenting on your side!

For the test sample, you would just need steps 1 and 2.

berak (2017-04-13 08:28:47 -0600)
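The first three steps of the plan above (nearest word, histogram, feature matrix) could be sketched roughly as follows. This is a minimal illustration in plain C++ with std::vector rather than cv::Mat, so it compiles without OpenCV; l2Distance stands in for norm(a, b), and the names Descriptor, closestWord, bowHistogram and buildFeatureMatrix are hypothetical helpers, not OpenCV functions. The histogram is divided by the descriptor count (L1 normalization), which is an assumption here; normalize() in OpenCV lets you choose the norm type.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical alias: one descriptor is a plain vector of floats.
using Descriptor = std::vector<float>;

// Euclidean distance; with cv::Mat rows, norm(a, b) would play this role.
float l2Distance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Step 1: index of the vocabulary word closest to the descriptor.
std::size_t closestWord(const Descriptor& d,
                        const std::vector<Descriptor>& vocabulary) {
    assert(!vocabulary.empty());
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t k = 0; k < vocabulary.size(); ++k) {
        const float dist = l2Distance(d, vocabulary[k]);
        if (dist < bestDist) {
            bestDist = dist;
            best = k;
        }
    }
    return best;
}

// Step 2: count how often each word is the closest one, then normalize
// so the bins sum to 1 (assumed L1 normalization).
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& vocabulary) {
    std::vector<float> hist(vocabulary.size(), 0.0f);
    for (const Descriptor& d : descriptors) {
        hist[closestWord(d, vocabulary)] += 1.0f;
    }
    if (!descriptors.empty()) {
        const float count = static_cast<float>(descriptors.size());
        for (float& bin : hist) {
            bin /= count;
        }
    }
    return hist;
}

// Step 3: one histogram row per sample (samples x vocabulary size).
std::vector<std::vector<float>> buildFeatureMatrix(
        const std::vector<std::vector<Descriptor>>& samples,
        const std::vector<Descriptor>& vocabulary) {
    std::vector<std::vector<float>> rows;
    rows.reserve(samples.size());
    for (const std::vector<Descriptor>& descriptors : samples) {
        rows.push_back(bowHistogram(descriptors, vocabulary));
    }
    return rows;
}
```

For a test sample, calling bowHistogram on its descriptors with the same vocabulary gives the row to pass to the trained classifier.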

Great, thanks!

alexMm1 (2017-04-14 04:38:14 -0600)
