
BOW Kmeans trainer problem

asked 2014-02-06 08:49:58 -0600

salvo

Hello,

I'm using BOWKMeansTrainer to categorize images in my project. Everything works fine when the BOW trainer runs only once. But while trying to find the optimal number of descriptors per image class, I discovered some strange behaviour. The example below is a simple program that computes descriptors from images (stored in the directory "test") and then computes a vocabulary using BOWKMeansTrainer. I used two BOW trainers, "bowtrainer" and "bowtrainer1", with the same input descriptors and parameters. But when I looked at the saved results, I found two different vocabularies.

Can anyone help me? What am I doing wrong? I'm using OpenCV 2.4.8 (also tested on 2.4.5).

Here is an example:

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/nonfree/nonfree.hpp"
#include "opencv2/video/background_segm.hpp"
#include "opencv2/video/tracking.hpp"
#include "opencv2/ml/ml.hpp"
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <dirent.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>


using namespace cv;
using namespace std;

int main()
{
int m_debug = 1;
initModule_nonfree();

Ptr<FeatureDetector> detector;
Ptr<FeatureDetector> control_detector;
Ptr<DescriptorExtractor> extractor;
Ptr<DescriptorMatcher> matcher;
Ptr<BOWImgDescriptorExtractor> bow_ext;


detector = FeatureDetector::create("SURF");
extractor = DescriptorExtractor::create("SURF");
matcher = DescriptorMatcher::create("BruteForce");


string path_to_images = "test";
string filepath;
Mat desc;
Mat voc,img,voc1;
vector<KeyPoint> kp;
DIR *dp;
struct dirent *dirp;
struct stat filestat;
FileStorage fs;

int dictionarySize = 1000;
TermCriteria tc(CV_TERMCRIT_ITER, 10, 0.001);
int retries = 1;
int flags = KMEANS_PP_CENTERS;

BOWKMeansTrainer bowtrainer(1000,tc, retries, flags);
BOWKMeansTrainer bowtrainer1(1000,tc, retries, flags);

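dp=opendir(path_to_images.c_str());
if(!dp){
    // readdir() on a null handle would crash, so stop if the directory cannot be opened
    cerr<<"Cannot open directory "<<path_to_images<<endl;
    return 1;
}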

while((dirp = readdir(dp))){
    filepath = path_to_images+"/"+dirp->d_name;
    if (stat( filepath.c_str(), &filestat )) continue;
    if (S_ISDIR( filestat.st_mode ))         continue;
    img=imread(filepath);

    if(img.empty()){
        continue;
    }
    if(m_debug){
        cout<<filepath<<endl;
    }
    detector->detect(img, kp);
    extractor->compute(img, kp, desc);
    bowtrainer.add(desc);
    bowtrainer1.add(desc);
}
closedir(dp);


cout << "Calculating vocabulary for "<<bowtrainer.descripotorsCount()<<" descriptors "<<endl;

voc = bowtrainer.cluster();
cout<<"Saving vocabulary ..."<<endl;
fs.open("voc.yml", FileStorage::WRITE);
fs<<"voc"<<voc;
fs.release();
cout << endl;


cout << "Calculating vocabulary for "<<bowtrainer1.descripotorsCount()<<" descriptors "<<endl;

voc1 = bowtrainer1.cluster();
cout<<"Saving vocabulary ..."<<endl;
fs.open("voc1.yml", FileStorage::WRITE);
fs<<"voc1"<<voc1;
fs.release();
cout << endl;

return 0;

}


1 answer


answered 2014-02-06 09:16:48 -0600

Guanta

IMHO this is correct behavior: the cluster() function calls kmeans, and kmeans is initialized randomly, so the output clusters will differ.


Comments

Thanks for your answer, I understand that. But if you run the program once, move the vocabularies to other files (e.g. mv voc.yml voc_copy.yml, mv voc1.yml voc1_copy.yml), and run the program again, voc_copy.yml from the first run is identical to voc.yml from the second run, and voc1_copy.yml from the first run is identical to voc1.yml. Can I conclude from this that kmeans starts from the same random initial clusters (flag KMEANS_PP_CENTERS) every time I run the program, and that the next clustering in the same run then follows predetermined values to initialize its cluster centres? Thanks again.

salvo (2014-02-07 01:38:02 -0600)

Yes, the seed of the random generator is apparently the same at every program start, so each program execution gives the same output. Within a single run, however, successive k-means calls keep drawing from the same generator, so each one is initialized differently.

Guanta (2014-02-07 05:06:32 -0600)

Thanks again.

salvo (2014-02-11 00:57:15 -0600)
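To make the point above concrete, here is a minimal toy model of a shared, fixed-seed generator, written with the C++ standard library only so it does not depend on OpenCV. The names sharedRng and pickInitialCentres are hypothetical and exist only for this sketch; it is not OpenCV's implementation. In OpenCV 2.4 the shared generator is reachable through cv::theRNG(), and assigning the same value to its public state member before each cluster() call should be the analogous reset, though that is an assumption about how to apply it to the program in the question and is not tested here.

```cpp
#include <random>
#include <vector>

// Toy stand-in for a library-wide random generator.
// It is default-constructed, so it starts from the same fixed seed
// every time the program is launched.
static std::mt19937& sharedRng() {
    static std::mt19937 rng;
    return rng;
}

// Hypothetical helper: picks k "initial centre" indices out of n samples,
// consuming numbers from the shared generator the way a randomized
// k-means initialization would.
std::vector<int> pickInitialCentres(int n, int k) {
    std::vector<int> centres;
    for (int i = 0; i < k; ++i) {
        centres.push_back(static_cast<int>(sharedRng()() % n));
    }
    return centres;
}

// Same as above, but resets the shared generator first, so every call
// made with the same seed starts from the same state.
// (Assumed OpenCV 2.4 analogue: cv::theRNG().state = seed; before cluster().)
std::vector<int> pickInitialCentresSeeded(int n, int k, unsigned seed) {
    sharedRng().seed(seed);
    return pickInitialCentres(n, k);
}
```

Calling pickInitialCentres twice in one run gives two different index lists, because the second call continues where the first one left off, while the pair of lists is the same on every program start. Calling pickInitialCentresSeeded twice with the same seed gives identical lists, which is the behaviour the two trainers in the question were expected to show.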
