SVM predict on OpenCV: how can I extract the same number of features

asked 2017-11-01 05:04:05 -0500

lezan

updated 2017-11-12 10:52:25 -0500

I am playing with OpenCV and SVM to build a classifier that predicts facial expressions. I have no problem classifying the test dataset, but when I try to predict a new image, I get this:

OpenCV Error: Assertion failed (samples.cols == var_count && samples.type() == CV_32F) in cv::ml::SVMImpl::predict

The error is pretty clear: my sample has a different number of columns (the type is the same). I do not know how to fix that, because my matrix has dimensions 1 x number_of_features, but number_of_features is not the same as in the trained and tested samples. How can I extract the same number of features from another image? Am I missing something?

To train classifier I did:

  • Detect the face and save the ROI;
  • SIFT to extract features;
  • kmeans to cluster them;
  • bag of words to get the same number of features for each image;
  • PCA to reduce dimensionality;
  • train on the training dataset;
  • predict on the test dataset.

On the new image I did the same thing.
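The "bag of words" step in this list is what makes the feature length fixed: every image, whatever its keypoint count, is mapped to one count per dictionary center. A minimal sketch of that quantization, using plain vectors in place of cv::Mat rows (bowHistogram is a hypothetical helper, not an OpenCV function):

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Quantize a variable number of descriptors against a fixed dictionary:
// each descriptor votes for its nearest center (squared L2 distance),
// so the histogram always has centers.size() bins, no matter how many
// keypoints the image produced.
std::vector<float> bowHistogram(
    const std::vector<std::vector<float>>& descriptors,
    const std::vector<std::vector<float>>& centers)
{
    std::vector<float> hist(centers.size(), 0.0f);
    for (const auto& d : descriptors)
    {
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t c = 0; c < centers.size(); ++c)
        {
            float dist = 0.0f;
            for (std::size_t i = 0; i < d.size(); ++i)
            {
                const float diff = d[i] - centers[c][i];
                dist += diff * diff;
            }
            if (dist < bestDist)
            {
                bestDist = dist;
                best = c;
            }
        }
        hist[best] += 1.0f; // one vote per descriptor
    }
    return hist;
}
```

The same dictionary (the kmeans centers saved at training time) must be reused on unseen images; only then does every image produce a vector of the same length.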

I tried resizing the new image to the same size, but nothing changed: same error (and still a different number of columns, i.e. features). The vectors are of the same type (CV_32F).

After successfully training my classifier, I save the SVM model this way:

svmClassifier->save(baseDatabasePath);

Then I load it when I need to do real-time prediction:

cv::Ptr<cv::ml::SVM> svmClassifier;
svmClassifier = cv::ml::StatModel::load<ml::SVM>(path);

Then I loop:

while (true) 
{
    getOneImage();
    cv::Mat feature = extractFeaturesFromSingleImage();
    float labelPredicted = svmClassifier->predict(feature);
    cout << "Label predicted is: " << labelPredicted << endl;
}

But predict throws the error above. The feature dimension is 1x66, for example, while, as you can see below, the model expects 140 features (var_count):

<?xml version="1.0"?>
<opencv_storage>
<opencv_ml_svm>
  <format>3</format>
  <svmType>C_SVC</svmType>
  <kernel>
    <type>RBF</type>
    <gamma>5.0625000000000009e-01</gamma></kernel>
  <C>1.2500000000000000e+01</C>
  <term_criteria><epsilon>1.1920928955078125e-07</epsilon>
    <iterations>1000</iterations></term_criteria>
  <var_count>140</var_count>
  <class_count>7</class_count>
  <class_labels type_id="opencv-matrix">
    <rows>7</rows>
    <cols>1</cols>
    <dt>i</dt>
    <data>
      0 1 2 3 4 5 6</data></class_labels>
  <sv_total>172</sv_total>

<support_vectors>

I do not know how to reach 140 features when SIFT, FAST or SURF only give me around 60 keypoints. What am I missing? How can I bring my real-time sample to the same dimension as the train and test datasets?
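The assertion in the error message is comparing exactly these two numbers: the sample's column count against the var_count recorded at training time. A plain C++ mimic of that precondition (checkSampleWidth is a hypothetical helper, not OpenCV code):

```cpp
#include <stdexcept>
#include <string>

// Mimics the precondition behind the OpenCV error message:
// predict() requires samples.cols == var_count (and CV_32F data).
void checkSampleWidth(int sampleCols, int varCount)
{
    if (sampleCols != varCount)
        throw std::invalid_argument(
            "sample has " + std::to_string(sampleCols) +
            " features, model expects " + std::to_string(varCount));
}
```

With the numbers above, a 1x66 sample against a model trained with var_count = 140 fails this check before any classification happens.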

Some code.

The preprocessing (I extracted the relevant parts; there is more code wrapped around it):

cv::Mat image;
cv::Mat gray;
cv::Mat output;

image = cv::imread(imagePath[imageId], CV_LOAD_IMAGE_COLOR);
cv::cvtColor(image, gray, CV_BGR2GRAY);

double clipLimit = 4.0;
Size tileGridSize(8, 8);
Ptr<CLAHE> clahe = cv::createCLAHE(clipLimit, tileGridSize);
clahe->apply(gray, output);

cv::CascadeClassifier faceCascade;
faceCascade.load(baseDatabasePath + "/" + cascadeDataName2);

std::vector<cv::Rect> faces;
faceCascade.detectMultiScale(output, faces, 1.2, 3, 0, cv::Size(50, 50));

if (faces.empty())
{
    // no face detected: bail out instead of indexing faces[0] below
    return;
}

// keep the largest detected face
int bestIndex = 0;
int maxWidth = 0;
for (unsigned int i = 0; i < faces.size(); ++i) 
{
    if (faces[i].width > maxWidth) 
    {
        bestIndex = i;
        maxWidth = faces[i].width;
    }
}

faceROI = output(faces[bestIndex]);
cv::resize(faceROI, faceROI, cv::Size(widthImageOutputResize, heightImageOutputResize));
imwrite(outputPath + "/" + currentFilename, faceROI);

Extract features with SIFT and push on ... (more)


Comments

your analysis is correct: you need exactly the same number of features (cols) for training and testing.

it's nice that you split up your code into pieces and try to explain the steps, but some parts are missing. could you put the whole code in a gist or the like?

berak ( 2017-11-01 05:25:59 -0500 )

@berak Hello, thanks for your answer. What is actually missing? Are you referring to training and testing?

lezan ( 2017-11-01 05:57:39 -0500 )

yes, the preprocessing for your testing seems to be missing.

  • pca: you're overwriting the feature Mat again and again? looks wrong. (also, i don't see the need for PCA at all: the size of your bow vector is the number of bins in kmeans, and that should be the size of your SVM features.)
  • bow dictionary: you're using the labels from kmeans while training, but what do you do when testing? (there's no kmeans there, i suppose?)
  • OpenCV's BoW code (you missed that, right?) actually matches features to the bow dictionary vectors to build the histograms.
berak ( 2017-11-02 03:44:10 -0500 )

Hello @berak,

  1. I will add the preprocessing too.
  2. I save each feature into a file, on each cycle. My bad, I missed that. I thought that 1000 bins was too many, since a great number of them have a value of 0.
  3. During feature extraction I save each label into a file, then I use them for training. I will add that code too, since it makes no sense to paste it all here.
  4. I think it is here: feature.at<float>(0, bin) += 1;

Anyway I am going to add some details.

lezan ( 2017-11-02 04:39:57 -0500 )
  1. good, let's see.
  2. then use fewer bins, or try to get more keypoints per image. indeed, with ~60 keypoints and 1000 bins you have a lot of zeros. however, remember that zeros can be important, too ;)
  3. that's ok for deriving your algorithm, but later, when testing on (before unseen) images "in the wild", you won't be able to do so, right?
  4. yes, that's the same. i'm more arguing about "using the labels from kmeans", because you won't have any later.
berak ( 2017-11-02 04:46:27 -0500 )
  1. Added, I hope it is okay.
  2. I tried different numbers of bins, and with 1000 I get the best score. How can I extract more keypoints?
  3. Right. I am able to use labels only for training & testing, because there I know the labels, but not with images "in the wild". Is this approach wrong?
  4. I think you got the point: my problem is with unseen images, and I do not yet know where I failed or how to fix it.
lezan ( 2017-11-02 04:55:24 -0500 )
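On the "How can I extract more keypoints?" question: one common alternative to relying on what the detector returns is dense sampling, i.e. describing patches on a regular grid instead of (or in addition to) detected keypoints. A sketch of the grid part only (denseGrid is a hypothetical helper; with OpenCV you would then build cv::KeyPoint objects from these centers and pass them to the descriptor's compute()):

```cpp
#include <utility>
#include <vector>

// Keypoint centers on a regular grid with the given step: the number
// of samples depends on the image size, not on the image content, so
// even low-texture faces yield plenty of descriptors.
std::vector<std::pair<int, int>> denseGrid(int width, int height, int step)
{
    std::vector<std::pair<int, int>> pts;
    for (int y = step / 2; y < height; y += step)
        for (int x = step / 2; x < width; x += step)
            pts.emplace_back(x, y);
    return pts;
}
```

For a 256x256 face ROI, a step of 8 already gives 1024 keypoints, far more than the ~60 a detector returns here.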

hmm, for testing you also have to compute the feature "bins", a histogram from your bow dictionary, and that should be the input to svm->predict(): again a feature vector with nbins elements.

i don't see this anywhere in your code. can it be you're just not doing it?

(that would explain why the sizes don't match)

btw: https://gilscvblog.com/2013/08/23/bag...

berak ( 2017-11-02 05:00:25 -0500 )

If by "testing" you mean unseen images, I did the same thing (not posted here; I am a bit shy to show what I wrote there, eheh, it could be terrible). Anyway, I think we are getting to it: from the unseen sample I extract features and get around 66 keypoints x 128 (with SIFT), but now what? How can I cluster them into the same number of bins (for example 1000)? Moreover, do I need to use the centers computed before? I am not sure I am able to put my problem into words.

lezan ( 2017-11-02 05:18:40 -0500 )
  • the centers from kmeans are your BoW dictionary.
  • for both SVM training and testing, you don't use the SIFT features directly, but a "bow histogram" vector, calculated like this:
  • for each keypoint/feature, walk the dictionary and compare the feature to each center (e.g. L2 distance); the one with the shortest distance gets its respective bin increased.
  • so, for each image, you end up with a 1 x nBins sized feature vector (with at most nKeypoints non-null entries). since you will have a different keypoint count between images, you should also normalize each histogram.
  • i don't think you can use the labels from kmeans for this, as you need to use exactly the same algorithm for train & test. calculating all those distances may seem excessive, but i don't see any way to avoid it.
berak ( 2017-11-02 05:53:34 -0500 )
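The normalization step berak recommends could look like this (a sketch; l2Normalize is a hypothetical helper, and the same step would be applied identically to training, test, and unseen samples):

```cpp
#include <cmath>
#include <vector>

// L2-normalize a BoW histogram so that images with different keypoint
// counts produce comparable feature vectors.
std::vector<float> l2Normalize(std::vector<float> hist)
{
    float norm = 0.0f;
    for (float v : hist)
        norm += v * v;
    norm = std::sqrt(norm);
    if (norm > 0.0f)            // leave an all-zero histogram untouched
        for (float& v : hist)
            v /= norm;
    return hist;
}
```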

I read it 10 times but I am a bit confused. In particular, I am referring to the third point.

  1. Okay, I get it.
  2. Got it too.
  3. That is a solution for unseen images, right? Not for testing and training.
  4. I saw that; if you look, the vectors are already normalized.
  5. Is there a better way of doing this? I mean: throwing away all the code and redoing it.
lezan ( 2017-11-02 06:18:07 -0500 )