Training your own SVM in OpenCV

asked 2018-08-01 03:21:43 -0600

Macmik

updated 2018-08-01 04:30:50 -0600

Hi everyone. I have a project based on training my own SVM to detect a particular silhouette. I wrote some code, but it doesn't work, so I want to check that I did everything correctly. I am new to OpenCV, so sorry if I write something silly. Here is how I imagine the process:

1) First, I collect training material. I create two folders: one with positive samples (the silhouette to detect, about 200 samples) and one with negative samples (everything else, about 1300 samples). All pictures are 64x128.

2) The second step is to create a feature vector for each sample. Here I am not sure which information from the picture is important. In my case, I compute HOG on each picture and use the resulting gradient descriptor as the feature vector, but maybe it would be better to take, for example, the H channel of the HSV picture? Here is some code:

std::vector<float> featureVector;
hog.compute(image, featureVector);
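
The 3780 used as the row width in the next step is the length of a HOG descriptor for a 64x128 window under the commonly used layout of 16x16 blocks, an 8-pixel block stride, 8x8 cells and 9 orientation bins. A minimal sketch of that arithmetic, assuming those defaults; the struct and function names are hypothetical and not part of OpenCV:

```cpp
#include <cstddef>

// Hypothetical helper (not an OpenCV type): the HOG layout parameters.
struct HogLayout {
    int winW = 64, winH = 128;  // detection window, matches the 64x128 samples
    int blockSize = 16;         // assumed default: 16x16 pixel blocks
    int blockStride = 8;        // assumed default: blocks advance by 8 pixels
    int cellSize = 8;           // assumed default: 8x8 pixel cells
    int bins = 9;               // assumed default: 9 orientation bins per cell
};

// Number of floats in one descriptor:
// (blocks per window) * (cells per block) * (bins per cell).
inline std::size_t hogDescriptorLength(const HogLayout& p) {
    const int blocksX = (p.winW - p.blockSize) / p.blockStride + 1;
    const int blocksY = (p.winH - p.blockSize) / p.blockStride + 1;
    const int cellsPerSide = p.blockSize / p.cellSize;
    const int cellsPerBlock = cellsPerSide * cellsPerSide;
    return static_cast<std::size_t>(blocksX) * blocksY * cellsPerBlock * p.bins;
}
```

With the defaults above this is 7 * 15 blocks, 4 cells per block and 9 bins per cell, which gives 3780. In real code, asking the descriptor itself via hog.getDescriptorSize() is likely safer than hard-coding the number.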

3) I create a Mat to hold all of these vectors, with 1 row = 1 feature vector. So, for example, with 1500 samples I get a matrix with 1500 rows. This is how I do it:

trainingData = cv::Mat::ones(sizePositive + sizeNegative, 3780, CV_32FC1);
labels = cv::Mat::zeros(sizePositive + sizeNegative, 1, CV_32S);
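for (int i = 0; i < sizePositive; i++)
{
    // Compute the descriptor for this positive sample. This line was missing here;
    // filenamesPositive is an assumed name, mirroring filenamesNegative below.
    std::vector<float> features = calculateFeatures(filenamesPositive[i]);

    for (int j = 0; j < 3780; j++)
    {
        trainingData.at<float>(i, j) = features[j];
    }
    labels.at<int>(i, 0) = 1;
}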

for (int i = 0; i < sizeNegative; i++)
{
    std::vector<float> features = calculateFeatures(filenamesNegative[i]);

    for (int j = 0; j < 3780; j++)
    {
        trainingData.at<float>(i+sizePositive, j) = features[j];
    }
    labels.at<int>(i+sizePositive, 0) = -1;
}

In this step, I also create the labels matrix and put 1 for a positive sample and -1 for a negative sample.

4) After this, am I ready to train? I have trainingData with my feature vectors and labels that define which samples are positive and which are negative. (I checked the XML files with my labels and trainingData, and they look fine.) Here is how I train my SVM:
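
The row-per-sample layout from step 3 can be sketched without OpenCV as a flat row-major buffer plus one label per row. This is only an illustration under assumptions: TrainingSet and appendSample are hypothetical names, and the length check is an addition that the code above does not perform.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container (not an OpenCV type): samples stored row-major,
// one row per image, plus one label per row.
struct TrainingSet {
    std::size_t cols = 0;      // feature length, fixed by the first sample
    std::vector<float> data;   // rows() * cols values
    std::vector<int> labels;   // +1 = positive, -1 = negative

    std::size_t rows() const { return labels.size(); }
};

// Append one feature vector as a new row. Vectors of the wrong length are
// rejected so a bad sample cannot silently misalign the matrix.
inline void appendSample(TrainingSet& set, const std::vector<float>& features, int label) {
    if (features.empty()) {
        throw std::invalid_argument("empty feature vector");
    }
    if (set.cols == 0) {
        set.cols = features.size();
    }
    if (features.size() != set.cols) {
        throw std::invalid_argument("feature length mismatch");
    }
    set.data.insert(set.data.end(), features.begin(), features.end());
    set.labels.push_back(label);
}
```

A buffer built this way has the same shape as the trainingData and labels matrices above: rows() samples, cols floats per sample, and an integer label for each row.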

cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
// set the parameters
svm->setType(cv::ml::SVM::C_SVC);
svm->setKernel(cv::ml::SVM::RBF);

// create training data object
cv::Ptr<cv::ml::TrainData> tData = cv::ml::TrainData::create(trainingData, cv::ml::SampleTypes::ROW_SAMPLE, labels);

// train svm to optimal parameters using opencv autotraining
svm->trainAuto(tData);

What do you think? Is this the right approach? Am I doing something wrong? It seems that this SVM doesn't learn anything :/ I will be very grateful for your help and advice.

P.S. Here is my database with negative and positive samples. At first I wanted to train my SVM just to recognize people (not a particular silhouette): https://drive.google.com/drive/folder...
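
One way to check whether the model really "learns nothing" is to count outcomes per class instead of looking at accuracy alone: with about 200 positives and 1300 negatives, a model that answers -1 for every sample is still right on 1300 of 1500. A minimal sketch in plain C++, assuming the true labels and the predictions have already been collected into two vectors; all names here are hypothetical:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical summary of binary predictions with labels +1 / -1.
struct Confusion {
    std::size_t tp = 0, fp = 0, tn = 0, fn = 0;

    // Fraction of all samples classified correctly (0 if there are none).
    double accuracy() const {
        const std::size_t total = tp + fp + tn + fn;
        return total == 0 ? 0.0 : static_cast<double>(tp + tn) / total;
    }
    // Fraction of true positives that were found (0 if there are none).
    double recall() const {
        const std::size_t positives = tp + fn;
        return positives == 0 ? 0.0 : static_cast<double>(tp) / positives;
    }
};

// Count true/false positives/negatives; any label > 0 is treated as positive.
inline Confusion countOutcomes(const std::vector<int>& truth, const std::vector<int>& predicted) {
    if (truth.size() != predicted.size()) {
        throw std::invalid_argument("label and prediction counts differ");
    }
    Confusion c;
    for (std::size_t i = 0; i < truth.size(); ++i) {
        const bool isPos = truth[i] > 0;
        const bool saysPos = predicted[i] > 0;
        if (isPos && saysPos) ++c.tp;
        else if (!isPos && saysPos) ++c.fp;
        else if (!isPos && !saysPos) ++c.tn;
        else ++c.fn;
    }
    return c;
}
```

If recall stays at zero while accuracy sits near the share of negative samples, the model is probably predicting a single class, which points at the features, labels or parameters rather than at the evaluation.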


Comments

you need a LINEAR SVM kernel (so it can be compressed into a single support vector for the test later)

also note that opencv's trainHOG.cpp (in the samples) uses an EPS_SVR regression for this purpose, not a classification.

and please: if your labels Mat is integer, you must access it as labels.at<int>(i), not as float.

berak ( 2018-08-01 03:35:40 -0600 )
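
The point about the linear kernel can be sketched as follows. With a linear kernel the decision function reduces to a dot product with one weight vector plus an offset, so the weights and the offset fit into a single vector of length descriptor size + 1, which can then be handed to a HOG detector. A minimal sketch in plain C++, assuming the compressed support vector and the offset rho have already been read from the trained model; the function names are hypothetical, and appending -rho as the last element follows what the trainHOG sample appears to do:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Build a detector vector: the linear weights followed by the bias term -rho.
inline std::vector<float> makeDetector(const std::vector<float>& weights, double rho) {
    std::vector<float> detector(weights);
    detector.push_back(static_cast<float>(-rho));
    return detector;
}

// Score one window descriptor: dot(weights, descriptor) - rho.
// How the sign maps to "positive" depends on how the model was trained.
inline float windowScore(const std::vector<float>& detector, const std::vector<float>& descriptor) {
    if (detector.size() != descriptor.size() + 1) {
        throw std::invalid_argument("detector must be one element longer than the descriptor");
    }
    float score = detector.back();
    for (std::size_t i = 0; i < descriptor.size(); ++i) {
        score += detector[i] * descriptor[i];
    }
    return score;
}
```

An RBF model cannot be folded into one vector like this, because its decision function depends on the distance to every support vector separately.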

Hi berak, thanks for your answer. 1) I tried the linear kernel, but the result was much the same: the SVM doesn't recognize anything. 2) Can you say more about trainHOG and how it applies to my SVM? 3) Yes, you are right. I used int in my code, but I forgot to change it in the old version that I posted here.

Macmik ( 2018-08-01 03:50:59 -0600 )
berak ( 2018-08-01 03:54:53 -0600 )

If you share your positive images, I will try to help.

sturkmen ( 2018-08-01 04:05:53 -0600 )

I have seen this code before, but to me it looks very similar to what I'm doing. Sorry, OpenCV isn't my strong suit and I still don't see what I am doing wrong :/

Macmik ( 2018-08-01 04:08:01 -0600 )

@sturkmen I added a link to my samples.

Macmik ( 2018-08-01 04:32:15 -0600 )

See my PR about updating trainHOG.cpp. PS: your data was prepared from the INRIA dataset, but it is really not good for successful training.

sturkmen ( 2018-08-01 04:49:05 -0600 )

OK, I will look at it. Yes, it is the INRIA dataset. Why is it not good? How should I prepare a better database?

Macmik ( 2018-08-01 05:00:40 -0600 )