SVM predict error on OpenCV4Android

asked 2018-09-26 21:06:22 -0500

caiocanalli

Hi guys,

I performed the training of an SVM using the code below. I used C# because I'm familiar with the language. I tried doing it in Java directly on Android, but ran into problems, described in the question linked below, mainly due to the absence of the BOWImgDescriptorExtractor class:

C# Training

public class Training
{
    KAZE extractor;
    BFMatcher bFMatcher;
    BOWKMeansTrainer bOWKMeansTrainer;
    BOWImgDescriptorExtractor bOWImgDescriptorExtractor;

    Mat descriptorsExtractor;
    Mat descriptorsBOWImgDescriptorExtractor;

    int dictionarySize = 32;

    Dictionary<int, int> images;
    List<int> imagesType = new List<int> { 0, 1 };

    public Training()
    {
        extractor = new KAZE(true, true);
        bFMatcher = new BFMatcher(DistanceType.L2);

        bOWKMeansTrainer = new BOWKMeansTrainer(
            dictionarySize, new MCvTermCriteria(10, 0.001),
            1, KMeansInitType.PPCenters);

        bOWImgDescriptorExtractor =
            new BOWImgDescriptorExtractor(extractor, bFMatcher);

        descriptorsExtractor = new Mat();
        descriptorsBOWImgDescriptorExtractor =
            new Mat(0, dictionarySize, DepthType.Cv32F, 1);

        images = new Dictionary<int, int>();
        images.Add(0, 15);
        images.Add(1, 17);
    }

    public void Train()
    {
        string path = System.Reflection.Assembly.GetExecutingAssembly().Location;
        path = Directory.GetParent(Directory.GetParent(path).ToString()).ToString();

        // Step 1: extract raw KAZE descriptors from every training image

        for (var i = 0; i < imagesType.Count; i++)
        {
            var type = imagesType[i];

            for (var j = 1; j <= images[type]; j++)
            {
                var file = $@"{path}\Train\{type} ({j}).jpg";
                var image = new Image<Bgr, Byte>(file);

                MKeyPoint[] keyPoints = extractor.Detect(image);

                Mat descriptors = new Mat();
                extractor.Compute(image, new VectorOfKeyPoint(keyPoints), descriptors);

                descriptorsExtractor.PushBack(descriptors);
            }
        }

        bOWKMeansTrainer.Add(descriptorsExtractor);

        // Step 2: cluster the pooled descriptors into a 32-word BOW vocabulary

        int count = bOWKMeansTrainer.DescriptorCount;
        Console.WriteLine($"Clustering {count} descriptors");

        Mat dictionary = new Mat();
        bOWKMeansTrainer.Cluster(dictionary);

        bOWImgDescriptorExtractor.SetVocabulary(dictionary);

        // Step 3: compute one BOW histogram per image, plus its label

        Matrix<int> labels;
        List<int> listLabels = new List<int>();

        for (var i = 0; i < imagesType.Count; i++)
        {
            var type = imagesType[i];

            for (var j = 1; j <= images[type]; j++)
            {
                var file = $@"{path}\Train\{type} ({j}).jpg";
                var image = new Image<Bgr, Byte>(file);

                MKeyPoint[] keyPoints = extractor.Detect(image);

                Mat descriptors = new Mat();
                bOWImgDescriptorExtractor.Compute(image, new VectorOfKeyPoint(keyPoints), descriptors);

                descriptorsBOWImgDescriptorExtractor.PushBack(descriptors);
                listLabels.Add(type);
            }
        }

        labels = new Matrix<int>(listLabels.ToArray());

        // Step 4: train an RBF SVM on the BOW histograms

        SVM svm = new SVM();
        svm.SetKernel(SVM.SvmKernelType.Rbf);
        svm.Type = SVM.SvmType.CSvc;
        svm.Gamma = 0.50625000000000009;
        svm.C = 312.50000000000000;

        svm.TermCriteria = new MCvTermCriteria(100, 0.000001);

        bool result = svm.Train(
            descriptorsBOWImgDescriptorExtractor,
            Emgu.CV.ML.MlEnum.DataLayoutType.RowSample,
            labels);

        svm.Save("output.xml");

        // Sanity check: classify one held-out image with the same BOW pipeline

        var file1 = $@"{path}\Train\{1} ({18}).jpg";
        var img = new Image<Bgr, Byte>(file1);

        MKeyPoint[] keypoints = null;
        var bowDescriptor = new Mat();

        keypoints = extractor.Detect(img);
        bOWImgDescriptorExtractor.Compute(
            img,
            new VectorOfKeyPoint(keypoints),
            bowDescriptor);

        var response = svm.Predict(bowDescriptor);

        Console.WriteLine($"Result {response}");
    }
}
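For reference, the vocabulary step above (`bOWKMeansTrainer.Cluster`) is plain k-means over all pooled descriptors. Here is a toy plain-Java sketch of the Lloyd iteration loop, using 2-D points and fixed initial centers for brevity (OpenCV itself uses kmeans++ seeding, as the `PPCenters` flag requests, and 64/128-D KAZE rows):

```java
import java.util.Arrays;

public class KMeansSketch {
    // One run of Lloyd's algorithm: assign each point to its nearest center,
    // then move each center to the mean of its assigned points.
    static double[][] cluster(double[][] points, double[][] centers, int iterations) {
        int k = centers.length, dim = points[0].length;
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int best = nearest(p, centers);
                counts[best]++;
                for (int d = 0; d < dim; d++) sums[best][d] += p[d];
            }
            for (int c = 0; c < k; c++)
                if (counts[c] > 0)
                    for (int d = 0; d < dim; d++) centers[c][d] = sums[c][d] / counts[c];
        }
        return centers;
    }

    // Index of the center with the smallest squared L2 distance to p.
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centers[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(Arrays.deepToString(cluster(points, centers, 10)));
        // → [[0.0, 0.5], [10.0, 10.5]]
    }
}
```

Each resulting center row is one "visual word" of the dictionary; with `dictionarySize = 32` the vocabulary Mat is 32 rows wide by one descriptor length.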

Question

http://answers.opencv.org/question/199980/problem-when-training-svm-with-orb-descriptors-android/

As suggested, I conducted the training using KAZE (UpRight) and Bag of Words. After generating the output.xml file, I loaded it on Android and tried to classify a simple image.

Output.xml

https://pastebin.com/fDm8Ynnx

This is the code I'm using to classify the image:

private void predict(Mat image) {

    Mat grayImage = new Mat();
    Imgproc.cvtColor(image, grayImage, Imgproc.COLOR_BGR2GRAY);

    KAZE kaze = KAZE.create();
    kaze.setUpright(true);
    kaze.setExtended(true);

    MatOfKeyPoint keyPoints = new MatOfKeyPoint();
    kaze.detect(grayImage, keyPoints);

    MatOfFloat descriptors = new MatOfFloat();
    kaze.compute(grayImage, keyPoints, descriptors);

    try {
        float result = svm.predict(descriptors);
    } catch(Exception e) {
        Log.d(TAG, e.getMessage());
    }
}

However, I have obtained this ...
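My reading of why predict() fails here (an inference from the shapes in the code, since the error text is cut off above): the model was trained on 1×32 BOW histograms, but `kaze.compute` with `setExtended(true)` emits one 128-float row per keypoint, so every sample row has the wrong width. A trivial plain-Java illustration of the width check that `SVM::predict` effectively enforces:

```java
public class ShapeCheck {
    // Width of one training sample: the BOW dictionarySize from the C# code.
    static final int TRAINED_WIDTH = 32;
    // Width of one raw KAZE descriptor with extended = true (64 without it).
    static final int RAW_KAZE_EXTENDED_WIDTH = 128;

    // The trained SVM rejects any sample whose column count differs from
    // the width of the rows it was trained on.
    static boolean predictWouldAccept(int sampleWidth) {
        return sampleWidth == TRAINED_WIDTH;
    }

    public static void main(String[] args) {
        System.out.println(predictWouldAccept(RAW_KAZE_EXTENDED_WIDTH)); // false
        System.out.println(predictWouldAccept(TRAINED_WIDTH));           // true
    }
}
```

The fix, as the comment thread concludes, is to run the raw descriptors through the same BOW step used in training before calling predict().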


Comments


1 - "Since I'm only using one image, I do not need to use BagOfWords."

Wrong. You need exactly the same type of features you trained the SVM on, so you'd need code similar to the predict() part you tried in C#. And it won't be possible on Android (from Java!), since you can't construct a BOWImgDescriptorExtractor.

2 - Sure, that's possible (and much simpler), but the accuracy is hmmm...

I think this is becoming more and more an XY problem (look it up). IMHO, we have to go back to the start and ask:

  • what are you trying to achieve?
  • what data do you have?
berak (2018-09-27 00:27:07 -0500)
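Since `BOWImgDescriptorExtractor` is missing from the OpenCV Java bindings, its core computation can be reproduced by hand once the vocabulary is loaded: vote each raw descriptor into the bin of its nearest visual word, then normalize. A minimal plain-Java sketch (toy 2-D descriptors instead of real 64/128-D KAZE rows; the method name is mine):

```java
public class BowSketch {
    // What BOWImgDescriptorExtractor.compute does at its core: assign each
    // raw descriptor to its nearest vocabulary word (squared L2 distance,
    // matching the BFMatcher(L2) used in training), count the votes, and
    // divide by the number of descriptors so the result is a fixed-width
    // histogram regardless of how many keypoints the image produced.
    static double[] bowHistogram(double[][] descriptors, double[][] vocabulary) {
        double[] hist = new double[vocabulary.length];
        for (double[] d : descriptors) {
            int best = 0;
            double bestDist = Double.MAX_VALUE;
            for (int w = 0; w < vocabulary.length; w++) {
                double dist = 0;
                for (int k = 0; k < d.length; k++) {
                    double diff = d[k] - vocabulary[w][k];
                    dist += diff * diff;
                }
                if (dist < bestDist) { bestDist = dist; best = w; }
            }
            hist[best]++;
        }
        for (int i = 0; i < hist.length; i++) hist[i] /= descriptors.length;
        return hist;
    }

    public static void main(String[] args) {
        // Toy 3-word vocabulary and 4 descriptors.
        double[][] vocab = {{0, 0}, {10, 0}, {0, 10}};
        double[][] desc  = {{1, 0}, {9, 1}, {0, 9}, {0, 8}};
        System.out.println(java.util.Arrays.toString(bowHistogram(desc, vocab)));
        // → [0.25, 0.25, 0.5]
    }
}
```

The resulting histogram has the same width as the trained samples (dictionarySize = 32 in the C# code), so it can be fed straight to svm.predict().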

Hi @berak,

I trained the SVM on 15 closed-eye images and 17 open-eye images (I will increase the sample size later).

I'm using the Viola-Jones object detector to find the right and left eyes through the CameraView. After finding and cropping each eye, I'll pass the cropped image to the SVM to tell whether it's open or closed, based on my training. I used both approaches to enrich my work.

In that case, I believe I'm going to program in C++ to use the BOWImgDescriptorExtractor class; I can't see any other solution. Although accuracy is not so important right now, I'll try the more correct approach.

Thank you.

caiocanalli (2018-09-27 06:49:41 -0500)

Hmm, I had some success using HOG features (instead of SIFT/BOW) for this (have a look, there's also a nice dataset).

And of course you can still fall back to JNI compilation if you want to keep the BoW approach.

But for sure, you will need more training data. 15/17 is a laugh.

berak (2018-09-27 07:00:07 -0500)

I already tried using HOG to train the SVM, but I did not succeed. Only with BoW did I get any results.

Yes, my sample is really ridiculous, but I did that just to validate the SVM training, which until then was not working at all.

I will try to implement BoW in C++ and use JNI. If I get any results, I'll let you know.

Thank you again.

caiocanalli (2018-09-27 08:19:25 -0500)

BTW, just curious: what are you doing about left/right eyes?

berak (2018-09-29 02:21:49 -0500)

Hi @berak, sorry for the delay.

I am making an application for people with disabilities who cannot move any part of the body. Initially, there will be only 4 simple commands. In the future, I want to increase the precision, because the number of inputs is very limited and it is very easy to confuse the commands.

caiocanalli (2018-10-01 12:18:01 -0500)

That's a very fine idea!

berak (2018-10-01 12:30:53 -0500)