
SVM Training and Prediction in Java

asked 2018-04-01 16:32:24 -0600 by phillity (updated 2018-04-01 16:36:00 -0600)

Hi everyone,

I am trying to train an SVM and use it to classify faces. I am translating some recent OpenCV C++ SVM examples to Java, as I cannot find a recent Java one, but I am running into some errors and was wondering if someone could point me to a Java tutorial or help me resolve them. I am using Java OpenCV 3.4.0.

I am unable to train the SVM successfully and, once I can, I am not sure how to use it for classification. What am I doing wrong when training? Once training is fixed, can I test all the testing images at once, or do I need to test them one by one? How will the classification results be returned? Thank you for your help!

Here is my code:

TRAINING_DATA - Rows:75 Cols:128 (CV_32FC1)

TRAINING_LABELS - Rows:75 Cols:1 (CV_8UC1)

TESTING_DATA - Rows:75 Cols:128 (CV_32FC1)

TESTING_LABELS - Rows:75 Cols:1 (CV_8UC1)

    public static void main(String[] args){
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        String DATABASE = "yalefaces_aligned";
        Net NET = Dnn.readNetFromTorch("openface.nn4.small2.v1.t7");

        boolean CLAHE_ON = false;
        boolean FACENET_ON = true;
        boolean BIF_ON = false;
        int BIF_bands = 8;
        int BIF_rots = 8;

        ArrayList<Integer> training_labels_array = new ArrayList<Integer>();
        ArrayList<Integer> testing_labels_array = new ArrayList<Integer>();
        Mat TRAINING_DATA = new Mat();
        Mat TESTING_DATA = new Mat();

        // Load training and testing data
        File[] directories = new File(DATABASE).listFiles();
        for(int i = 0; i < directories.length; i++){
            File[] files = directories[i].listFiles();
            for(int j = 0; j < 5; j++){
                Mat image = Imgcodecs.imread(files[j].getAbsolutePath());
                Mat training_feature = Feature_Extractor.extract_feature(image, CLAHE_ON, FACENET_ON, NET, BIF_ON, BIF_bands, BIF_rots);
                TRAINING_DATA.push_back(training_feature);
                training_labels_array.add((i+1));
            }
            for(int j = 5; j < files.length; j++){
                Mat image = Imgcodecs.imread(files[j].getAbsolutePath());
                Mat testing_feature = Feature_Extractor.extract_feature(image, CLAHE_ON, FACENET_ON, NET, BIF_ON, BIF_bands, BIF_rots);
                TESTING_DATA.push_back(testing_feature);
                testing_labels_array.add((i+1));
            }
        }

        // Put training and testing labels into Mats
        Mat TRAINING_LABELS = Mat.zeros(TRAINING_DATA.rows(), 1, CvType.CV_8UC1);
        for(int i = 0; i < training_labels_array.size(); i++){
            TRAINING_LABELS.put(i, 0, training_labels_array.get(i));
        }
        Mat TESTING_LABELS = Mat.zeros(TESTING_DATA.rows(), 1, CvType.CV_8UC1);
        for(int i = 0; i < testing_labels_array.size(); i++){
            TESTING_LABELS.put(i, 0, testing_labels_array.get(i));
        }

        System.out.println("TRAINING_DATA - Rows:" + TRAINING_DATA.rows() + " Cols:" + TRAINING_DATA.cols());
        System.out.println("TRAINING_LABELS - Rows:" + TRAINING_LABELS.rows() + " Cols:" + TRAINING_LABELS.cols());
        //System.out.println(TRAINING_LABELS.dump());
        System.out.println("TESTING_DATA - Rows:" + TESTING_DATA.rows() + " Cols:" + TESTING_DATA.cols());
        System.out.println("TESTING_LABELS - Rows:" + TESTING_LABELS.rows() + " Cols:" + TESTING_LABELS.cols());
        //System.out.println(TRAINING_LABELS.dump());


        // Train SVM
        SVM svm = SVM.create();
        svm.setKernel(SVM.LINEAR);
        svm.setType(SVM.C_SVC);
        // errors here
        svm.train(TRAINING_DATA, Ml.ROW_SAMPLE, TRAINING_LABELS);

        Mat RESULTS = new Mat();
        // do i need to predict test features one-by-one?
        // what is flags?
        svm.predict(TESTING_DATA, RESULTS, flags);
    }

1 answer


answered 2018-04-01 22:04:28 -0600 by berak (updated 2018-04-01 22:50:58 -0600)

  • your labels should be CvType.CV_32SC1, not CvType.CV_8UC1
  • the yale db is grayscale, so it's somewhat "unfair" to try FaceNet on it here, since FaceNet expects color images (try the lfw database instead?)
  • a 50/50 split is bad here, given that the db is more or less an ordered sequence of lighting conditions. please look up how "cross-validation" works, and rather use 5- or 10-fold CV
  • your BIF params will make a large feature vector, which will outweigh the small (128-dim) FaceNet features
  • yes, you can predict a whole set of features; the RESULTS Mat will have one prediction row per sample (but as float, not integer!). flags should probably be 0 in this case, but RAW_OUTPUT (distance to the margin here, rather than the class result) would be an option -- see the sketch below
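
A minimal sketch of the corrected label setup and prediction calls, reusing the variable names from the question (this is illustrative, under the assumption that TRAINING_DATA / TESTING_DATA are built exactly as above):

    // labels must be CV_32SC1 (signed int) for SVM training
    Mat TRAINING_LABELS = Mat.zeros(TRAINING_DATA.rows(), 1, CvType.CV_32SC1);
    for (int i = 0; i < training_labels_array.size(); i++) {
        TRAINING_LABELS.put(i, 0, training_labels_array.get(i));
    }

    SVM svm = SVM.create();
    svm.setKernel(SVM.LINEAR);
    svm.setType(SVM.C_SVC);
    svm.train(TRAINING_DATA, Ml.ROW_SAMPLE, TRAINING_LABELS);

    // predict the whole test set at once; RESULTS gets one float row per sample
    Mat RESULTS = new Mat();
    svm.predict(TESTING_DATA, RESULTS, 0); // flags = 0 -> predicted class labels (stored as float)
    // svm.predict(TESTING_DATA, RESULTS, SVM.RAW_OUTPUT); // raw margin distances instead of labels

    for (int i = 0; i < RESULTS.rows(); i++) {
        int predicted = (int) RESULTS.get(i, 0)[0];
        System.out.println("test sample " + i + " -> class " + predicted);
    }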

Comments

Thank you berak!!! I will try to make these fixes. What do you mean by "outweigh"? Since the BIF feature vectors are much larger, is this an unfair comparison? If so, would using PCA or LDA to reduce the size of the BIF feature be reasonable here?

phillity (2018-04-01 23:53:06 -0600)

" unfair comparison" -- that's what i meant. it's like 100000 to 100, PCA would be a way to reduce it, but probably using only 3 bands(or rotations) or so is already better. time to experiment !

BIF alone should already solve yale 99%.

in real life, using color images, maybe use facenet only.

berak (2018-04-02 02:23:11 -0600)
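
For illustration, a minimal sketch of that PCA reduction using OpenCV's Core.PCACompute / Core.PCAProject. BIF_TRAIN / BIF_TEST and the 128-component target are hypothetical names and an example value, not something prescribed in this thread:

    // BIF_TRAIN / BIF_TEST: one BIF feature row per sample (CV_32FC1), hypothetical Mats
    Mat mean = new Mat();
    Mat eigenvectors = new Mat();
    Core.PCACompute(BIF_TRAIN, mean, eigenvectors, 128); // keep 128 components (illustrative)

    // project the training and testing features into the reduced space
    Mat BIF_TRAIN_REDUCED = new Mat();
    Mat BIF_TEST_REDUCED = new Mat();
    Core.PCAProject(BIF_TRAIN, mean, eigenvectors, BIF_TRAIN_REDUCED);
    Core.PCAProject(BIF_TEST, mean, eigenvectors, BIF_TEST_REDUCED);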

I got it working with your fixes! Thanks again Berak!! P.S. Thank you for adding the MACE filter. I am looking forward to experimenting with it and learning more about cancellable biometrics.

phillity (2018-04-02 15:47:39 -0600)

@berak hi berak. I can't seem to get RAW_OUTPUT to return the probability :( I have tried returning the probability both for a single test sample and for all test samples:

 float conf = svm.predict(TESTING_DATA.row(0), RAW_RESULTS, SVM.RAW_OUTPUT);
 float conf = svm.predict(TESTING_DATA, RAW_RESULTS, SVM.RAW_OUTPUT);

conf either contains 1.0 in the first case or 0.0 in the second case (and RAW_RESULTS contains the label results). Am I making some mistake? Some posts seem to suggest the probability/confidence is not available for a multi-class SVM.

phillity (2018-04-02 22:03:44 -0600)

@berak Sorry! I accidentally deleted your comment when trying to delete another comment I had made! I see your point about the conf not really being meaningful here. Thank you for your help on this!!!

phillity (2018-04-03 01:26:29 -0600)

here's a working 2 class example

also remember that you can transform your multiclass problem into multiple one-against-all-others problems, using a separate (binary) SVM for each one -- see the sketch below

berak (2018-04-03 01:29:45 -0600)
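
A rough sketch of that one-against-all-others idea, assuming the TRAINING_DATA / TRAINING_LABELS Mats from above; numSubjects and testFeature are hypothetical, and it needs java.util.Map / java.util.HashMap in addition to the OpenCV imports:

    // train one binary SVM per subject: +1 for that subject, -1 for everyone else
    Map<Integer, SVM> svms = new HashMap<>();
    for (int subject = 1; subject <= numSubjects; subject++) {
        Mat binaryLabels = new Mat(TRAINING_LABELS.rows(), 1, CvType.CV_32SC1);
        for (int i = 0; i < TRAINING_LABELS.rows(); i++) {
            int label = (int) TRAINING_LABELS.get(i, 0)[0];
            binaryLabels.put(i, 0, label == subject ? 1 : -1);
        }
        SVM svm = SVM.create();
        svm.setKernel(SVM.LINEAR);
        svm.setType(SVM.C_SVC);
        svm.train(TRAINING_DATA, Ml.ROW_SAMPLE, binaryLabels);
        svms.put(subject, svm);
    }

    // classify a single test feature by the largest raw margin
    // (note: the sign of the raw value depends on OpenCV's internal label ordering,
    //  so verify the convention on training data before relying on it)
    int best = -1;
    float bestMargin = -Float.MAX_VALUE;
    for (Map.Entry<Integer, SVM> e : svms.entrySet()) {
        Mat raw = new Mat();
        e.getValue().predict(testFeature, raw, SVM.RAW_OUTPUT);
        float margin = (float) raw.get(0, 0)[0];
        if (margin > bestMargin) { bestMargin = margin; best = e.getKey(); }
    }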

Thanks! I have one more question: if I wanted to test authentication FAR and FRR using the BIF/FaceNet features, what would be the best way to accomplish that? If I use SVM, would I need to create a two-class SVM for each subject, and then classify the test feature with the SVM of the subject it is trying to authenticate as? If that all makes sense, what negative training data would be reasonable to give for each class?

phillity (2018-04-03 01:35:12 -0600)

yes, to use FAR and such, you'd need a binary SVM for each subject. the positive class (and data) is that of the person; the negative features may be those of all the other persons -- see the sketch below

(but tbh, i found confusion matrices more convincing here)

berak (2018-04-03 01:38:10 -0600)
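
To make the FAR/FRR bookkeeping concrete, a small sketch under the assumption that subjectSvms is a hypothetical Map of per-subject binary SVMs trained with +1 (genuine) / -1 (impostor) labels as above, and that TESTING_DATA / TESTING_LABELS are built as in the question:

    int falseAccepts = 0, falseRejects = 0;
    int impostorAttempts = 0, genuineAttempts = 0;

    for (int i = 0; i < TESTING_DATA.rows(); i++) {
        Mat feature = TESTING_DATA.row(i);
        int trueSubject = (int) TESTING_LABELS.get(i, 0)[0];
        for (Map.Entry<Integer, SVM> e : subjectSvms.entrySet()) {
            boolean accepted = e.getValue().predict(feature) > 0; // +1 = accept, -1 = reject
            if (e.getKey() == trueSubject) {
                genuineAttempts++;
                if (!accepted) falseRejects++;
            } else {
                impostorAttempts++;
                if (accepted) falseAccepts++;
            }
        }
    }

    // FAR = false accepts / impostor attempts, FRR = false rejects / genuine attempts
    double FAR = (double) falseAccepts / impostorAttempts;
    double FRR = (double) falseRejects / genuineAttempts;
    System.out.println("FAR=" + FAR + " FRR=" + FRR);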

Great! Thank you for all your help!!! This has cleared a lot of things up for me :)

phillity (2018-04-03 01:40:31 -0600)
