Training and Prediction SVM Java

asked 2018-04-01 16:32:24 -0600


updated 2018-04-01 16:36:00 -0600

Hi everyone,

I am trying to train an SVM and use it to classify faces. I am translating some recent OpenCV C++ SVM examples to Java, as I cannot find a recent Java one. I am running into some errors and was wondering if someone could point me to a Java tutorial or help me resolve them. I am using Java OpenCV 3.4.0.

I am unable to successfully train the SVM and, once I am able to, I am not sure how to use it for classification. What am I doing wrong when trying to train? Once training is fixed, can I test all the testing images at once or do I need to test them one-by-one? How will the classification results be returned? Thank you for your help!

Here is my code:

TRAINING_DATA - Rows:75 Cols:128 (CV_32FC1)

TRAINING_LABELS - Rows:75 Cols:1 (CV_8UC1)

TESTING_DATA - Rows:75 Cols:128 (CV_32FC1)

TESTING_LABELS - Rows:75 Cols:1 (CV_8UC1)

    public static void main(String[] args){
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        String DATABASE = "yalefaces_aligned";
        Net NET = Dnn.readNetFromTorch("openface.nn4.small2.v1.t7");

        boolean CLAHE_ON = false;
        boolean FACENET_ON = true;
        boolean BIF_ON = false;
        int BIF_bands = 8;
        int BIF_rots = 8;

        ArrayList<Integer> training_labels_array = new ArrayList<Integer>();
        ArrayList<Integer> testing_labels_array = new ArrayList<Integer>();
        Mat TRAINING_DATA = new Mat();
        Mat TESTING_DATA = new Mat();

        // Load training and testing data
        File[] directories = new File(DATABASE).listFiles();
        for(int i = 0; i < directories.length; i++){
            File[] files = directories[i].listFiles();
            for(int j = 0; j < 5; j++){
                Mat image = Imgcodecs.imread(files[j].getAbsolutePath());
                Mat training_feature = Feature_Extractor.extract_feature(image, CLAHE_ON, FACENET_ON, NET, BIF_ON, BIF_bands, BIF_rots);
                TRAINING_DATA.push_back(training_feature);
                training_labels_array.add((i+1));
            }
            for(int j = 5; j < files.length; j++){
                Mat image = Imgcodecs.imread(files[j].getAbsolutePath());
                Mat testing_feature = Feature_Extractor.extract_feature(image, CLAHE_ON, FACENET_ON, NET, BIF_ON, BIF_bands, BIF_rots);
                TESTING_DATA.push_back(testing_feature);
                testing_labels_array.add((i+1));
            }
        }

        // Put training and testing labels into Mats
        Mat TRAINING_LABELS = Mat.zeros(TRAINING_DATA.rows(), 1, CvType.CV_8UC1);
        for(int i = 0; i < training_labels_array.size(); i++){
            TRAINING_LABELS.put(i, 0, training_labels_array.get(i));
        }
        Mat TESTING_LABELS = Mat.zeros(TESTING_DATA.rows(), 1, CvType.CV_8UC1);
        for(int i = 0; i < testing_labels_array.size(); i++){
            TESTING_LABELS.put(i, 0, testing_labels_array.get(i));
        }

        System.out.println("TRAINING_DATA - Rows:" + TRAINING_DATA.rows() + " Cols:" + TRAINING_DATA.cols());
        System.out.println("TRAINING_LABELS - Rows:" + TRAINING_LABELS.rows() + " Cols:" + TRAINING_LABELS.cols());
        //System.out.println(TRAINING_LABELS.dump());
        System.out.println("TESTING_DATA - Rows:" + TESTING_DATA.rows() + " Cols:" + TESTING_DATA.cols());
        System.out.println("TESTING_LABELS - Rows:" + TESTING_LABELS.rows() + " Cols:" + TESTING_LABELS.cols());
        //System.out.println(TESTING_LABELS.dump());


        // Train SVM
        SVM svm = SVM.create();
        svm.setKernel(SVM.LINEAR);
        svm.setType(SVM.C_SVC);
        // errors here
        svm.train(TRAINING_DATA, Ml.ROW_SAMPLE, TRAINING_LABELS);

        Mat RESULTS = new Mat();
        // do i need to predict test features one-by-one?
        // what is flags?
        svm.predict(TESTING_DATA, RESULTS, flags);
    }

answered 2018-04-01 22:04:28 -0600

berak

updated 2018-04-01 22:50:58 -0600

your labels should be CvType.CV_32SC1, not CvType.CV_8UC1 (the SVM expects integer class labels).
the yale db is grayscale, so it's somewhat "unfair" to try facenet here, which expects color images (try the lfw database instead?).
a 50/50 split is bad here, given that the db is more or less an ordered sequence of lighting conditions. please look up how "cross-validation" works, and rather use 5- or 10-fold CV.
your BIF params will make a large feature vector, which will outweigh the small (128 only) facenet features.
yes, you can predict a whole set of features: the RESULTS mat will have 1 prediction row per sample (but float, not integer!). flags should probably be 0 in this case, but RAW_OUTPUT (which gives a probability, or rather the distance to the margin here, instead of the class result) would be an option.
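The cross-validation advice above can be sketched in plain Java. This is only an illustration of k-fold index splitting, not OpenCV API: the class and method names (`KFoldSplit`, `makeFolds`, `trainingIndices`) are hypothetical, and the 150 samples are simply the 75 training plus 75 testing rows from the question pooled together. Shuffling before splitting is what breaks up the ordered lighting sequence.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: splits sample indices into k shuffled folds.
final class KFoldSplit {

    /** Returns k folds covering the indices 0..n-1, shuffled with a fixed seed. */
    static List<List<Integer>> makeFolds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // Shuffle so that each fold mixes subjects and lighting conditions.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin into the k folds.
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** Training indices for one round: every index that is not in the test fold. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != testFold) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = makeFolds(150, 5, 42L);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("round " + f + ": test=" + folds.get(f).size()
                    + " train=" + trainingIndices(folds, f).size());
        }
    }
}
```

In each round, the rows selected by the training indices would be used to train a fresh SVM and the rows of the held-out fold to test it; the per-round accuracies are then averaged.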


Comments

Thank you berak!!! I will try to make these fixes. What do you mean by "outweigh"? Since the BIF feature vectors are much larger, is this an unfair comparison? If so, would using PCA or LDA to reduce the size of the BIF feature be reasonable here?

phillity ( 2018-04-01 23:53:06 -0600 )

"unfair comparison" -- that's what i meant. it's like 100000 to 100. PCA would be a way to reduce it, but probably using only 3 bands (or rotations) or so is already better. time to experiment!

BIF alone should already solve yale 99%.

in real life, using color images, maybe use facenet only.

berak ( 2018-04-02 02:23:11 -0600 )

I got it working with your fixes! Thanks again Berak!! P.S. Thank you for adding the MACE filter. I am looking forward to experimenting with it and learning more about cancellable biometrics.

phillity ( 2018-04-02 15:47:39 -0600 )

@berak hi berak. I can't seem to get RAW_OUTPUT to return the probability :( I have tried both to return the probability for a single test sample and all test samples:

 float conf = svm.predict(TESTING_DATA.row(0), RAW_RESULTS, SVM.RAW_OUTPUT);
 float conf = svm.predict(TESTING_DATA, RAW_RESULTS, SVM.RAW_OUTPUT);

conf contains 1.0 in the first case and 0.0 in the second (and RAW_RESULTS contains the label results). Am I making some mistake? Some posts seem to suggest that the probability/confidence is not available for a multi-class SVM.

phillity ( 2018-04-02 22:03:44 -0600 )
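As background for the question above, here is a minimal sketch in plain Java of what a raw decision value means for a two-class linear SVM. It is not OpenCV code: the class `LinearDecision`, the weights `w` and the bias `b` are hypothetical, and OpenCV's own sign convention for the raw output may differ. The point is only that the raw value is a signed, unbounded margin quantity, not a probability in [0, 1].

```java
// Hypothetical two-class linear decision function, for illustration only.
final class LinearDecision {

    /** Raw decision value f(x) = w . x + b. */
    static double decisionValue(double[] w, double b, double[] x) {
        double sum = b;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    /** Class label taken from the sign of the decision value (+1 or -1). */
    static int label(double[] w, double b, double[] x) {
        return decisionValue(w, b, x) >= 0 ? 1 : -1;
    }

    /** Geometric distance of x to the separating hyperplane: |f(x)| / ||w||. */
    static double distanceToHyperplane(double[] w, double b, double[] x) {
        double squaredNorm = 0;
        for (double wi : w) {
            squaredNorm += wi * wi;
        }
        return Math.abs(decisionValue(w, b, x)) / Math.sqrt(squaredNorm);
    }

    public static void main(String[] args) {
        double[] w = {3.0, 4.0};
        double b = -5.0;
        double[] x = {3.0, 4.0};
        System.out.println("f(x) = " + decisionValue(w, b, x));
        System.out.println("label = " + label(w, b, x));
        System.out.println("distance = " + distanceToHyperplane(w, b, x));
    }
}
```

With more than two classes there is no single hyperplane, which is consistent with the observation that only label results come back in the multi-class case.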

@berak Sorry! I accidentally deleted your comment when trying to delete another comment I had made! I see your point about the conf not really being meaningful here. Thank you for your help on this!!!

phillity ( 2018-04-03 01:26:29 -0600 )

here's a working 2 class example

also remember that you can transform your multiclass problem into multiple one-against-all-others problems, using a separate (binary) SVM for each one

berak ( 2018-04-03 01:29:45 -0600 )
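The one-against-all-others idea can be sketched in plain Java as a relabeling step plus a final vote. The names here (`OneVsRestLabels`, `binaryLabels`, `argmaxClass`) are hypothetical and the scores are assumed to be "higher means more likely this subject"; the actual training of each binary SVM is left out.

```java
// Hypothetical helpers for a one-vs-rest setup; no SVM training is done here.
final class OneVsRestLabels {

    /** Relabels multiclass labels for one subject: +1 for that subject, -1 for everyone else. */
    static int[] binaryLabels(int[] labels, int positiveClass) {
        int[] out = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            out[i] = (labels[i] == positiveClass) ? 1 : -1;
        }
        return out;
    }

    /** Picks the class whose binary classifier produced the highest score. */
    static int argmaxClass(int[] classIds, double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return classIds[best];
    }

    public static void main(String[] args) {
        int[] labels = {1, 1, 2, 3};
        // Labels for the binary problem "subject 2 against all others".
        System.out.println(java.util.Arrays.toString(binaryLabels(labels, 2)));
        // Scores of the three binary classifiers for one test sample.
        System.out.println(argmaxClass(new int[]{1, 2, 3}, new double[]{-0.5, 0.2, 0.1}));
    }
}
```

One binary SVM would be trained per subject on the same feature rows with that subject's relabeled vector, so the number of models equals the number of subjects.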

Thanks! I have one more question: If I wanted to test authentication FAR and FRR using the BIF/FaceNet features, what would be the best way to accomplish that? If I use SVM, would I need to create a two-class SVM for each subject, and then classify the test feature with the SVM of the subject it claims to be? If that all makes sense, what negative training data would be reasonable to give for each class?

phillity ( 2018-04-03 01:35:12 -0600 )
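For reference, FAR (false accept rate) and FRR (false reject rate) can be computed from verification scores once a threshold is chosen. The sketch below is plain Java with hypothetical names (`VerificationRates`) and made-up scores; it assumes that a score at or above the threshold means "accept" and does not depend on how the scores were produced.

```java
// Hypothetical helper: FAR and FRR at a fixed acceptance threshold.
final class VerificationRates {

    /** FAR: fraction of impostor attempts that are wrongly accepted (score >= threshold). */
    static double falseAcceptRate(double[] impostorScores, double threshold) {
        int accepted = 0;
        for (double s : impostorScores) {
            if (s >= threshold) {
                accepted++;
            }
        }
        return (double) accepted / impostorScores.length;
    }

    /** FRR: fraction of genuine attempts that are wrongly rejected (score < threshold). */
    static double falseRejectRate(double[] genuineScores, double threshold) {
        int rejected = 0;
        for (double s : genuineScores) {
            if (s < threshold) {
                rejected++;
            }
        }
        return (double) rejected / genuineScores.length;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        double[] impostor = {-1.2, -0.3, 0.4, -0.8};
        double[] genuine = {0.9, 0.1, -0.2, 1.5, 0.7};
        System.out.println("FAR = " + falseAcceptRate(impostor, 0.0));
        System.out.println("FRR = " + falseRejectRate(genuine, 0.0));
    }
}
```

Raising the threshold lowers FAR and raises FRR, so the two rates are usually reported together over a range of thresholds.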

yes, to use FAR and such, you'd need binary SVMs for each subject. the positive class (and data) is that of the person, the negative features may be those of all the other persons

(but tbh, i found confusion matrices more convincing here)

berak ( 2018-04-03 01:38:10 -0600 )
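A confusion matrix, as mentioned above, can be built from the true labels and the predicted ones. The sketch below is plain Java with hypothetical names (`ConfusionMatrix`, `build`, `accuracy`); it assumes labels run from 1 to the number of subjects, as in the question's code, and that predictions arrive as floats, as noted in the answer, so they are rounded to integers first.

```java
// Hypothetical helper: confusion matrix for labels 1..numClasses.
final class ConfusionMatrix {

    /** Returns counts[actual - 1][predicted - 1] over all samples. */
    static int[][] build(int[] actual, float[] predicted, int numClasses) {
        int[][] counts = new int[numClasses][numClasses];
        for (int i = 0; i < actual.length; i++) {
            // Predictions come back as floats; round to the nearest class id.
            int p = Math.round(predicted[i]);
            counts[actual[i] - 1][p - 1]++;
        }
        return counts;
    }

    /** Overall accuracy: sum of the diagonal divided by the number of samples. */
    static double accuracy(int[][] counts) {
        int correct = 0;
        int total = 0;
        for (int r = 0; r < counts.length; r++) {
            for (int c = 0; c < counts[r].length; c++) {
                total += counts[r][c];
                if (r == c) {
                    correct += counts[r][c];
                }
            }
        }
        return (double) correct / total;
    }

    public static void main(String[] args) {
        int[] actual = {1, 1, 2, 2, 3};
        float[] predicted = {1f, 2f, 2f, 2f, 3f};
        int[][] counts = build(actual, predicted, 3);
        for (int[] row : counts) {
            System.out.println(java.util.Arrays.toString(row));
        }
        System.out.println("accuracy = " + accuracy(counts));
    }
}
```

Off-diagonal entries show which subjects get mistaken for which, which a single accuracy number hides.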

Great! Thank you for all your help!!! This has cleared a lot of things up for me :)

phillity ( 2018-04-03 01:40:31 -0600 )
