# SVM Training and Prediction in Java

Hi everyone,

I am trying to train an SVM and use it to classify faces. I am translating some recent OpenCV C++ SVM examples to Java, as I cannot find a recent Java example. I am running into some errors and was wondering if someone could point me to a Java tutorial or help me resolve my errors. I am using OpenCV 3.4.0 for Java.

I am unable to successfully train the SVM and, once I am able to, I am not sure how to use it for classification. What am I doing wrong when trying to train? Once training is fixed, can I test all the testing images at once or do I need to test them one-by-one? How will the classification results be returned? Thank you for your help!

The debug prints show:

    TRAINING_DATA - Rows:75 Cols:128 (CV_32FC1)
    TRAINING_LABELS - Rows:75 Cols:1 (CV_8UC1)
    TESTING_DATA - Rows:75 Cols:128 (CV_32FC1)
    TESTING_LABELS - Rows:75 Cols:1 (CV_8UC1)

Here is my code:

    public static void main(String[] args) {
        String DATABASE = "yalefaces_aligned";

        boolean CLAHE_ON = false;
        boolean FACENET_ON = true;
        boolean BIF_ON = false;
        int BIF_bands = 8;
        int BIF_rots = 8;

        ArrayList<Integer> training_labels_array = new ArrayList<Integer>();
        ArrayList<Integer> testing_labels_array = new ArrayList<Integer>();
        Mat TRAINING_DATA = new Mat();
        Mat TESTING_DATA = new Mat();

        // Load training and testing data: first 5 images per subject for training, the rest for testing
        File[] directories = new File(DATABASE).listFiles();
        for (int i = 0; i < directories.length; i++) {
            File[] files = directories[i].listFiles();
            for (int j = 0; j < 5; j++) {
                Mat image = Imgcodecs.imread(files[j].getAbsolutePath());
                Mat training_feature = Feature_Extractor.extract_feature(image, CLAHE_ON, FACENET_ON, NET, BIF_ON, BIF_bands, BIF_rots);
                TRAINING_DATA.push_back(training_feature);
                training_labels_array.add(i);
            }
            for (int j = 5; j < files.length; j++) {
                Mat image = Imgcodecs.imread(files[j].getAbsolutePath());
                Mat testing_feature = Feature_Extractor.extract_feature(image, CLAHE_ON, FACENET_ON, NET, BIF_ON, BIF_bands, BIF_rots);
                TESTING_DATA.push_back(testing_feature);
                testing_labels_array.add(i);
            }
        }

        // Put training and testing labels into Mats
        Mat TRAINING_LABELS = Mat.zeros(TRAINING_DATA.rows(), 1, CvType.CV_8UC1);
        for (int i = 0; i < training_labels_array.size(); i++) {
            TRAINING_LABELS.put(i, 0, training_labels_array.get(i));
        }
        Mat TESTING_LABELS = Mat.zeros(TESTING_DATA.rows(), 1, CvType.CV_8UC1);
        for (int i = 0; i < testing_labels_array.size(); i++) {
            TESTING_LABELS.put(i, 0, testing_labels_array.get(i));
        }

        System.out.println("TRAINING_DATA - Rows:" + TRAINING_DATA.rows() + " Cols:" + TRAINING_DATA.cols());
        System.out.println("TRAINING_LABELS - Rows:" + TRAINING_LABELS.rows() + " Cols:" + TRAINING_LABELS.cols());
        //System.out.println(TRAINING_LABELS.dump());
        System.out.println("TESTING_DATA - Rows:" + TESTING_DATA.rows() + " Cols:" + TESTING_DATA.cols());
        System.out.println("TESTING_LABELS - Rows:" + TESTING_LABELS.rows() + " Cols:" + TESTING_LABELS.cols());
        //System.out.println(TESTING_LABELS.dump());

        // Train SVM
        SVM svm = SVM.create();
        svm.setKernel(SVM.LINEAR);
        svm.setType(SVM.C_SVC);
        // errors here
        svm.train(TRAINING_DATA, Ml.ROW_SAMPLE, TRAINING_LABELS);

        Mat RESULTS = new Mat();
        // do i need to predict test features one-by-one?
        // what is flags?
        svm.predict(TESTING_DATA, RESULTS, flags);
    }


• your labels should be CvType.CV_32SC1, not CvType.CV_8UC1
• the yale db is grayscale, so it's somewhat "unfair" to try facenet here, which expects color images. (try the lfw database instead?)
• a 50/50 split is bad here, given that the db is roughly an ordered sequence of lighting conditions. please look up how "cross-validation" works, and rather use 5- or 10-fold CV.
• your BIF params will make a large feature vector, which will outweigh the small (128-dim only) facenet features
• yes, you can predict a whole set of features; the RESULTS mat will have 1 prediction row per sample (but as float, not integer!). flags should probably be 0 here, but RAW_OUTPUT (which returns the distance to the margin rather than the class result) would be an option
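
Putting those fixes together, a minimal sketch of the corrected training/prediction code might look like this (reusing the variable names from the question; assumes the `org.opencv.core` and `org.opencv.ml` imports, and an OpenCV build whose native library is loaded):

```java
// sketch only: TRAINING_DATA, TESTING_DATA and training_labels_array
// come from the question's code above

// labels must be a CV_32SC1 (int) Mat for SVM.train
Mat TRAINING_LABELS = Mat.zeros(TRAINING_DATA.rows(), 1, CvType.CV_32SC1);
for (int i = 0; i < training_labels_array.size(); i++) {
    TRAINING_LABELS.put(i, 0, training_labels_array.get(i));
}

SVM svm = SVM.create();
svm.setKernel(SVM.LINEAR);
svm.setType(SVM.C_SVC);
svm.train(TRAINING_DATA, Ml.ROW_SAMPLE, TRAINING_LABELS);

// predict the whole test set at once; flags = 0 gives plain class predictions
Mat RESULTS = new Mat();
svm.predict(TESTING_DATA, RESULTS, 0);

// RESULTS holds one float row per test sample; cast back to int labels
for (int i = 0; i < RESULTS.rows(); i++) {
    int predicted = (int) RESULTS.get(i, 0)[0];
    System.out.println("sample " + i + " -> class " + predicted);
}
```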

Thank you berak!!! I will try to make these fixes. What do you mean by "outweigh"? Since the BIF feature vectors are much larger, is this an unfair comparison? If so, would using PCA or LDA to reduce the size of the BIF feature be reasonable here?

(2018-04-01 23:53:06 -0500)

" unfair comparison" -- that's what i meant. it's like 100000 to 100, PCA would be a way to reduce it, but probably using only 3 bands(or rotations) or so is already better. time to experiment !

BIF alone should already solve yale 99%.

in real life, using color images, maybe use facenet only.
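
If you go the PCA route, OpenCV's `Core.PCACompute` / `Core.PCAProject` can do the reduction. A sketch, assuming the BIF features sit in a row-per-sample float Mat (the name `BIF_FEATURES` is made up for illustration):

```java
// reduce each row-sample BIF vector to, e.g., 128 components
Mat mean = new Mat();
Mat eigenvectors = new Mat();
Core.PCACompute(BIF_FEATURES, mean, eigenvectors, 128);

Mat reduced = new Mat();
Core.PCAProject(BIF_FEATURES, mean, eigenvectors, reduced);
// reduced now has BIF_FEATURES.rows() rows and 128 cols, ready for SVM.train
```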

(2018-04-02 02:23:11 -0500)

I got it working with your fixes! Thanks again Berak!! P.S. Thank you for adding the MACE filter. I am looking forward to experimenting with it and learning more about cancellable biometrics.

(2018-04-02 15:47:39 -0500)

@berak hi berak. I can't seem to get RAW_OUTPUT to return the probability :( I have tried to return it both for a single test sample and for all test samples:

    float conf = svm.predict(TESTING_DATA.row(0), RAW_RESULTS, SVM.RAW_OUTPUT);
    float conf = svm.predict(TESTING_DATA, RAW_RESULTS, SVM.RAW_OUTPUT);

conf contains either 1.0 in the first case or 0.0 in the second (and RAW_RESULTS contains the label results). Am I making some mistake? Some posts seem to suggest the probability/confidence is not available for a multi-class svm.

(2018-04-02 22:03:44 -0500)

@berak Sorry! I accidentally deleted your comment when trying to delete another comment I had made! I see your point about the conf not really being meaningful here. Thank you for your help on this!!!

(2018-04-03 01:26:29 -0500)

here's a working 2 class example

also remember that you can transform your multiclass problem into multiple one-against-all-others problems, using a separate (binary) SVM for each one
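
The selection step for such a one-against-all-others setup can be sketched in plain Java (the helper below is hypothetical, not OpenCV API): each binary SVM scores the test sample via RAW_OUTPUT, and you pick the class whose SVM reports the highest margin score:

```java
public class OneVsRest {
    // Hypothetical helper: given one margin score per subject (e.g. collected
    // from each binary SVM's RAW_OUTPUT prediction), return the index of the
    // class whose SVM is most confident.
    static int predictOneVsRest(float[] margins) {
        int best = 0;
        for (int i = 1; i < margins.length; i++) {
            if (margins[i] > margins[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // example: subject 2's SVM gives the strongest "yes"
        float[] margins = { -1.3f, -0.2f, 0.8f, -2.1f };
        System.out.println(predictOneVsRest(margins)); // prints 2
    }
}
```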

(2018-04-03 01:29:45 -0500)

Thanks! I have one more question: If I wanted to test authentication FAR and FRR using the BIF/FaceNet features, what would be the best way to accomplish that? If I use SVM, would I need to create a two class SVM for each subject? Then classify the test feature with the SVM it is trying to be authenticated as? If that all makes sense, what negative training data would be reasonable to give for each class?

(2018-04-03 01:35:12 -0500)

yes, to use FAR and such, you'd need a binary SVM for each subject. the positive class (and data) is that of the person; the negative features may be those of all the other persons

(but tbh, i found confusion matrices more convincing here)
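
For reference, once each subject's binary SVM yields accept/reject decisions, FAR and FRR reduce to simple counting. A plain-Java sketch, independent of OpenCV (names are made up):

```java
public class FarFrr {
    // FAR = impostor attempts wrongly accepted / total impostor attempts
    static double far(boolean[] impostorAccepted) {
        int falseAccepts = 0;
        for (boolean accepted : impostorAccepted) {
            if (accepted) falseAccepts++;
        }
        return (double) falseAccepts / impostorAccepted.length;
    }

    // FRR = genuine attempts wrongly rejected / total genuine attempts
    static double frr(boolean[] genuineAccepted) {
        int falseRejects = 0;
        for (boolean accepted : genuineAccepted) {
            if (!accepted) falseRejects++;
        }
        return (double) falseRejects / genuineAccepted.length;
    }

    public static void main(String[] args) {
        boolean[] impostors = { false, false, true, false }; // 1 of 4 wrongly accepted
        boolean[] genuine   = { true, true, false, true };   // 1 of 4 wrongly rejected
        System.out.println(far(impostors)); // prints 0.25
        System.out.println(frr(genuine));   // prints 0.25
    }
}
```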

(2018-04-03 01:38:10 -0500)

Great! Thank you for all your help!!! This has cleared a lot of things up for me :)

(2018-04-03 01:40:31 -0500)
