SVM cross validation parameters optimisation and accuracy
I use the following code to train the SVM using k-fold cross-validation, but the prediction accuracy is low. What am I doing wrong, and how can I programmatically calculate the accuracy of the classifier using cross-validation?
Log.i(TAG, "Training...");
params.set_svm_type(CvSVM.C_SVC);
params.set_kernel_type(CvSVM.RBF);
params.set_C(1.0);
params.set_degree(0.0);
params.set_coef0(0.0);
params.set_gamma(1.0);
params.set_term_crit(new TermCriteria(TermCriteria.EPS, 10000, 1e-12));
// k-fold cross validation
int kFolds = 10;
CvParamGrid C = new CvParamGrid();
CvParamGrid p = new CvParamGrid();
CvParamGrid nu = new CvParamGrid();
CvParamGrid gamma = new CvParamGrid();
CvParamGrid coeff = new CvParamGrid();
CvParamGrid degree = new CvParamGrid();
gamma.set_step(0.0);
// initialize SVM object to avoid being Null object
classifier = new CvSVM(trainingData, classes, new Mat(), new Mat(), params);
classifier.train_auto(trainingData, classes, new Mat(), new Mat(), params, kFolds, C, gamma, p, nu, coeff, degree, false);
classifier.save(XML.toString());
Log.i(TAG,"Training Done & Trained Model Saved");
Well, there is a real chance that you simply do not have enough data to train a better model. We need some more info on that before we can be sure it's due to the parameters.
Thanks @StevenPuttemans for your comment. I have 88 records: 40 negative images and 48 positive images. I suppose this is due to the parameters, as every prediction on the test dataset is 1.
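When every test prediction comes back as 1, it can help to look at the counts behind the accuracy figure rather than the single number. Below is a minimal sketch in plain Java, assuming labels are encoded as 0 (negative) and 1 (positive) and that the predictions have already been copied out of the classifier into an int array. The class and method names are made up for illustration and are not part of OpenCV.

```java
// Sketch: confusion counts and accuracy for a binary classifier.
// Assumption: labels are 0 (negative) and 1 (positive).
final class ConfusionCheck {

    /** Returns {trueNeg, falsePos, falseNeg, truePos}. */
    static int[] confusion(int[] actual, int[] predicted) {
        if (actual.length != predicted.length) {
            throw new IllegalArgumentException("actual and predicted differ in length");
        }
        int[] counts = new int[4];
        for (int i = 0; i < actual.length; i++) {
            int a = actual[i];
            int p = predicted[i];
            if ((a != 0 && a != 1) || (p != 0 && p != 1)) {
                throw new IllegalArgumentException("labels must be 0 or 1");
            }
            // index = actual * 2 + predicted: 0=TN, 1=FP, 2=FN, 3=TP
            counts[a * 2 + p]++;
        }
        return counts;
    }

    /** Fraction of samples on the diagonal (TN + TP). */
    static double accuracy(int[] counts) {
        int total = counts[0] + counts[1] + counts[2] + counts[3];
        return (counts[0] + counts[3]) / (double) total;
    }

    public static void main(String[] args) {
        // Hypothetical case: 40 negatives, 48 positives, everything predicted as 1.
        int[] actual = new int[88];
        int[] predicted = new int[88];
        for (int i = 0; i < 88; i++) {
            actual[i] = i < 40 ? 0 : 1;
            predicted[i] = 1;
        }
        int[] c = confusion(actual, predicted);
        System.out.println("TN=" + c[0] + " FP=" + c[1] + " FN=" + c[2] + " TP=" + c[3]);
        System.out.println("accuracy=" + accuracy(c));
    }
}
```

If the 88 records themselves were the evaluation set (the thread does not say what the test set was), a classifier that always answers 1 would score 48/88, about 55%, with zero true negatives. That pattern points at a degenerate model rather than a merely weak one.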
SVM on that small amount of data will never be optimal. Start by increasing your training set.
Thanks @StevenPuttemans, I will try to increase my data. My question is: can I reach 90% accuracy with this data, as we need it for real testing, or will that be difficult?
Getting 90% certainty is always a challenge. But without more insight into your application, guessing whether it will work is impossible :D
The problem is that my colleague, who works in MATLAB, gets nearly 97% accuracy with this data. Is the problem with OpenCV and Android because they are open source, or is it something else? For context, my app is a diagnosis app that gives the result for a disease based on analysing the colours in an image.
Wait, if your colleague already trained with MATLAB, then you know the parameters, right? Then you can simply set all of these parameters directly and not add a grid for them.
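With the parameters fixed, the cross-validated accuracy can be measured by hand, since train_auto (as far as I know) returns only a success flag and not a score. Here is a rough sketch of a manual k-fold loop in plain Java. The Model interface is a hypothetical wrapper; in the real app its train and predict methods would call CvSVM.train and CvSVM.predict with the fixed CvSVMParams. NearestMean is only a toy stand-in so the sketch runs without OpenCV.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class KFoldAccuracy {

    /** Hypothetical wrapper around the real classifier (e.g. CvSVM with fixed params). */
    interface Model {
        void train(double[][] x, int[] y);
        int predict(double[] sample);
    }

    /** Toy stand-in: assigns the class whose mean of feature 0 is nearest. */
    static final class NearestMean implements Model {
        private double mean0, mean1;

        public void train(double[][] x, int[] y) {
            double sum0 = 0, sum1 = 0;
            int n0 = 0, n1 = 0;
            for (int i = 0; i < x.length; i++) {
                if (y[i] == 0) { sum0 += x[i][0]; n0++; } else { sum1 += x[i][0]; n1++; }
            }
            mean0 = sum0 / n0; // NaN if a fold has no class-0 samples
            mean1 = sum1 / n1;
        }

        public int predict(double[] s) {
            return Math.abs(s[0] - mean0) <= Math.abs(s[0] - mean1) ? 0 : 1;
        }
    }

    /** Shuffles once, tests every sample exactly once, returns correct / n. */
    static double crossValidate(Model model, double[][] x, int[] y, int k, long seed) {
        int n = x.length;
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("k must be between 2 and the sample count");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) order.add(i);
        Collections.shuffle(order, new Random(seed));

        int correct = 0;
        for (int fold = 0; fold < k; fold++) {
            List<Integer> trainIdx = new ArrayList<>();
            List<Integer> testIdx = new ArrayList<>();
            for (int pos = 0; pos < n; pos++) {
                // Every k-th shuffled position goes to the held-out fold.
                if (pos % k == fold) testIdx.add(order.get(pos));
                else trainIdx.add(order.get(pos));
            }
            double[][] trainX = new double[trainIdx.size()][];
            int[] trainY = new int[trainIdx.size()];
            for (int i = 0; i < trainIdx.size(); i++) {
                trainX[i] = x[trainIdx.get(i)];
                trainY[i] = y[trainIdx.get(i)];
            }
            model.train(trainX, trainY); // retrain from scratch on the other folds
            for (int idx : testIdx) {
                if (model.predict(x[idx]) == y[idx]) correct++;
            }
        }
        return correct / (double) n;
    }

    public static void main(String[] args) {
        double[][] x = {{0}, {1}, {2}, {3}, {4}, {10}, {11}, {12}, {13}, {14}};
        int[] y = {0, 0, 0, 0, 0, 1, 1, 1, 1, 1};
        System.out.println("CV accuracy: " + crossValidate(new NearestMean(), x, y, 5, 42L));
    }
}
```

One related thing that may be worth checking: the OpenCV 2.4 documentation for train_auto says a grid whose step is 1 or less is not searched and the value already in params is used instead. If the default-constructed grids in the question have a step of 0, which I believe is the case but have not verified on Android, then train_auto would simply keep C = 1 and gamma = 1.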
Thanks, we will try this.
"Please suggest any solution I can try, as we need to start testing from the 1st of September and I do not have enough time" --> I am sorry, but this will not convince me to provide further support. Don't you have a professor or assistant who is assigned to this class?
Sorry for any inconvenience I caused. I will delete my comment as it seems inappropriate.