OpenCV SVM (RBF) low test accuracy at 10 % on MNIST
I trained an OpenCV SVM on the MNIST dataset and got a strange result: a test accuracy of only about 10 %. Any idea what went wrong? Thanks in advance.
Here are the parameters I used:
Ptr<SVM> model = SVM::create();
model->setType(SVM::C_SVC);
model->setKernel(SVM::RBF);
model->setC(10);
model->setGamma(0.01);
Training:
model->train(tdata);
Testing (adapted from the letter_recog.cpp sample):
static void test_and_save_classifier(const Ptr<StatModel>& model,
                                     const Mat& data, const Mat& responses,
                                     int ntrain_samples, int rdelta,
                                     const string& filename_to_save)
{
    int i, nsamples_all = data.rows;
    double train_hr = 0, test_hr = 0;

    // compute the recognition rate on
    // train data[0, ..., ntrain_samples-1]; and
    // test data[ntrain_samples, ..., nsamples_all-1]
    double before = static_cast<double>(getTickCount());
    for( i = 0; i < nsamples_all; i++ )
    {
        Mat sample = data.row(i);

        // The method is used to predict the response for a new sample.
        // In case of a classification, the method returns the class label.
        float r = model->predict( sample ); // sample is the row feature vector

        // Tally correct classifications
        // +1 if prediction is correct
        // +0 if prediction is wrong
        r = std::abs(r + rdelta - responses.at<int>(i)) <= FLT_EPSILON ? 1.f : 0.f;

        if( i < ntrain_samples )
            train_hr += r;
        else
            test_hr += r;
    }
    double after = static_cast<double>(getTickCount());
    double duration_in_ms = 1000.0*(after - before)/getTickFrequency();
    cout << "Prediction for all data completed after "<< duration_in_ms <<" ms...\n";
    cout << "Average prediction time per sample is "<< duration_in_ms/nsamples_all <<" ms.\n";

    test_hr /= nsamples_all - ntrain_samples;
    train_hr = ntrain_samples > 0 ? train_hr/ntrain_samples : 1.;
    // Note: with 0 training samples, the training recognition rate is reported as 100 %
    printf( "Recognition rate: train = %.2f%%, test = %.2f%%\n",
            train_hr*100., test_hr*100. );

    if( !filename_to_save.empty() )
    {
        model->save( filename_to_save );
    }
}
Test run:
trainVecLabels.size() = 60000
trainVecLabels[0] = 5
testData.size() = 10000
testVecLabels.size() = 10000
testVecLabels[0] = 7
data.size() = [784 x 70000]
responses.size() = [1 x 70000]
Training the classifier ...
Training completed after 11.1005 mins...
Testing and saving ...
Prediction for all data completed after 443715 ms...
Average prediction time per sample is 6.33879 ms.
Recognition rate: train = 24.92%, test = 10.09%
Training-auto without feature vector normalization:
Ptr<SVM> model = SVM::create();
model->setType(SVM::C_SVC);
model->setKernel(SVM::RBF);
model->trainAuto( tdata, 10 );
Result (10% accuracy):
[root@cobalt MNIST]# gcv SVM_MNIST.cpp -o SVM_MNIST
[root@cobalt MNIST]# ./SVM_MNIST
trainData.size() = 60000
init done
opengl support available
trainVecLabels.size() = 60000
trainVecLabels[0] = 5
testData.size() = 10000
testVecLabels.size() = 10000
testVecLabels[0] = 7
data.size() = [784 x 70000]
responses.size() = [1 x 70000]
Training the classifier ...
Training completed after 3044.38 mins...
Testing and saving ...
Prediction for all data completed after 426744 ms...
Average prediction time per sample is 6.09635 ms.
Recognition rate: train = 24.92%, test = 10.09%
Training-auto (with feature vector normalization) raised the accuracy to almost 80 %!
[root@cobalt MNIST]# gcv SVM_MNIST_Normalized.cpp -o SVM_MNIST_Normalized
[root@cobalt MNIST]# ./SVM_MNIST_Normalized
trainData.size() = 60000
trainVecLabels.size() = 60000
trainVecLabels[0] = 5
testData.size() = 10000
testVecLabels.size() = 10000
testVecLabels[0] = 7
data.size() = [784 x 70000]
responses.size() = [1 x 70000]
Training the classifier ...
Training completed after 2356.62 mins...
Testing and saving ...
Prediction for all data completed after 950690 ms ...
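For reference, feature vector normalization can be as simple as rescaling the raw intensities. The snippet below is a minimal sketch, assuming the features are pixel values in the range 0 to 255 stored as floats and that plain division by the maximum is the normalization meant here. The thread does not show the actual normalization code, and scale_features is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the original program): rescale features
// in place from [0, max_value] to [0, 1].
// Assumes non-negative inputs such as raw MNIST intensities (max_value = 255).
void scale_features(std::vector<float>& features, float max_value)
{
    assert(max_value > 0.0f);
    for (float& v : features)
        v /= max_value;
}
```

With OpenCV matrices the same scaling can be done in one call, e.g. data.convertTo(scaled, CV_32F, 1.0 / 255.0), applied identically to the training and test rows.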
Are you using any feature extraction technique prior to training and testing?
I am guessing he is using the raw pixel information, in which case this result is actually not that bad :D As @Lorena GdL said, start by extracting specific properties with feature extraction techniques like HOG, LBP, HAAR, ...
@Lorena GdL No feature extraction was used, just plain intensity values (28 x 28). That's why I had a total data.size() = [784 x 70000] for training and test combined.
@StevenPuttemans, thanks a lot for the comment. Yes, I'm planning to incorporate feature extraction afterwards, but I feel 10 % SVM test accuracy on MNIST is a bit strange.
I am reading a technical report, "Handwritten digits recognition using OpenCV":
OpenCV-Mnist-Report
It reports an error rate of 1.77% using SVM-RBF (with normalization) and an accuracy of 89.9% without normalization, using raw pixels!
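A possible explanation for why normalization matters so much with the RBF kernel: the kernel is K(x, y) = exp(-gamma * ||x - y||^2), so with raw 0 to 255 intensities the squared distance between two images is very large and the kernel value can underflow to zero even for a small-looking gamma such as 0.01. The sketch below only illustrates that arithmetic; rbf_kernel is a hypothetical helper, not an OpenCV function, and whether this is what actually happened in the runs above is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: RBF kernel K(x, y) = exp(-gamma * ||x - y||^2).
double rbf_kernel(const std::vector<double>& x,
                  const std::vector<double>& y,
                  double gamma)
{
    assert(x.size() == y.size());
    double sq_dist = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - y[i];
        sq_dist += d * d;   // accumulate squared Euclidean distance
    }
    return std::exp(-gamma * sq_dist);
}
```

As a made-up example, take two 784-element vectors that differ by the full intensity range in 100 pixels. With raw values the squared distance is 100 * 255^2 = 6,502,500, so exp(-0.01 * 6502500) evaluates to 0 in double precision. After scaling to [0, 1] the squared distance is 100 and the kernel value is exp(-1), about 0.37.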
@mkc: I wouldn't trust that. I don't know what the author did, but he was probably just lucky. Using raw pixel values makes no sense at all. I'd guess he computed the error on the same data he trained on, or something similar... And please, don't call that paper a technical report; it is far from being one.
"Training-auto (with feature vector normalization) didn't help out (a lot) either, but the accuracy is almost 80 %": euhm, am I wrong, or is this a huge difference and thus a big help?
I was initially expecting 98 % (1.77% error) from the OpenCV-Mnist-Report I read... Well anyway, I already edited it: Training-auto (with feature vector normalization) raised the accuracy to almost 80 %!
Again, like Lorena said, it is quite difficult to grasp what they did to get that accuracy. They might have used another SVM implementation, tweaked the parameters a bit, ...
As stated in the paper's abstract, the author writes that they "rely on the OpenCV implementations of k-Nearest Neighbor, Random Forests, and Support Vector Machines classifiers." He also writes: "I [the author] used the train_auto function of OpenCV, which splits the training set into a certain number of subsets, and perform cross-validation to optimize the parameters, by repeated training and testing using each time one of the subsets as testing set and the others as training set." Parameters used after optimization: C = 10 and epsilon = 0.01.
Do you really know which samples of the whole MNIST dataset he used as training and test sets? That is another thing to keep in mind, even more so when using raw pixel information, which is neither consistent nor robust. It is probably another source of the discrepancy in the results.
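The cross-validation scheme described in that quote can be sketched as a plain k-fold index split. This is a generic illustration, not OpenCV's internal trainAuto code (which may differ in how it orders or balances the subsets); FoldSplit and split_fold are hypothetical names, and contiguous, unshuffled folds are an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one fold: indices used for training and
// indices held out for validation.
struct FoldSplit {
    std::vector<std::size_t> train;
    std::vector<std::size_t> validation;
};

// Split sample indices 0..n-1 for fold f of k (0 <= f < k).
// Fold f holds out the contiguous block [f*n/k, (f+1)*n/k).
FoldSplit split_fold(std::size_t n, std::size_t k, std::size_t f)
{
    assert(k > 0 && f < k);
    const std::size_t begin = f * n / k;
    const std::size_t end = (f + 1) * n / k;

    FoldSplit s;
    for (std::size_t i = 0; i < n; ++i) {
        if (i >= begin && i < end)
            s.validation.push_back(i);
        else
            s.train.push_back(i);
    }
    return s;
}
```

With the 60000 MNIST training samples and the kFold value of 10 passed to trainAuto in the question, each fold under this scheme would hold out 6000 samples and train on the other 54000, repeated once per fold for every candidate parameter combination.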