Low accuracy of SVM on Android
Hello guys,
I have an Android project that uses face detection (Cascade Classifier). After detecting the face, I crop the eyes and use a descriptor (KAZE) together with Bag of Words and an SVM to recognize open and closed eyes, left and right. I trained the SVM with a set of 1600 images, 400 of each type.
The training is done in C#; after it finishes, I test with some images and all of them are classified correctly.
However, when I import the resulting dictionary and SVM files into the Android app and classify frames from the camera, the results are very poor.
Here are the images and files generated through training:
https://github.com/caiocanalli/OpenCV...
Attached is the C#, Java, and C++ code I'm using.
Any suggestion on how to improve the precision is welcome.
Thank you.
// Usings for Emgu CV 3.x (namespaces may differ slightly between versions).
using System.Collections.Generic;
using System.IO;
using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.Features2D;
using Emgu.CV.ML;
using Emgu.CV.Structure;
using Emgu.CV.Util;

public class Training
{
    private const string Path = "C:/Users/Administrator/Desktop/Data";

    private readonly KAZE _kaze;
    private readonly BFMatcher _bfMatcher;
    private readonly BOWKMeansTrainer _bowKMeansTrainer;
    private readonly BOWImgDescriptorExtractor _bowImgDescriptorExtractor;
    private readonly Dictionary<int, string> _images;

    private Mat _kazeDescriptors;
    private Mat _bowDescriptors;
    private Matrix<int> label;

    public Training()
    {
        int dictionarySize = 10000;

        _kaze = new KAZE(true, true, 0.00001F);
        _bfMatcher = new BFMatcher(DistanceType.L2);
        _bowKMeansTrainer = new BOWKMeansTrainer(
            dictionarySize,
            new MCvTermCriteria(200, 0.00001),
            1,
            KMeansInitType.PPCenters);
        _bowImgDescriptorExtractor =
            new BOWImgDescriptorExtractor(_kaze, _bfMatcher);

        _kazeDescriptors = new Mat();
        _bowDescriptors = new Mat();

        _images = new Dictionary<int, string>
        {
            { 1, "/closedLeftEyes" },
            { 2, "/openLeftEyes" },
            //{ 1, "/closedRightEyes" },
            //{ 2, "/openRightEyes" }
        };
    }
    public void Start()
    {
        // Compute KAZE descriptors
        System.Console.WriteLine("Compute KAZE descriptors...");

        foreach (var item in _images)
        {
            var files = Directory.GetFiles(
                Path + item.Value, "*.png");

            System.Console.WriteLine("Directory: " + item.Value + ". Size: " + files.Length);

            foreach (var file in files)
            {
                var image = new Image<Gray, byte>(file);

                Mat descriptors = new Mat();
                MKeyPoint[] keyPoints = _kaze.Detect(image);
                _kaze.Compute(image,
                    new VectorOfKeyPoint(keyPoints), descriptors);

                _kazeDescriptors.PushBack(descriptors);

                System.Console.WriteLine("Image: " + file);
            }
        }

        _bowKMeansTrainer.Add(_kazeDescriptors);

        // Cluster dictionary
        System.Console.WriteLine("Cluster dictionary...");

        Mat dictionary = new Mat();
        _bowKMeansTrainer.Cluster(dictionary);
        _bowImgDescriptorExtractor.SetVocabulary(dictionary);

        FileStorage fs = new FileStorage(
            "dictionary_left.xml", FileStorage.Mode.Write);
        fs.Write(dictionary, "dictionary");
        fs.ReleaseAndGetString();

        //FileStorage fs = new FileStorage(
        //    "dictionary_right.xml", FileStorage.Mode.Write);
        //fs.Write(dictionary, "dictionary");
        //fs.ReleaseAndGetString();

        // Compute BOW descriptors
        System.Console.WriteLine("Compute BOW descriptors...");

        var labels = new List<int>();

        foreach (var item in _images)
        {
            var files = Directory.GetFiles(
                Path + item.Value, "*.png");

            System.Console.WriteLine("Directory: " + item.Value + ". Size: " + files.Length);

            foreach (var file in files)
            {
                var image = new Image<Gray, byte>(file);

                Mat descriptors = new Mat();
                MKeyPoint[] keyPoints = _kaze.Detect(image);
                _bowImgDescriptorExtractor.Compute(image,
                    new VectorOfKeyPoint(keyPoints), descriptors);

                _bowDescriptors.PushBack(descriptors);
                labels.Add(item.Key);

                System.Console.WriteLine("Image: " + file);
            }
        }

        label = new Matrix<int>(labels.ToArray());
        // Train SVM
        System.Console.WriteLine("Train SVM...");

        SVM svm = new SVM();
        svm.SetKernel(SVM.SvmKernelType.Rbf);
        svm.Type = SVM.SvmType.CSvc;
        svm.TermCriteria = new MCvTermCriteria(400, 0.00001);

        TrainData trainData = new TrainData(
            _bowDescriptors,
            Emgu.CV.ML.MlEnum.DataLayoutType.RowSample,
            label);

        System.Console.WriteLine("C: " + svm.C + " Gamma: " + svm.Gamma);
        bool result = svm.TrainAuto(trainData);
        System.Console.WriteLine("C: " + svm.C + " Gamma: " + svm.Gamma);

        svm.Save("svm_left.xml");
        //svm.Save("svm_right.xml");

        if (!result)
            throw new ...
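For reference, the single-image test mentioned above is not part of the snippet. A minimal sketch of how it could look with the same Emgu CV objects is below; the Classify helper (assumed to sit inside the Training class) is an illustration, not the attached code.

    // Hypothetical helper, not part of the attached code: classify one eye image
    // with the extractor and the SVM trained above (labels: 1 = closed, 2 = open).
    public float Classify(SVM svm, string file)
    {
        var image = new Image<Gray, byte>(file);

        MKeyPoint[] keyPoints = _kaze.Detect(image);
        Mat bowDescriptor = new Mat();
        _bowImgDescriptorExtractor.Compute(image,
            new VectorOfKeyPoint(keyPoints), bowDescriptor);

        return svm.Predict(bowDescriptor);
    }

Whatever this step does on the desktop, the Android/JNI side has to reproduce it exactly (same KAZE parameters, same vocabulary file, same preprocessing of the eye crop); otherwise the descriptors fed to the SVM will not match the training data.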
... and STOP using #include <opencv/cv.h>, please ;)

Hi again @berak. Always helping me, haha.
"10000 features, but only 800 images (per side)" - Actually, I've tried several values. I already searched for an ideal number but did not find an answer. Do you have a suggested value for this number of samples?
"the train images seem to come from some database, and probably don't fit your real life situation" - I cropped the images from the web, from various sources. I resized each one and took care not to include duplicates. A lot of work =/. Actually, I tried the set you suggested here, but its precision was worse.
"your data isn't normalized" - Sorry for my ignorance, but what do you suggest here?
"VLAD is an improvement on BOW, also HOG" - My main fear is that I'm implementing BoW incorrectly. (See the HOG sketch after these replies.)
"... and STOP using #include <opencv/cv.h>, please ;)" - It has already been removed; it is not being used =)
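To make the HOG suggestion quoted above concrete, here is a rough sketch, not code from this thread: it extracts a fixed-length HOG vector from a resized eye crop. The 32x32 window and the block/cell sizes are arbitrary assumptions, and the constructor arguments assume the Emgu CV 3.x HOGDescriptor binding.

    using System.Drawing;     // Size
    using Emgu.CV;
    using Emgu.CV.CvEnum;     // Inter
    using Emgu.CV.Structure;  // Gray

    public static class HogFeatures
    {
        // Rough illustration: one fixed-length HOG vector per eye crop.
        public static float[] Compute(Image<Gray, byte> eye)
        {
            var hog = new HOGDescriptor(
                new Size(32, 32),  // winSize: the resized eye patch
                new Size(8, 8),    // blockSize
                new Size(4, 4),    // blockStride
                new Size(4, 4),    // cellSize
                9);                // nbins

            using (var resized = eye.Resize(32, 32, Inter.Linear))
            {
                return hog.Compute(resized);
            }
        }
    }

Because HOG gives a fixed-length vector per image, it needs no vocabulary or clustering step; the vectors can be stacked row-wise and passed to the same SVM.TrainAuto call.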
I can't see much wrong in the code here, but maybe it's not such a great improvement as you hoped ;(
Try to normalize (L2 or minmax) each feature in the train and test data to [0..1], that is:
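The exact snippet that followed is not preserved in this thread. A minimal sketch of one way to do it with Emgu CV (my interpretation: normalize each BOW descriptor row; the NormalizeDescriptor helper name is hypothetical):

    using Emgu.CV;          // CvInvoke, Mat
    using Emgu.CV.CvEnum;   // NormType

    // Sketch only: bring every BOW descriptor into the [0..1] range before it is
    // used for training or prediction, so train and test data share the same scale.
    static Mat NormalizeDescriptor(Mat bowDescriptor)
    {
        Mat normalized = new Mat();
        // Min-max scaling to [0..1]; use NormType.L2 instead for a unit L2 norm.
        CvInvoke.Normalize(bowDescriptor, normalized, 0, 1, NormType.MinMax);
        return normalized;
    }

    // Training side:     _bowDescriptors.PushBack(NormalizeDescriptor(descriptors));
    // Test/Android side: apply the same call to each descriptor before svm.Predict().

Another reading of the advice is to min-max normalize each column of the full training matrix and reuse those per-column ranges at test time; whichever variant is chosen, the Android side must apply exactly the same scaling.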