Normal Bayes classifier taking long to train (OpenCV, C++)

asked 2016-05-31 10:05:36 -0500

Angulu

I am training a normal Bayes classifier with OpenCV 3.1.0 (C++) to classify face images. I pass a floating-point matrix as training data and a matrix of integers as labels (responses). OpenCV does not throw any exception, but the classifier runs for a very long time without finishing training. Below is my code. Kind regards

    void trainBayes(Mat hists, vector<int> labels)
    {
        Mat responses(labels);
        responses.convertTo(responses, CV_32SC1);
        Ptr<NormalBayesClassifier> bayes = NormalBayesClassifier::create();
        cout << "Training Bayes..." << endl;
        bayes->train(hists, ml::ROW_SAMPLE, responses);
        if (bayes->isTrained())
        {
            cout << "Normal Bayes Classifier is Trained" << endl;
            bayes->save(".\\Trained Models\\LBPBayes.xml");
        }
    }



In my experience, a NormalBayesClassifier copes badly with long feature vectors (what are you feeding it? Raw pixels? That would be e.g. 10,000 features for a 100x100 image), both in accuracy and in training time.

Did you try alternatives, like SVM, ANN_MLP, or KNearest?

berak ( 2016-05-31 10:56:22 -0500 )

Maybe that is the reason. My feature matrix is 600 rows by 2304 cols, i.e. 600 samples with 2304 features each (1,382,400 values in total). I have successfully trained ANN, SVM, and KNN on the same feature matrix.

Angulu ( 2016-05-31 11:10:00 -0500 )

I have reduced the feature vector to 2 x 16 (32 features), but the Normal Bayes classifier is still taking forever to train. Could there be any other issue apart from feature size?

Angulu ( 2016-05-31 12:06:21 -0500 )

Normal Bayes, as @berak said, is not meant for large feature vectors. It is mainly used for simple post-processing steps after exhaustive machine learning techniques, to refine results. How many data points are you supplying to the naive Bayes? I normally use a set of about 100 samples at most, because too much data prolongs the decision and the building of the model for ages :)

StevenPuttemans ( 2016-06-01 05:51:58 -0500 )

Thank you so much for these comments! They saved me time debugging what was wrong with my training!

train_features.rows = 4544, train_features.cols = 2916
train_labels.rows = 4544, train_labels.cols = 1
test_features.rows = 1156, test_features.cols = 2916
test_labels.rows = 1156, test_labels.cols = 1
Training Bayes: trainset: [2916 x 4544] testset: [2916 x 1156]
mkc ( 2016-07-21 22:29:06 -0500 )