Not correctly configuring ANN training data and layers

asked 2020-08-15 17:54:14 -0600

updated 2020-08-16 01:44:27 -0600

berak

I have 1100+ images of size 64x64 that I'd like to classify as class 1.0 (positive) or -1.0 (negative) with a neural network from cv::ml::ANN_MLP. The input layer should have a size of 64x64=4096, and the output layer a size of 2, one for 1.0 and one for -1.0. I managed to get classification working with SVMs, KNNs and R-Trees, but I appear to be making mistakes now that I am configuring a NN. The relevant section of the code below is marked with the comment "//required neural network settings".

#include <iostream>
#include <iomanip>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/objdetect.hpp>
#include <opencv2/ml.hpp>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
    try
    {
    //********************Just reading images********************
    //read all .png images from the directory specified by prefix,
    //image names are numbers from 0-599
    std::string prefix = "true/";
    std::string ext = ".png";
    cv::Mat data, labels;
    // loading positive training data
    int numPosImages = 600;
    cout << "loading images" << std::endl;
    for (long i = 0; i < numPosImages; i++) {
        std::string name(prefix);
        std::ostringstream ss; 
        ss << i; 
        name += ss.str();
        name += ext;
        cv::Mat m = cv::imread(name, cv::IMREAD_GRAYSCALE);
        m.convertTo(m, CV_32F);
        m = m.reshape(0, 1);
        data.push_back(m); // 1d single row
        labels.push_back(1);
    }
    // loading negative training data
    //read all .png images from the directory specified by nprefix,
    //image names are numbers from 0-595
    std::string nprefix = "false/";
    int numNegImages = 596;
    for (long i = 0; i < numNegImages; i++) {
        std::string name(nprefix);
        std::ostringstream ss;
        ss << i;
        name += ss.str();
        name += ext;
        cv::Mat m = cv::imread(name, cv::IMREAD_GRAYSCALE);
        m.convertTo(m, CV_32F);
        m = m.reshape(0, 1);
        data.push_back(m); // 1d single row
        labels.push_back(-1);
    }
    //required neural network settings
    data.convertTo(data, CV_32F);
    labels.convertTo(labels, CV_32F);
    cv::Ptr<cv::ml::ANN_MLP> ai = cv::ml::ANN_MLP::create();
    cv::Mat layers(1, 3, CV_32SC1);
    layers.at<int>(0, 0) = data.cols; // input layer: one neuron per feature
    layers.at<int>(0, 1) = 16;        // hidden layer
    layers.at<int>(0, 2) = 2;         // output layer
    ai->setLayerSizes(layers);
    ai->setActivationFunction(cv::ml::ANN_MLP::SIGMOID_SYM);
    ai->setTrainMethod(cv::ml::ANN_MLP::BACKPROP);
    // prepare the training data
    cv::Ptr<cv::ml::TrainData> trainingData = cv::ml::TrainData::create(data, cv::ml::SampleTypes::ROW_SAMPLE, labels);
    // ai training
    cout << "training ai" << std::endl;
    ai->train(trainingData);
    }
    catch (const cv::Exception& e)
    {
        // the posted snippet was cut off here; this minimal handler is assumed
        std::cerr << e.what() << std::endl;
        return 1;
    }
    return 0;
}

The message I get in the console is:

OpenCV(4.4.0) Error: Bad argument (output training data should be a floating-point matrix with the number of rows equal to the number of training samples and the number of columns equal to the size of last (output) layer) in cv::ml::ANN_MLPImpl::prepare_to_train.

I don't understand this. I've converted both the training data and the labels to float, so how else am I supposed to set up 4096 inputs and 2 outputs?

answered 2020-08-15 21:08:05 -0600

berak

updated 2020-08-16 01:46:40 -0600

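the ANN needs "one-hot encoded" labels: 2 columns (one for each class), like [1,0] for -1 and [0,1] for 1. your labels Mat has a single column, while the output layer has 2 neurons, which is what the error complains about.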

you could add a loop before the training to translate your current labels like:

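cv::Mat one_hot = cv::Mat::zeros(labels.rows, 2, CV_32F); // 2 classes, all 0
for (int i=0; i<labels.rows; i++) {
    // careful with the type here ! labels are CV_32F after your convertTo(),
    // use at<int> if you run this loop before that conversion
    float k = labels.at<float>(i);
    if (k < 0) one_hot.at<float>(i, 0) = 1; // class -1 -> [1,0]
    else       one_hot.at<float>(i, 1) = 1; // class  1 -> [0,1]
}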

cv::Ptr<cv::ml::TrainData> trainingData = cv::ml::TrainData::create(data, cv::ml::SampleTypes::ROW_SAMPLE, one_hot); // instead of "labels"
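
to make the mapping concrete, here is a minimal sketch in plain C++ (no OpenCV, so it compiles on its own). encodeOneHot and decodeClass are hypothetical helper names, not part of cv::ml. the first builds the [1,0] / [0,1] rows described above. the second reads a 2-value output row back into -1 / 1 by picking the larger value, which is one common way to interpret such an output, assuming column 0 stands for -1 and column 1 for 1:

```cpp
#include <array>
#include <vector>

// Hypothetical helper: turn labels in {-1, 1} into one-hot rows.
// Column 0 stands for class -1, column 1 for class 1.
std::vector<std::array<float, 2>> encodeOneHot(const std::vector<int>& labels) {
    std::vector<std::array<float, 2>> rows;
    rows.reserve(labels.size());
    for (int label : labels) {
        if (label == -1) rows.push_back(std::array<float, 2>{{1.0f, 0.0f}});
        else             rows.push_back(std::array<float, 2>{{0.0f, 1.0f}});
    }
    return rows;
}

// Hypothetical helper: map a 2-value network output back to a class label
// by taking the larger of the two values (a tie falls back to -1).
int decodeClass(const std::array<float, 2>& output) {
    return output[1] > output[0] ? 1 : -1;
}
```

with the real network you would read the two values from the 1x2 output Mat of a prediction and compare them in the same way.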

please also have a look at this sample

(and yea, "output training data should be a floating-point matrix" is a bit misleading, it gets better if you think of it as "ground truth responses for the output", maybe ...)
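
the condition in that error message can be written down as a small check. this is only a sketch of what the message states, not OpenCV's actual code, and responsesMatchNetwork is a made-up name. it takes plain sizes instead of cv::Mat so it compiles without OpenCV:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the shape rule quoted in the error message:
// one response row per training sample, one response column per output neuron.
bool responsesMatchNetwork(std::size_t sampleRows,
                           std::size_t responseRows,
                           std::size_t responseCols,
                           const std::vector<int>& layerSizes) {
    if (layerSizes.empty() || layerSizes.back() <= 0) return false;
    return responseRows == sampleRows &&
           responseCols == static_cast<std::size_t>(layerSizes.back());
}
```

with the numbers from the question (600 + 596 = 1196 samples, layers 4096 / 16 / 2), a 1196x1 labels matrix fails this check, while a 1196x2 one-hot matrix passes it.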
