
Adding speed as a feature in an image based ANN

Good day everyone, thanks for taking the time to look into my question.

Based on this example: http://answers.opencv.org/question/119300/how-to-start-with-neural-network-implementation-with-opencv-and-c/

I've started implementing an ANN classifier to drive a car in Unity.

I read the camera feed as I control the car to record my training data.


Then I train a neural network using this code:

int nclasses = 5; 
String att = FOLDER; 
vector<String> fn;

Mat train_data, train_labels, test_data, test_labels;
int cnt = 0;
for (int p = 0; p < nclasses; p++)
{ 
    cerr << "p " << p << "\r";

    glob(att + std::to_string(p), fn, false);

    for (size_t i = 0; i < fn.size(); i++)
    {
        cv::Mat image = cv::imread(fn[i], 0);

        if (image.empty()) {
            cerr << "no !" << fn[i] << endl; continue;
        }

        image.convertTo(image, CV_32F); // no 1.0/255 scaling: values stay in [0, 255]
        resize(image, image, Size(80, 80));

        Mat feature = image;

        train_data.push_back(feature.reshape(1, 1));
        train_labels.push_back(p);
    }
}

// setup the ann:
int nfeatures = train_data.cols;
Ptr<ml::ANN_MLP> ann = ml::ANN_MLP::create();
Mat_<int> layers(4, 1);
layers(0) = nfeatures;     // input
layers(1) = nclasses * 8;  // hidden
layers(2) = nclasses * 4;  // hidden
layers(3) = nclasses;      // output, 1 pin per class.
ann->setLayerSizes(layers);
ann->setActivationFunction(ml::ANN_MLP::SIGMOID_SYM, 0, 0);
ann->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 300, 0.0001));
ann->setTrainMethod(ml::ANN_MLP::BACKPROP, 0.0001);

// ann requires "one-hot" encoding of class labels:
Mat train_classes = Mat::zeros(train_data.rows, nclasses, CV_32FC1);
for (int i = 0; i < train_classes.rows; i++)
{
    train_classes.at<float>(i, train_labels.at<int>(i)) = 1.f;
}
cerr << train_data.size() << " " << train_classes.size() << endl;

ann->train(train_data, ml::ROW_SAMPLE, train_classes);

ann->save("output.ann");
return 0;

At runtime I read the camera feed and predict whether I should go forward, left, right, and so on: https://youtu.be/PB5NiIGFTNo
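The decision step can be sketched roughly as follows, assuming the response row (for example from `ann->predict(sample, response)`) has been copied into a `std::vector<float>` with one value per class. `pickClass` is a hypothetical helper, and which index means forward, left or right depends on how the training folders were numbered.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: return the index of the strongest output pin,
// i.e. the predicted class. Returns -1 for an empty output vector.
int pickClass(const std::vector<float>& outputs)
{
    if (outputs.empty()) return -1;
    // max_element returns the first maximum if several pins tie.
    auto best = std::max_element(outputs.begin(), outputs.end());
    return static_cast<int>(std::distance(outputs.begin(), best));
}
```

The runtime sample would need the same preprocessing as the training rows (grayscale, CV_32F, 80x80, reshaped to a single row), otherwise the outputs are not comparable.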

With only 3500 images (2 laps), I get the result shown in the video.

My question is:

How can I add the speed as a feature, so that the same image can produce a different classification depending on the speed? I'm using 80x80 images, so 6400 floats per sample, and I'm afraid that adding the speed as a single float won't carry any weight in the calculation.

Speed is just one example; friction, weather, the mass of the car and similar values could also be added during the training phase.
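To make the idea concrete, here is a minimal sketch of what I have in mind, not a tested solution: scale the pixels to [0, 1], normalize each scalar into the same range, and append the scalars to the flattened image. The names `normalizeScalar` and `buildFeatureRow`, the min/max bounds and the `repeat` count are all hypothetical. Repeating a scalar is only one crude way to give it more input pins, and whether it helps would have to be checked by training.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map a raw scalar (e.g. speed) into [0, 1] using assumed bounds.
// Values outside the bounds are clamped.
float normalizeScalar(float value, float minValue, float maxValue)
{
    if (maxValue <= minValue) return 0.f;
    float t = (value - minValue) / (maxValue - minValue);
    return std::min(1.f, std::max(0.f, t));
}

// Build one sample row: pixels scaled to [0, 1], followed by each
// already normalized scalar repeated `repeat` times.
std::vector<float> buildFeatureRow(const std::vector<std::uint8_t>& pixels,
                                   const std::vector<float>& scalars,
                                   std::size_t repeat = 1)
{
    assert(repeat >= 1 && "each scalar must appear at least once");
    std::vector<float> row;
    row.reserve(pixels.size() + scalars.size() * repeat);
    for (std::uint8_t p : pixels)
        row.push_back(static_cast<float>(p) / 255.f);
    for (float s : scalars)
        for (std::size_t r = 0; r < repeat; ++r)
            row.push_back(s);
    return row;
}
```

With an 80x80 image and one speed value, `repeat = 1` gives a row of 6401 floats. Such a row could be wrapped with `cv::Mat(row, true).reshape(1, 1)` and pushed into `train_data`; the input layer size then follows from `train_data.cols`. The same row layout would have to be used at prediction time.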

Thank you for your help.