MLP sigmoid output +/-epsilon

asked 2014-09-15 06:51:13 -0600

thomas

This may seem like a duplicate of this question, but the difference is that there I was asking whether the output range is [-1,1] or [0,1]. I have accepted that the range is [0,1] if the activation function is the sigmoid with alpha != 0 and beta != 0 (as stated in the documentation). However, the output range I actually observe looks more like [0-eps, 1+eps].

My question is: Why is there a small epsilon and how can I turn this off?

One thing I could think of is that the output neurons aren't sigmoid units but linear units. Although it is explicitly stated that all neurons have the same activation function, this could explain this behavior.
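To make that reasoning concrete, here is a standalone sketch of the symmetric sigmoid as I read it from the documentation, f(x) = beta*(1 - exp(-alpha*x)) / (1 + exp(-alpha*x)). The function name `sigmoidSym` is made up for illustration and is not part of OpenCV, and the sketch only covers the formula itself, not any scaling the library may apply to inputs or outputs. Taken alone, the formula stays within ±beta, so a pure sigmoid output unit with beta = 1 could not return 1.026.

```cpp
#include <cmath>

// Hypothetical helper (not an OpenCV function): evaluates the documented
// symmetric sigmoid  f(x) = beta * (1 - exp(-alpha*x)) / (1 + exp(-alpha*x)).
// That expression is algebraically equal to beta * tanh(alpha * x / 2);
// the tanh form is used here because it does not overflow for large |x|.
double sigmoidSym(double x, double alpha, double beta) {
    return beta * std::tanh(alpha * x / 2.0);
}
```

With alpha = 1 and beta = 1 (the values passed to the constructor in the example below), this function never leaves [-1, 1], however large the input gets.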

Here is a small example that shows what I mean:

#include <cmath>      // pow
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

using namespace cv;
using namespace std;

int main() {

    int POS = 1, NEG = 0;

    int SAMPLES = 100;
    float SPLIT = 0.8;

    float C_X = 0.5;
    float C_Y = 0.5;
    float R = 0.3;

    Mat X(SAMPLES, 2, CV_32FC1);
    Mat Y(SAMPLES, 1, CV_32FC1);

    randu(X, 0, 1);

    for(int i = 0; i < SAMPLES; i++){
        Y.at<float>(i,0) = pow((X.at<float>(i,0) - C_X),2) + pow((X.at<float>(i,1) - C_Y),2) < pow(R,2) ? POS : NEG;
    }

    Mat X_train = X(Range(0, (int)(SAMPLES*SPLIT)), Range::all());
    Mat Y_train = Y(Range(0, (int)(SAMPLES*SPLIT)), Range::all());

    Mat X_test = X(Range((int)(SAMPLES*SPLIT), SAMPLES), Range::all());
    Mat Y_test = Y(Range((int)(SAMPLES*SPLIT), SAMPLES), Range::all());

    CvANN_MLP_TrainParams params(
                 cvTermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 1000, 0.000001),
                 CvANN_MLP_TrainParams::BACKPROP,
                 0.1,
                 0.1);

    Mat layers = (Mat_<int>(3,1) << 2, 4, 1);

    CvANN_MLP net(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);
    net.train(X_train, Y_train, Mat(), Mat(), params);

    Mat predictions(Y_test.size(), CV_32F); 
    net.predict(X_test, predictions);

    cout << predictions << endl;

    Mat error = predictions-Y_test;
    multiply(error, error, error);

    float mse = sum(error)[0]/error.rows;

    cout << "MSE: " << mse << endl;

    return 0;
}

For me this produces the following output:

[0.9940818;
0.087859474;
0.072328083;
0.032660298;
-0.0090373717;
0.056480117;
0.13302;
-0.025581671;
0.32763073;
1.0263158;
0.29676101;
0.056798562;
0.070351392;
1.0213233;
0.006240299;
0.96525788;
0.071746305;
1.0048869;
-0.015669812;
0.0023532249]
MSE: 0.0326775

As you can see, there are values just below 0 and above 1.
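As a stopgap, the predictions can be clamped to [0,1] after the fact. This is only a post-processing sketch, assuming the values have been copied out of the prediction matrix into a `std::vector<float>`; `clampToUnit` is a made-up helper name, not an OpenCV setting, and it does not explain where the epsilon comes from.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical post-processing helper: returns a copy of the predictions
// with every value limited to the interval [0, 1].
std::vector<float> clampToUnit(const std::vector<float>& values) {
    std::vector<float> clamped(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        // Raise values below 0 to 0, then lower values above 1 to 1.
        clamped[i] = std::min(1.0f, std::max(0.0f, values[i]));
    }
    return clamped;
}
```

Applied to the output above, -0.0090373717 would become 0 and 1.0263158 would become 1, while values already inside the interval stay unchanged.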


Comments

I have a similar problem. Any comments/replies by experts on this issue?

Jeannie14 (2014-09-26 04:09:42 -0600)