Training CNN for NIST digits using tiny-dnn

asked 2018-04-12 01:58:45 -0600

lama123

I have been trying to train a CNN for digit recognition using the tiny-dnn library. The database is NIST 19, with 1000 samples per class for training and 30 per class for testing, so the training set has 1000*10 = 10000 samples in total. OpenCV is used for image processing. The maximum accuracy obtained was 40%. Is this due to the low number of samples? How can I improve the accuracy?

The code is given below

void ConvolutionalNN::train()
{
    network<sequential> net;

    // add layers
    net << conv(32, 32, 5, 1, 6) << tiny_dnn::activation::tanh()  // in:32x32x1, 5x5conv, 6fmaps
        << ave_pool(28, 28, 6, 2) << tiny_dnn::activation::tanh() // in:28x28x6, 2x2pooling
        << fc(14 * 14 * 6, 120) << tiny_dnn::activation::tanh()   // in:14x14x6, out:120
        << fc(120, 10);                     // in:120,     out:10

    assert(net.in_data_size() == 32 * 32);
    assert(net.out_data_size() == 10);

    DatabaseReader db;
    db.readTrainingFiles();

    // hold labels -> training filenames
    std::vector<int> labels = db.getTrainLabels();
    std::vector<std::string> trainingFilenames = db.getTrainFileNames();

    std::vector<label_t> train_labels;
    std::vector<vec_t> train_images;

    // loop over training files
    for (std::size_t index = 0; index < trainingFilenames.size(); index++)
    {
        // output on which file we are training
        std::cout << "Analyzing label -> file: " <<  labels[index] << "|" <<  trainingFilenames[index] << std::endl;

        // read image file (grayscale); skip files that fail to load
        cv::Mat imgMat = cv::imread(trainingFilenames[index], cv::IMREAD_GRAYSCALE);
        if (imgMat.empty()) continue;

        Mat nonZero;
        Mat invert = 255 - imgMat;
        findNonZero(invert, nonZero);
        Rect bb = boundingRect(nonZero);
        Mat img = invert(bb);

        int w = 32, h = 32;
        cv::Mat resized;
        cv::resize(img, resized, cv::Size(w, h));

        imshow("img", resized);
        waitKey(30);
        //convert to float

        resized.convertTo(resized, CV_32FC1);
        cv::normalize(resized,resized, -1, 1, NORM_MINMAX);

        //convert to vec_t

        vec_t d;
        tiny_dnn::float_t *ptr = resized.ptr<tiny_dnn::float_t>(0);
        d = tiny_dnn::vec_t(ptr, ptr + resized.cols * resized.rows );

        train_images.push_back(d);
        train_labels.push_back(labels[index]);


    }

    // declare optimization algorithm
    adagrad optimizer;

    cout << "Training Started" << endl;

    // train (50-epoch, 30-minibatch)
    net.train<mse, adagrad>(optimizer, train_images, train_labels, 30, 50);

    cout << "Training Completed" << endl;



    // save
    net.save("net");

}

Thanks, Amal
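For readers checking the layer dimensions used in the code above, here is a minimal sketch of the size arithmetic. It assumes an unpadded ("valid") stride-1 convolution and non-overlapping pooling, which is what the in-code comments imply. The helper names `conv_out`, `pool_out` and `fc_inputs` are hypothetical and are not part of tiny-dnn.

```cpp
#include <cstddef>

// Output side length of an unpadded ("valid") convolution with stride 1.
constexpr std::size_t conv_out(std::size_t in, std::size_t kernel) {
    return in - kernel + 1;
}

// Output side length of non-overlapping pooling with the given window size.
constexpr std::size_t pool_out(std::size_t in, std::size_t window) {
    return in / window;
}

// Number of inputs the first fully connected layer must accept.
constexpr std::size_t fc_inputs(std::size_t side, std::size_t feature_maps) {
    return side * side * feature_maps;
}

// Compile-time checks against the numbers in the posted network.
static_assert(conv_out(32, 5) == 28, "5x5 conv on 32x32 gives 28x28");
static_assert(pool_out(28, 2) == 14, "2x2 pooling on 28x28 gives 14x14");
static_assert(fc_inputs(14, 6) == 14 * 14 * 6, "matches fc(14 * 14 * 6, 120)");
```

Under these assumptions the layer sizes in the question are consistent with each other, so the low accuracy is unlikely to come from a shape mismatch.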

Comments

  • 1000 images per class is not a low number (also, given how small your network is).
  • 50 epochs might not be enough. Try to watch the training progress: you can add callbacks to the train method and calculate the loss per epoch, or similar. What is your learning rate?
  • Make sure to use exactly the same preprocessing for test and train samples.
  • I don't know NIST 19, but for handwritten characters, deskewing is a good idea.
berak ( 2018-04-12 02:29:37 -0600 )
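As an illustration of the deskewing suggestion, here is a minimal sketch of a moment-based shear correction, written in plain C++ so it does not depend on any library. It assumes a row-major image with non-negative pixel values (bright strokes on a dark background), so it would have to run before any normalisation to [-1, 1]. The function names `estimate_skew` and `deskew` are hypothetical, and nearest-neighbour sampling is used only to keep the example short.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Skew estimate from image moments: mu11 / mu02, with pixel values as weights.
// The image is row-major, w columns by h rows.
double estimate_skew(const std::vector<float>& img, int w, int h) {
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            const double v = img[static_cast<std::size_t>(y) * w + x];
            m00 += v;
            m10 += v * x;
            m01 += v * y;
        }
    if (m00 <= 0.0) return 0.0;  // empty image: nothing to correct
    const double cx = m10 / m00, cy = m01 / m00;

    double mu11 = 0.0, mu02 = 0.0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            const double v = img[static_cast<std::size_t>(y) * w + x];
            mu11 += v * (x - cx) * (y - cy);
            mu02 += v * (y - cy) * (y - cy);
        }
    if (std::abs(mu02) < 1e-6) return 0.0;  // no vertical extent
    return mu11 / mu02;
}

// Horizontal shear about the middle row that undoes the estimated skew.
std::vector<float> deskew(const std::vector<float>& img, int w, int h) {
    const double skew = estimate_skew(img, w, h);
    std::vector<float> out(img.size(), 0.0f);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            // Source column for this output pixel (nearest neighbour).
            const long xs = std::lround(x + skew * (y - 0.5 * h));
            if (xs >= 0 && xs < w)
                out[static_cast<std::size_t>(y) * w + x] =
                    img[static_cast<std::size_t>(y) * w + static_cast<std::size_t>(xs)];
        }
    return out;
}
```

Whatever correction is chosen, it should be applied identically to training and test images, as the comment above points out.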

OK, I will try this out. The MNIST example on the tiny-dnn website does a -1 to 1 normalisation; that is why I do the same.

lama123 ( 2018-04-12 02:56:05 -0600 )

Yes, I've seen that, too (and redacted that part of the comment).

berak ( 2018-04-12 03:14:11 -0600 )

And on the MNIST digits they are getting more than 98% accuracy with this architecture, hence my doubt.

lama123 ( 2018-04-12 03:16:41 -0600 )

"in mnist digits they are getting more than 98% accuracy" a link?

32x32 in the MNIST database? It is 28*28.

LBerger ( 2018-04-12 03:32:27 -0600 )

link text

This is the link

The change in size should be due to the padding.

lama123 ( 2018-04-12 03:47:08 -0600 )

@LBerger, yes, the original images are 28x28, but they add 2px padding

berak ( 2018-04-12 03:57:57 -0600 )
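To make the 28x28 to 32x32 step concrete, here is a minimal sketch of padding a row-major image with a 2-pixel border. The helper `pad_image` is hypothetical, and the choice of fill value is an assumption: for images already normalised to [-1, 1] with a dark background, -1 would be the natural choice.

```cpp
#include <cstddef>
#include <vector>

// Surround a row-major w x h image with a border of `pad` pixels set to `fill`.
std::vector<float> pad_image(const std::vector<float>& img,
                             std::size_t w, std::size_t h,
                             std::size_t pad, float fill) {
    const std::size_t pw = w + 2 * pad;  // padded width
    const std::size_t ph = h + 2 * pad;  // padded height
    std::vector<float> out(pw * ph, fill);
    for (std::size_t y = 0; y < h; ++y)
        for (std::size_t x = 0; x < w; ++x)
            out[(y + pad) * pw + (x + pad)] = img[y * w + x];
    return out;
}
```

Calling `pad_image(img, 28, 28, 2, -1.0f)` on a 28x28 image yields 32 * 32 = 1024 values, which matches the 32x32 input the network in the question expects.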

Using Caffe with 1000*10 images (I'm not sure that there are 1000 images per class), I got:

**********************************************************
I0412 11:08:25.129899 10352 solver.cpp:330] Iteration 40000, Testing net (#0)
I0412 11:08:25.145503 10352 net.cpp:676] Ignoring source layer data
I0412 11:08:25.551775 10352 solver.cpp:397]     Test net output #0: accuracy = 0.97
I0412 11:08:25.551775 10352 solver.cpp:397]     Test net output #1: loss = 0.102418 (* 1 = 0.102418 loss)

and proto.txt is here

LBerger ( 2018-04-12 04:15:19 -0600 )

That is 97% accuracy on MNIST, right? So I wonder why we are not getting that accuracy on NIST (link text).

lama123 ( 2018-04-12 05:46:23 -0600 )

Do you read only the digits (30->39) from the database, and do you shuffle the data?

LBerger ( 2018-04-12 07:05:05 -0600 )
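On the shuffling question: if the files happen to be read class by class (this thread does not say how `DatabaseReader` orders them), consecutive minibatches could contain a single digit each. Below is a minimal sketch of shuffling images and labels with one shared permutation so that each image keeps its label. `shuffle_together` is a hypothetical helper, it assumes both vectors have the same length, and the fixed seed is only there to make runs repeatable.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Apply one random permutation to both containers so sample i keeps label i.
// Precondition: images.size() == labels.size().
template <typename Image, typename Label>
void shuffle_together(std::vector<Image>& images,
                      std::vector<Label>& labels,
                      unsigned seed) {
    std::vector<std::size_t> order(images.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);

    std::vector<Image> shuffled_images;
    std::vector<Label> shuffled_labels;
    shuffled_images.reserve(images.size());
    shuffled_labels.reserve(labels.size());
    for (std::size_t i : order) {
        shuffled_images.push_back(std::move(images[i]));
        shuffled_labels.push_back(labels[i]);
    }
    images.swap(shuffled_images);
    labels.swap(shuffled_labels);
}
```

In the code from the question, this would be called on `train_images` and `train_labels` once the loading loop has finished and before training starts.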

1 answer


answered 2018-04-12 09:38:19 -0600

lama123

The problem was that the learning rate was too high. When the learning rate was changed from the default value of 0.01 to 0.0001, the accuracy increased from 40% to 83.6%.

The changed line was

optimizer.alpha = static_cast<tiny_dnn::float_t>(0.0001);

Thanks a lot @LBerger for the support

Amal
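To see why the learning rate matters so much here, the following is a minimal sketch of the textbook Adagrad update in plain C++. It is not tiny-dnn's own class: `AdagradSketch` and its members are hypothetical names. With this rule the first step for each weight has a magnitude of roughly alpha regardless of how large the gradient is, so alpha = 0.01 moves every weight by about 0.01 on the first minibatch, while alpha = 0.0001 moves it by about 0.0001.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Textbook Adagrad for a flat weight vector (illustrative, not tiny-dnn's class).
struct AdagradSketch {
    double alpha;                // learning rate
    double eps;                  // small constant to avoid division by zero
    std::vector<double> sum_sq;  // running sum of squared gradients, one per weight

    AdagradSketch(double alpha_, std::size_t n, double eps_ = 1e-8)
        : alpha(alpha_), eps(eps_), sum_sq(n, 0.0) {}

    // One update step: w[i] -= alpha * g[i] / (sqrt(sum of g[i]^2 so far) + eps).
    void update(std::vector<double>& w, const std::vector<double>& grad) {
        for (std::size_t i = 0; i < w.size(); ++i) {
            sum_sq[i] += grad[i] * grad[i];
            w[i] -= alpha * grad[i] / (std::sqrt(sum_sq[i]) + eps);
        }
    }
};
```

Later steps shrink as the squared gradients accumulate, but a first step that is too large can still put the weights in a poor region, which would be consistent with the improvement reported above.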

