Ask Your Question

DNN questions

asked 2018-05-20 12:46:21 -0500

sjhalayka

updated 2018-05-20 13:29:12 -0500

I'm trying to understand the code found in berak's answer to a previously asked question:

Thanks again to berak for sharing this code with us. The way that berak chops off the DNN's last few layers and attaches a standard MLP ANN, it's like surgery. Nice work.

I hope that someone can answer a few questions that I have about the code:

1) How do you know that SqueezeNet has 67 layers, and how do you print out the properties of the last 10 layers?

2) There is code that states:

Mat_<int> layers(4, 1);
layers << 1000, 400, 100, 2; // the squeezenet pool10 layer has 1000 neurons

So 1000 is the number of neurons in the first hidden layer, which attaches to the pool10 layer? Or does this 1000 neuron layer replace pool10?

And 2 is the number of one-hot encoding variables (two neurons, one per class), right?

How does one decide on 400 and 100 for the other hidden layers? Rule of thumb?




what's the linker error about ?

berak ( 2018-05-20 13:21:53 -0500 )

Sorry about that. I was in the midst of writing a question about linker errors, but I've since figured out the problem. I was linking to an older version of OpenCV, but now I'm linking to 3.4.x and it works fine. Anyway, it kept the data for my previous answer and I forgot to also alter the header.

sjhalayka ( 2018-05-20 13:34:11 -0500 )

don't feel bad, since you solved it on your own !

berak ( 2018-05-20 13:40:09 -0500 )

1 answer


answered 2018-05-20 13:20:57 -0500

berak

hi, @sjhalayka.

1) How do you know that squeezenet has 67 layers, and how do you print out the properties of the last 10 layers?

// 1st, we have to load the darn thing:
std::string modelTxt = "c:/data/mdl/squeezenet/deploy.prototxt";
std::string modelBin = "c:/data/mdl/squeezenet/squeezenet_v1.1.caffemodel";
Net net = dnn::readNetFromCaffe(modelTxt, modelBin);

// it's a long story, but the network won't know its (final) size unless you give it an input image
Mat img = imread(imageFile);
Mat inputBlob = blobFromImage(img, 1.0, Size(227,227), Scalar(), false); 
// ^ yep, you have to *know* which size it was trained upon, originally. magic here !

// this is more or less the "standard processing"
net.setInput(inputBlob);                    // Set the network input.
Mat prob = net.forward("prob");             // Compute output.

// now we can iterate over the internal layers, and print out their properties:
MatShape ms1 { inputBlob.size[0], inputBlob.size[1], inputBlob.size[2], inputBlob.size[3] };
size_t nlayers = net.getLayerNames().size() + 1;        // one off for the hidden input layer
for (size_t i=0; i<nlayers; i++) {
    Ptr<Layer> lyr = net.getLayer((unsigned)i);
    std::vector<MatShape> in, out;
    net.getLayerShapes(ms1, (int)i, in, out);           // query the input/output shapes for this layer
    cout << format("%-38s %-13s ", (i==0?"data":lyr->name.c_str()), (i==0?"Input":lyr->type.c_str()));
    for (auto j:in)  cout << "i" << Mat(j).t() << "  "; // input(s) size
    for (auto j:out) cout << "o" << Mat(j).t() << "  "; // output(s) size
    for (auto b:lyr->blobs) {                           // what the net trains on, e.g. weights and bias
        cout << "b[" << b.size[0];
        for (int d=1; d<b.dims; d++) cout << ", " << b.size[d];
        cout << "]  ";
    }
    cout << endl;
}

2) There is code that states:

Mat_<int> layers(4, 1);
layers << 1000, 400, 100, 2;

yea, pure guesswork here (we only know that there are 1000 inputs and 2 outputs), and from experience it seems better to have 2 hidden layers here, so we can break it down gradually from 1000 -> 400 -> 100 -> 2, instead of going from 1000 -> 2 directly.



So what would one do if there were 10,000 training images? Would this still work?

sjhalayka ( 2018-05-20 13:44:05 -0500 )

And thanks again for your code!!!

sjhalayka ( 2018-05-20 13:50:41 -0500 )

if you have that many samples, you might reconsider training the whole thing from scratch

unfortunately, opencv does not have any means to do so; please look at TensorFlow, Caffe, Torch, etc.

berak ( 2018-05-20 14:01:13 -0500 )

OK, right on. I got free books from Packt: Python Machine Learning and Advanced Machine Learning with Python. I'll have to read them and give DNNs a go.

sjhalayka ( 2018-05-20 14:04:14 -0500 )

I also wonder if 1000 neurons can encode 2^1000 features, or is it just 1000 features?

Does this question even make sense @berak ?

I just wanted to thank you again for all of the code and expertise!

sjhalayka ( 2018-05-20 14:34:34 -0500 )

those 1000 numbers there are not really binary (it's still some sort of "spectrum"), but the next layer (the "softmax") is quite like you describe it: 1000 "bits", on or off (well, only one will be "on", all the others off).

remember, imagenet has 1000 classes, so one neuron/bit per class in the output.


berak ( 2018-05-21 01:36:18 -0500 )

Thanks again! Looks like I’m going to have to do my own DNN from scratch.

sjhalayka ( 2018-05-21 17:34:48 -0500 )

So the pool10 layer has 1000 one-hot encoding neurons because there are 1000 classes?

sjhalayka ( 2018-05-21 19:59:51 -0500 )

there are 1000 classes, but the pool10 layer is NOT one-hot encoded (that's what the following softmax layer would do); instead there are "probabilities" of a sort, one score per class

berak ( 2018-05-22 01:04:31 -0500 )


Asked: 2018-05-20 12:46:21 -0500

Seen: 206 times

Last updated: May 20 '18