Caffe Input Size for Faster R-CNN Models

Hi, I was using OpenCV's dnn module to run inference on images. As mentioned in the sample here, an input size is also required; the image is resized to that size and then fed into the network. For MobileNet-SSD this is 300x300, and the same is specified in its prototxt file below:

name: "MobileNet-SSD"
input: "data"
input_shape {
    dim: 1
    dim: 3
    dim: 300
    dim: 300
}
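
For reference, this is roughly how I understand the preprocessing in the sample: the image is resized to whatever size is passed to cv::dnn::blobFromImage before being set as the "data" input. A minimal sketch, not the exact sample code; the file names are placeholders, and the scale/mean values are just what I believe the MobileNet-SSD sample uses:

#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>

int main()
{
    // Placeholder paths -- substitute the real MobileNet-SSD files.
    cv::dnn::Net net = cv::dnn::readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                                                 "MobileNetSSD_deploy.caffemodel");
    cv::Mat img = cv::imread("image.jpg");

    // The image is resized to 300x300 here, matching the prototxt input_shape,
    // then wrapped into a 4D NCHW blob that becomes the "data" input.
    cv::Mat blob = cv::dnn::blobFromImage(img, 0.007843, cv::Size(300, 300),
                                          cv::Scalar(127.5, 127.5, 127.5), false);
    net.setInput(blob, "data");
    cv::Mat out = net.forward();
    return 0;
}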

So the required input shape here is 300x300, which is why 300x300 is the input size given in the samples' README. But for Faster R-CNN, the following prototxt file is given in opencv_extra:

# Faster-RCNN network. Based on https://github.com/rbgirshick/py-faster-rcnn/blob/master/models/pascal_voc/VGG16/faster_rcnn_alt_opt$
name: "VGG_ILSVRC_16_layers"

input: "data"
input_shape {
    dim: 1
    dim: 3
    dim: 224
    dim: 224
}

So the network requires an image of size 224x224, but in the samples the input size is given as 800x600. I tried it with 800x600 and it worked; it also worked with 800x800 and with the default frame size, and I got the same detections in each case.
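
For clarity, this is roughly what my test boils down to, as far as I understand the sample. Again a sketch, not the actual sample code: the file names are placeholders, and the mean values and the "im_info" handling are what I believe the sample uses for Faster R-CNN:

#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>

int main()
{
    // Placeholder file names -- substitute the actual Faster R-CNN prototxt/caffemodel.
    cv::dnn::Net net = cv::dnn::readNetFromCaffe("faster_rcnn_vgg16.prototxt",
                                                 "VGG16_faster_rcnn_final.caffemodel");
    cv::Mat frame = cv::imread("image.jpg");

    // Whatever size is passed here is what the image is resized to; the prototxt's
    // 224x224 is not enforced. With 800x600 the blob shape is 1x3x600x800.
    cv::Mat blob = cv::dnn::blobFromImage(frame, 1.0, cv::Size(800, 600),
                                          cv::Scalar(102.9801, 115.9465, 122.7717), false);
    net.setInput(blob, "data");

    // If I read the sample right, Faster R-CNN also takes an "im_info" input
    // holding the blob height, width and a scale factor.
    cv::Mat imInfo = (cv::Mat_<float>(1, 3) << 600, 800, 1.6f);
    net.setInput(imInfo, "im_info");

    cv::Mat out = net.forward();
    return 0;
}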

So does OpenCV automatically resize the image to suit the network's input size? If not, why does Faster R-CNN work with any size I provide?

I used OpenCV's master branch (it reports OpenCV 4.0) and the C++ object detection code given in the samples.

Thanks!