# dnn openpose sample - expected image resolution

In dnn/openpose.cpp, which uses the network definition from CMU (prototxt link), the input layer shape is given as [1,3,368,368].

Does it mean that the network is expecting a square image as an input? What happens if an image is given in its original aspect ratio?

The current implementation of the openpose library defines the input layer shape at runtime (prototxt link). Is it possible to do something similar with OpenCV?


Does it mean that the network is expecting a square image as an input?

It means the input image will be resized to 368x368 (the size the network was trained on). You don't need to do this yourself; it is done in blobFromImage().

Changing the width or height values might not be a good idea (for this pretrained model), as the resulting heatmap shapes will change and the overall accuracy will go down.
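To make the point about heatmap shapes concrete, here is a minimal sketch. It assumes the network downsamples by a fixed factor of 8, which is only inferred from the 368x368 input producing 46x46 heatmaps; the function name `heatmap_shape` is made up for illustration, and the integer division ignores any padding effects in the real layers.

```python
# Assumption: overall downsampling factor of 8, inferred from 368 / 46.
STRIDE = 8


def heatmap_shape(input_height, input_width, stride=STRIDE):
    """Return the (rows, cols) of the heatmap for a given network input size."""
    return input_height // stride, input_width // stride


# The size the pretrained model expects.
print(heatmap_shape(368, 368))
# A wider input keeps the height but changes the heatmap width.
print(heatmap_shape(368, 656))
```

Any code that post-processes the heatmaps would have to cope with the new shape, in addition to the accuracy concern mentioned above.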


Thank you. From the blobFromImage() definition and leaving the default parameters, it looks like the image will be first resized uniformly so that the smallest dimension is equal to 368, and then cropped from the center.

For a standard widescreen 1280x720 image, that means that the left and right sides of the image will be cropped away, so no person should be detected if it's outside the central region, is that correct?
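The cropped region can be worked out with a few lines of arithmetic. This is only a sketch of the resize-then-center-crop behaviour described in this comment; it does not call OpenCV, the helper name `visible_region` is hypothetical, and whether cropping is actually applied depends on the `crop` argument of blobFromImage(), whose default may differ between OpenCV versions.

```python
def visible_region(width, height):
    """Return (x0, y0, x1, y1), in original pixel coordinates, of the part of
    the image that survives a uniform resize followed by a center crop to a
    square. The target size (368) cancels out: the surviving region is simply
    a centered square whose side is the smaller image dimension."""
    side = min(width, height)
    x0 = (width - side) // 2
    y0 = (height - side) // 2
    return x0, y0, x0 + side, y0 + side


x0, y0, x1, y1 = visible_region(1280, 720)
print("kept columns:", x0, "to", x1)
print("kept rows:", y0, "to", y1)
print("columns lost on each side:", x0)
```

For a 1280x720 frame this keeps only the central 720 columns, so a person standing entirely in the outer 280 pixels on either side would not reach the network.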

( 2018-05-17 04:15:35 -0500 )

yes, you're right about the cropping.

maybe using images that large is a bad idea in the first place here, since the heatmaps are only 46x46. So, the larger the image, the larger the position error.
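A rough way to quantify that position error is to ask how many original pixels each heatmap cell covers. The sketch below assumes the 46x46 heatmap is mapped back linearly onto a square source region; `pixels_per_heatmap_cell` is an illustrative name, not part of any library.

```python
def pixels_per_heatmap_cell(source_side, heatmap_side=46):
    """Return how many source pixels one heatmap cell spans along one axis."""
    return source_side / heatmap_side


# At the network's own input size, one cell spans 8 pixels.
print(pixels_per_heatmap_cell(368))
# For the 720x720 center crop of a 1280x720 frame, one cell spans more pixels.
print(round(pixels_per_heatmap_cell(720), 2))
```

A keypoint can only be located to within about one cell without extra interpolation, so the uncertainty in original pixels grows in proportion to the source size.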

( 2018-05-17 04:22:11 -0500 )
