How does DNN Module handle large input image sizes for object detection?

asked 2018-07-18 06:33:58 -0600

dro

I am running the TensorFlow SSD-Inception V2 COCO object detection model on images captured from a 4K camera. After cropping the image vertically, my input image size is 3840x1400. I have found that when using the model in OpenCV, I still get successful/valid detections when passing in the entire 3840x1400 input into the CNN without resizing it and specifying (3840, 1400) as the size in my cv2.dnn.blobFromImages call. However, when I run the same model in TensorFlow, it first resizes the input to 300x300 and therefore I miss a lot of detections in my huge image.

My question is: how is OpenCV handling this large input size successfully? Is it tiling the image first, or does it modify the network at all to handle any size input image?

Thanks!


1 answer


answered 2018-07-18 08:38:11 -0600

dkurt


@dro, OpenCV really does process the entire input image at the size you give it. As you said, the TensorFlow graph contains a preprocessing resize node. You can probably get good results in TensorFlow too if you remove this node.

BTW, how efficient is it to process such large inputs in OpenCV?
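
One way to see why the whole image can go through the network without tiling: convolution layers accept any spatial size, and only the size of the resulting feature maps changes. The sketch below is an illustration under assumptions, not OpenCV's or the model's actual code. The helpers `conv_output_size` and `feature_map_shape` and the four-layer `LAYERS` stack are hypothetical and do not reproduce the real SSD-Inception V2 architecture. It also prints the input area ratio, which gives a rough feel for how much more work the larger input implies.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_shape(width, height, layers):
    """Push a (width, height) input through (kernel, stride, padding) layers."""
    for kernel, stride, padding in layers:
        width = conv_output_size(width, kernel, stride, padding)
        height = conv_output_size(height, kernel, stride, padding)
    return width, height


# Hypothetical stack of four 3x3, stride-2, padding-1 convolutions.
# This is NOT the real SSD-Inception V2 layer list, only an illustration.
LAYERS = [(3, 2, 1)] * 4

small = feature_map_shape(300, 300, LAYERS)     # size TensorFlow resizes to
large = feature_map_shape(3840, 1400, LAYERS)   # full cropped 4K frame

# Ratio of input pixels, a rough proxy for the extra convolution work.
area_ratio = (3840 * 1400) / (300 * 300)

print("300x300   ->", small)
print("3840x1400 ->", large)
print("input area ratio: %.1f" % area_ratio)
```

With these assumed layers, the 300x300 input ends at a 19x19 grid and the 3840x1400 input at 240x88, with about 59.7 times the input area, so many more locations get evaluated and the processing time grows accordingly.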


Comments

Thanks @dkurt. What do you mean by "honestly process all the input image"? It takes ~4-4.5 seconds to process the image on my laptop CPU, but I haven't yet experimented with using OpenCL as the target or using the Intel CV SDK/OpenVINO...