
Bottleneck on dnn network .forward()

asked 2017-09-24 05:41:05 -0600

memeka

Hi,

I’m doing object detection using the dnn module and OpenCV 3.3.

I’m getting 3 fps on an ARM board (SSD + MobileNet), but I can’t figure out what the bottleneck is. Here are my observations (same results for Python and C++):

  • The board has 4 little and 4 big cores, but the maximum FPS is achieved when running only on the big cores (using taskset). Using all 8 cores, either by manually increasing the thread count or by letting it choose its own thread/core combination, gives worse results.

  • RAM usage is not a problem.

  • When using the 4 big cores, CPU usage doesn’t go above 300% (out of 400%), and no single core gets above 80% (out of 100%).

  • Getting frames from the webcam is not the issue: I tried grabbing them in a separate thread, but the results did not change.
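To narrow down which stage actually dominates, the per-stage cost can be measured directly. The following is only a minimal sketch: `StageTimer` and the stage names are hypothetical helpers introduced here for illustration, and the commented usage assumes a `cv2.VideoCapture` object named `cap` and an already loaded `cv2.dnn` network named `net`.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)  # total seconds spent per stage
        self.counts = defaultdict(int)    # number of calls per stage

    def measure(self, name, fn, *args, **kwargs):
        """Call fn(*args, **kwargs), record its duration under `name`, return its result."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def report(self):
        """Return the mean duration in seconds for each stage."""
        return {name: self.totals[name] / self.counts[name] for name in self.totals}


# Assumed usage inside the detection loop (cap and net are not defined here):
#   ok, frame = timer.measure("grab", cap.read)
#   blob = timer.measure("blob", cv2.dnn.blobFromImage, frame)
#   net.setInput(blob)
#   detections = timer.measure("forward", net.forward)
#   print(timer.report())
```

If the mean for "forward" accounts for nearly all of the roughly 330 ms per frame implied by 3 fps, the capture and preprocessing stages can be ruled out with numbers rather than by observation alone.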

So, any idea how I can tune my code to reach 100% CPU usage? What could the bottleneck be: memory speed? CPU cache? Here are my board’s stats while running: https://m.imgur.com/a/D9tdp
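One way to put a number on the "300% out of 400%" observation from inside the process is to compare CPU time with wall-clock time around the inference call. This is a rough sketch only: `parallel_efficiency` is a hypothetical helper, and it relies on `time.process_time()` summing the CPU time of all threads in the current process.

```python
import time


def parallel_efficiency(fn, repeats=10):
    """Return CPU seconds consumed per wall-clock second while calling fn repeatedly.

    A value near the number of pinned cores means the cores are kept busy;
    a clearly lower value means threads spend time waiting or idle.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # process-wide CPU time, all threads
    for _ in range(repeats):
        fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return cpu / wall if wall > 0 else 0.0


# Assumed usage with an already configured network `net` (not defined here):
#   ratio = parallel_efficiency(net.forward)
#   print("busy cores on average: %.2f" % ratio)
```

Repeating the measurement after changing the thread count, for example with `cv2.setNumThreads(4)`, would show whether the ratio scales with the number of threads. Whether the dnn module honours that setting depends on how OpenCV was built, so treat that as something to verify rather than a given.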

Thanks.


1 answer


answered 2017-09-26 06:42:55 -0600

memeka

Does anyone have any suggestions?



Stats

Asked: 2017-09-24 05:37:35 -0600

Seen: 563 times

Last updated: Sep 26 '17