Newbie questions on real-time detection

asked 2018-04-28 09:22:50 -0500

holger

updated 2018-04-28 10:17:42 -0500


I imported a Darknet model and can do classification and object detection on images. My next step is to do the same on video instead of still pictures. My first naive approach was to run detection on every frame, which is horribly slow. My model: Darknet YOLO v2.

    // Larger blob sizes make inference slower.
    Mat blob = blobFromImage(img, 1 / 255.F, new Size(416, 416), new Scalar(0), true, true);
    net.setInput(blob); // the network needs the blob as input before forward()

    // Time only the forward pass.
    long start = System.currentTimeMillis();
    Mat detectionMat = net.forward(); // compute output
    long end = System.currentTimeMillis();
    System.err.println("net detection took: " + (end - start) + "ms");

The output is :

net detection took: 1842ms
net detection took: 1058ms
net detection took: 1176ms
net detection took: 1171ms
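
The first measurement above is noticeably higher than the others, which often happens when one-time setup cost lands on the first call. A minimal sketch of a timing helper that skips a few warm-up runs and averages the rest might look like this. `ForwardTimer` and `averageMillis` are hypothetical names, not part of OpenCV, and the sleeping task is only a stand-in for the real forward pass.

```java
/** Hypothetical timing helper; not part of OpenCV. */
class ForwardTimer {
    /**
     * Runs the task `warmup` times without measuring, then `runs` times
     * with measuring, and returns the mean measured duration in milliseconds.
     */
    static double averageMillis(Runnable task, int warmup, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        // Warm-up runs absorb one-time costs and are not counted.
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption); the real program would run the forward pass here.
        Runnable fakeForward = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println("mean: " + averageMillis(fakeForward, 1, 5) + " ms");
    }
}
```

Using `System.nanoTime()` rather than `System.currentTimeMillis()` is a design choice here: it is intended for measuring elapsed time, while the averaged result is still reported in milliseconds so it can be compared with the numbers above.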

Here are my questions:

  1. I know it's hard to say, but are these detection times "slow"? If I could get it working with some delay, that would be OK.
  2. What can I do to improve the performance?

I was thinking about buffering frames, running object detection ahead of playback ("in the future"), and then using the object tracking API to follow each object (hopefully faster) once playback reaches that frame. Is that a good idea? I don't know.
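
One simple variant of this idea is to run the slow detector only on every n-th frame and let a cheaper tracker update the last result on the frames in between. The sketch below shows only that scheduling logic, under the assumption that the detector and the tracker can each be wrapped in a function. `SparseDetector`, `process`, and the two counters are hypothetical names; nothing here calls OpenCV, and it does not implement the look-ahead buffer itself.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/**
 * Hypothetical helper: runs an expensive detector on every n-th frame
 * and a cheaper tracker on the frames in between.
 */
class SparseDetector<F, R> {
    private final Function<F, R> detector;     // slow, e.g. a wrapper around the network
    private final BiFunction<F, R, R> tracker; // fast, updates the previous result
    private final int interval;                // run the detector every `interval` frames
    private long frameIndex = 0;
    private R last;
    int detectorCalls = 0;
    int trackerCalls = 0;

    SparseDetector(Function<F, R> detector, BiFunction<F, R, R> tracker, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.detector = detector;
        this.tracker = tracker;
        this.interval = interval;
    }

    /** Returns the detection result for this frame, detected or tracked. */
    R process(F frame) {
        if (frameIndex % interval == 0) {
            last = detector.apply(frame);      // full detection, resets the result
            detectorCalls++;
        } else {
            last = tracker.apply(frame, last); // cheap update of the previous result
            trackerCalls++;
        }
        frameIndex++;
        return last;
    }

    public static void main(String[] args) {
        // Frames are plain integers and results are strings, purely for illustration.
        SparseDetector<Integer, String> sd = new SparseDetector<Integer, String>(
                f -> "detected@" + f,
                (f, prev) -> prev,
                5);
        for (int f = 0; f < 10; f++) {
            System.out.println(f + " -> " + sd.process(f));
        }
    }
}
```

With an interval of 5, the detector runs on frames 0 and 5 only, so its cost is spread over five frames. Whether the tracker is accurate enough between detections depends on the video and would need to be checked.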

Thank you very much again + Greetings, Holger



OK, the first thing is to use a smaller model. With that I am getting times of around 200-300 ms. That is still not enough for real-time detection, I think. Maybe I must strip down the model even more; I am interested in comparing values. Maybe the Java wrapper also adds overhead (you can see my code above, there is no custom app code there).
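
To put the 200-300 ms figure in perspective, per-frame latency converts to frame rate as 1000 / milliseconds. The small sketch below does that conversion in both directions. `FrameBudget` is a hypothetical helper, and the 25 frames per second target is an assumption about what "real time" means for a given video.

```java
/** Hypothetical helper for converting between per-frame latency and frame rate. */
class FrameBudget {
    /** Frames per second achievable if every frame takes millisPerFrame to process. */
    static double fps(double millisPerFrame) {
        return 1000.0 / millisPerFrame;
    }

    /** Time available per frame, in milliseconds, to sustain targetFps. */
    static double budgetMillis(double targetFps) {
        return 1000.0 / targetFps;
    }

    public static void main(String[] args) {
        System.out.println(fps(200.0));         // prints 5.0
        System.out.println(budgetMillis(25.0)); // prints 40.0
    }
}
```

At 200 ms per frame the pipeline reaches about 5 frames per second, while a 25 frames per second target leaves 40 ms per frame, so detection on every frame would need to be several times faster.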

holger (2018-04-28 22:01:31 -0500)

Could adding multithreading support also bring some speed? (I am using the prebuilt Windows version.)

holger (2018-04-28 22:05:00 -0500)

I compared with the YOLO command line; the times are below (compiled with multithreading, OPENMP=1):

Enter Image Path: data/horses.jpg
data/horses.jpg: Predicted in 3.666006 seconds.
horse: 88%
horse: 74%
horse: 70%
horse: 53%
horse: 28%
horse: 28%
Enter Image Path: data/dog.jpg
data/dog.jpg: Predicted in 3.650407 seconds.

So I guess running the Darknet model through OpenCV on the CPU is faster.

holger (2018-04-29 00:02:58 -0500)