Making Object Detection Faster

asked 2017-12-18 03:26:55 -0600

kimchiboy03

Hello, I am trying out object detection with the deep neural network (dnn) module in OpenCV 3.3.0.

However, my code runs at about 1 frame per 10 seconds (literally).

Can someone please tell me whether it's just my slow computer or whether my code is poorly written? Thanks in advance.

Here is my code (by the way, my computer has 4 GB of RAM):

#include <opencv2/dnn.hpp>
#include <opencv2/dnn/shape_utils.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <fstream>
#include <iostream>
#include <algorithm>

using namespace std;
using namespace cv;
using namespace cv::dnn;

const char* classNames[] = { "background",
"aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair",
"cow", "diningtable", "dog", "horse",
"motorbike", "person", "pottedplant",
"sheep", "sofa", "train", "tvmonitor" };

int main()
{
    dnn::Net net = readNetFromCaffe("deploy.prototxt", "VGG_VOC0712_SSD_300x300_ft_iter_120000.caffemodel");

    VideoCapture cap(0);

    while (true)
    {
        Mat frame;
        cap >> frame;

        if (frame.empty())
        {
            waitKey();
            break;
        }

        if (frame.channels() == 4)
        {
            cvtColor(frame, frame, COLOR_BGRA2BGR);
        }

        Mat inputBlob = blobFromImage(frame, 1.0f, Size(300, 300), Scalar(104, 117, 123), false);
        net.setInput(inputBlob, "data");
        Mat detection = net.forward("detection_out");

        Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

        float confidenceThreshold = 0.5;
        for (int i = 0; i < detectionMat.rows; i++)
        {
            float confidence = detectionMat.at<float>(i, 2);

            if (confidence > confidenceThreshold)
            {
                size_t objectClass = (size_t)(detectionMat.at<float>(i, 1));

                int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * frame.cols);
                int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * frame.rows);
                int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * frame.cols);
                int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * frame.rows);

                ostringstream ss;
                ss.str("");
                ss << confidence;
                String conf(ss.str());

                Rect object(xLeftBottom, yLeftBottom,
                    xRightTop - xLeftBottom,
                    yRightTop - yLeftBottom);

                rectangle(frame, object, Scalar(0, 255, 0));
                String label = String(classNames[objectClass]) + ": " + conf;
                int baseLine = 0;
                Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
                rectangle(frame, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
                    Size(labelSize.width, labelSize.height + baseLine)),
                    Scalar(255, 255, 255), FILLED);
                putText(frame, label, Point(xLeftBottom, yLeftBottom),
                    FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
            }
        }

        imshow("detections", frame);
        if (waitKey(1) >= 0) break;
    }

    return 0;
}


Comments

It's not your code; that is a huge network working on large images.

I guess you'll have to wait until they figure out how to get proper OpenCL/GPU optimization for that.

berak (2017-12-18 03:43:46 -0600)

You've tried one of the largest object detection models. Even on modern CPUs it achieves no more than 5 FPS. You could try another model such as MobileNet-SSD (check out https://github.com/opencv/opencv/tree...). Also note that object detection models can usually work with input images of any size (with varying accuracy, of course), so experiment with the input image size.

dkurt (2017-12-18 05:51:20 -0600)

@dkurt So does the choice of prototxt and Caffe model affect the speed and accuracy? And I'll try out MobileNet-SSD!

kimchiboy03 (2017-12-18 13:35:13 -0600)

@dkurt Yes, MobileNet-SSD is faster, at 1.8 fps. To make it faster still, would installing OpenCV 3.3.1 and running YOLO v2 help?

kimchiboy03 (2017-12-18 13:50:28 -0600)

@kimchiboy03 I don't think so. Try changing the input size first. For example, for a 640x480 frame, try downscaling it to 320x240 or less while keeping the same aspect ratio.

dkurt (2017-12-18 14:23:19 -0600)
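
The downscaling suggested in the comment above could be sketched roughly as follows. This is only an illustration: `fitWithin` is a hypothetical helper, not an OpenCV function, and it merely computes a target size that keeps the aspect ratio, under the assumption that the caller then passes that size on to `blobFromImage` (or `resize`).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical helper (not part of OpenCV): returns {width, height} of the
// largest size whose longer side is at most maxSide, keeping the aspect ratio.
// Frames that already fit are returned unchanged (no upscaling).
std::pair<int, int> fitWithin(int width, int height, int maxSide)
{
    assert(width > 0 && height > 0 && maxSide > 0);

    const int longest = std::max(width, height);
    if (longest <= maxSide)
        return {width, height};

    // Same scale factor on both axes keeps the aspect ratio.
    const double scale = static_cast<double>(maxSide) / longest;
    const int w = std::max(1, static_cast<int>(std::lround(width * scale)));
    const int h = std::max(1, static_cast<int>(std::lround(height * scale)));
    return {w, h};
}
```

With a 640x480 webcam frame, `fitWithin(640, 480, 320)` gives the 320x240 size mentioned above; the result would then replace the fixed `Size(300, 300)` in the question's `blobFromImage` call. How far the size can be reduced before detections become unusable has to be found by experiment.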

Please, where can I download both files: "deploy.prototxt" and "VGG_VOC0712_SSD_300x300_ft_iter_120000.caffemodel"?

Sebyazid (2017-12-19 05:11:14 -0600)

@Sebyazid, please do not post an answer if you have a question or comment. Thank you.

berak (2017-12-19 05:42:55 -0600)


answered 2017-12-20 14:37:16 -0600

kimchiboy03

updated 2017-12-20 14:39:55 -0600

Thanks to @dkurt, my frame rate went up to 5 fps, which is reasonable for my application.

First, I switched to MobileNet-SSD, which made the code faster, at 1.8 fps. You can download those files from here: prototxt, caffemodel (here you go, @Sebyazid).

Then I reduced the network input size from 300 x 300 to 100 x 100. Although the accuracy drops a bit, the program is still capable of recognising objects.

Since I had to use MobileNet-SSD, I based my code on this GitHub sample.

Thank you, @dkurt.
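
For anyone wanting to reproduce frame-rate figures like the ones quoted in this answer, a minimal timing sketch is given below. `FpsMeter` and `timeSeconds` are hypothetical names introduced here for illustration, not OpenCV APIs; the sketch uses only the standard library and assumes the per-frame work (capture, `blobFromImage`, `net.forward`) is wrapped in a callable.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: accumulates per-frame durations and reports the
// average frame rate over every frame recorded so far.
class FpsMeter
{
public:
    // Record that one frame took `seconds` to process.
    void addFrame(double seconds)
    {
        assert(seconds >= 0.0);
        totalSeconds_ += seconds;
        ++frames_;
    }

    // Average frames per second, or 0 if no time has been recorded yet.
    double fps() const
    {
        return totalSeconds_ > 0.0 ? frames_ / totalSeconds_ : 0.0;
    }

private:
    double totalSeconds_ = 0.0;
    long frames_ = 0;
};

// Runs `work` once and returns the elapsed wall time in seconds,
// measured with a monotonic clock.
template <typename Work>
double timeSeconds(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Inside the capture loop this could be used as `meter.addFrame(timeSeconds([&] { detection = net.forward("detection_out"); }));`, printing `meter.fps()` every few frames. Timing only the forward pass, as in that line, shows how much of the frame time the network itself accounts for.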


Comments

Have you tried using different backends like Halide / OpenCL? I've heard that they can speed up detection.

tejasa97 (2018-08-24 23:45:21 -0600)
