Making Object Detection Faster
Hello, I am currently trying out object detection with the deep neural network (dnn) module in OpenCV 3.3.0.
However, my code runs at about 1 frame per 10 seconds (literally)!
Can someone please tell me whether it's just my slow computer or whether my code is not well written? Thanks in advance.
Here is my code (by the way, my computer has 4 GB of RAM):
#include <opencv2/dnn.hpp>
#include <opencv2/dnn/shape_utils.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <fstream>
#include <iostream>
#include <algorithm>
#include <sstream>
using namespace std;
using namespace cv;
using namespace cv::dnn;
const char* classNames[] = { "background",
"aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair",
"cow", "diningtable", "dog", "horse",
"motorbike", "person", "pottedplant",
"sheep", "sofa", "train", "tvmonitor" };
int main()
{
    // Load the SSD model from its Caffe definition and weights.
    dnn::Net net = readNetFromCaffe("deploy.prototxt", "VGG_VOC0712_SSD_300x300_ft_iter_120000.caffemodel");
    VideoCapture cap(0);
    while (true)
    {
        Mat frame;
        cap >> frame;
        if (frame.empty())
            break;
        if (frame.channels() == 4)
            cvtColor(frame, frame, COLOR_BGRA2BGR);
        // Resize to 300x300 and subtract the per-channel mean to build the input blob.
        Mat inputBlob = blobFromImage(frame, 1.0f, Size(300, 300), Scalar(104, 117, 123), false);
        net.setInput(inputBlob, "data");
        Mat detection = net.forward("detection_out");
        // View the output as an N x 7 matrix:
        // [image id, class, confidence, left, top, right, bottom].
        Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());
        float confidenceThreshold = 0.5;
        for (int i = 0; i < detectionMat.rows; i++)
        {
            float confidence = detectionMat.at<float>(i, 2);
            if (confidence > confidenceThreshold)
            {
                size_t objectClass = (size_t)(detectionMat.at<float>(i, 1));
                // Box corners are normalized to [0, 1]; scale them to pixel coordinates.
                int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * frame.cols);
                int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * frame.rows);
                int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * frame.cols);
                int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * frame.rows);
                ostringstream ss;
                ss << confidence;
                String conf(ss.str());
                Rect object(xLeftBottom, yLeftBottom,
                            xRightTop - xLeftBottom,
                            yRightTop - yLeftBottom);
                rectangle(frame, object, Scalar(0, 255, 0));
                // Draw the class label on a filled white background.
                String label = String(classNames[objectClass]) + ": " + conf;
                int baseLine = 0;
                Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
                rectangle(frame, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
                                      Size(labelSize.width, labelSize.height + baseLine)),
                          Scalar(255, 255, 255), FILLED);
                putText(frame, label, Point(xLeftBottom, yLeftBottom),
                        FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
            }
        }
        imshow("detections", frame);
        if (waitKey(1) >= 0) break;
    }
    return 0;
}
It's not your code; that is a huge network, using large images.
I guess you'll have to wait until they figure out how to get proper OpenCL/GPU optimization for that.
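To confirm that the forward pass (and not the capture or drawing code) is what eats the time, one option is to time it directly. The sketch below is only an illustration: `elapsedMs` and `toFps` are made-up helper names, and it uses plain `std::chrono` rather than anything from OpenCV.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
// Hypothetical helper, not part of OpenCV.
double elapsedMs(const std::function<void()>& work)
{
    assert(static_cast<bool>(work)); // an empty std::function cannot be called
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Convert a per-frame time in milliseconds to frames per second.
double toFps(double msPerFrame)
{
    assert(msPerFrame > 0.0);
    return 1000.0 / msPerFrame;
}
```

In the loop from the question, the call would be wrapped in a lambda, for example `double ms = elapsedMs([&] { detection = net.forward("detection_out"); });` with `detection` declared beforehand. If that number is close to the total time per frame, the network itself is the bottleneck.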
You've tried one of the largest object detection models. Even on modern CPUs it achieves no more than 5 FPS. You could try another model such as MobileNet-SSD. Additionally, object detection models can usually work with an input image of any size (with different accuracy, of course), so experiment with the input image size.
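As a rough guide to why a smaller input helps: for a mostly convolutional network, the work per frame grows roughly with the number of input pixels. The sketch below only computes that ratio. `relativeCost` is a made-up helper name, and the pixel-count scaling is an assumption, not a measurement of this particular model.

```cpp
#include <cassert>

// Approximate cost of a forward pass at (w x h) relative to a baseline
// input size, ASSUMING cost is proportional to the number of input pixels.
// This is a rule of thumb for convolutional networks, not a benchmark.
double relativeCost(int w, int h, int baseW, int baseH)
{
    assert(w > 0 && h > 0 && baseW > 0 && baseH > 0);
    return (static_cast<double>(w) * h) /
           (static_cast<double>(baseW) * baseH);
}
```

Under that assumption, `relativeCost(150, 150, 300, 300)` gives 0.25, so halving both dimensions would cut the work to about a quarter, at some cost in accuracy. The chosen size would go in the `Size` argument of `blobFromImage`, and the real speedup is best checked by timing.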
@dkurt So does the type of model I use affect the speed and accuracy? And I'll try out the MobileNet-SSD!
@dkurt Yes, the MobileNet-SSD seems to be faster, with an FPS of 1.8. To make it faster, would installing OpenCV 3.3.1 and running YOLO v2 help??
@kimchiboy03 I don't think so. Try changing the input size first. For example, for a frame of size 640x480, try downscaling it to 320x240 or less, keeping the same aspect ratio.
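The downscaling suggested in the last comment can be sketched as a small helper that picks a target size while preserving the aspect ratio. `FrameSize` and `fitWithin` are hypothetical names used only for illustration. With OpenCV, the result would be passed to `cv::resize` before building the blob.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Plain width/height pair (stand-in for cv::Size in this sketch).
struct FrameSize
{
    int width;
    int height;
};

// Largest size that fits inside maxWidth x maxHeight while keeping the
// aspect ratio of `src`. Frames that already fit are returned unchanged.
FrameSize fitWithin(FrameSize src, int maxWidth, int maxHeight)
{
    assert(src.width > 0 && src.height > 0 && maxWidth > 0 && maxHeight > 0);
    // Use the tighter of the two constraints, and never upscale (cap at 1.0).
    double scale = std::min({1.0,
                             static_cast<double>(maxWidth) / src.width,
                             static_cast<double>(maxHeight) / src.height});
    FrameSize out;
    out.width = std::max(1, static_cast<int>(std::lround(src.width * scale)));
    out.height = std::max(1, static_cast<int>(std::lround(src.height * scale)));
    return out;
}
```

For the 640x480 frame mentioned above and a 320x240 limit, this returns 320x240. With a 300x300 limit it returns 300x225, so the frame is not stretched.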
Please, where can I download both files: "deploy.prototxt" and "VGG_VOC0712_SSD_300x300_ft_iter_120000.caffemodel"?
@Sebyazid, please do not post answers if you have a question or comment. Thank you.