Using embedded_ssd_mobilenet_v1 with the OpenCV C++ API

asked 2018-02-28 02:00:06 -0500

updated 2018-02-28 03:23:16 -0500

I want to run the embedded_ssd_mobilenet_v1 model, which was trained with TensorFlow in Python, from the OpenCV C++ API. I previously did the same thing with the regular ssd_mobilenet model; I ran into some problems but eventually solved them with help from other people. That model is too slow on my platform (Tinker Board), so I want to use the embedded version to get faster detection. I used the tools offered on this site to generate a .pbtxt file, and then used the .pb and .pbtxt files in the following code:

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
using namespace std;
using namespace cv;

const size_t inWidth = 300;
const size_t inHeight = 300;
const float WHRatio = inWidth / (float)inHeight;
const char* classNames[] = { "background","face" };

int main() {

String weights = "/data/bincheng.xiong/detect_head/pbfiles_embedded/pbfile1/frozen_inference_graph.pb";
String prototxt = "/home/xbc/embedded_ssd.pbtxt";
dnn::Net net = cv::dnn::readNetFromTensorflow(weights, prototxt);
//VideoCapture cap("rtsp://admin:whxx2017@192.168.168.161:554/Streaming/Channels/001");
VideoCapture cap(0);

Mat frame;
while(1){
cap >> frame;
Size frame_size = frame.size();

Size cropSize;
if (frame_size.width / (float)frame_size.height > WHRatio)
{
    cropSize = Size(static_cast<int>(frame_size.height * WHRatio),
        frame_size.height);
}
else
{
    cropSize = Size(frame_size.width,
        static_cast<int>(frame_size.width / WHRatio));
}

Rect crop(Point((frame_size.width - cropSize.width) / 2,
    (frame_size.height - cropSize.height) / 2),
    cropSize);


cv::Mat blob = cv::dnn::blobFromImage(frame,1./255,Size(300,300));
//cout << "blob size: " << blob.size << endl;

net.setInput(blob);
Mat output = net.forward();
//cout << "output size: " << output.size << endl;

Mat detectionMat(output.size[2], output.size[3], CV_32F, output.ptr<float>());

frame = frame(crop);
float confidenceThreshold = 0.001;
for (int i = 0; i < detectionMat.rows; i++)
{
    float confidence = detectionMat.at<float>(i, 2);

    if (confidence > confidenceThreshold)
    {
        size_t objectClass = (size_t)(detectionMat.at<float>(i, 1));

        int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * frame.cols);
        int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * frame.rows);
        int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * frame.cols);
        int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * frame.rows);

        ostringstream ss;
        ss << confidence;
        String conf(ss.str());

        Rect object((int)xLeftBottom, (int)yLeftBottom,
            (int)(xRightTop - xLeftBottom),
            (int)(yRightTop - yLeftBottom));

        rectangle(frame, object, Scalar(0, 255, 0),2);
        String label = String(classNames[objectClass]) + ": " + conf;
        int baseLine = 0;
        Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
        rectangle(frame, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
            Size(labelSize.width, labelSize.height + baseLine)),
            Scalar(0, 255, 0), CV_FILLED);
        putText(frame, label, Point(xLeftBottom, yLeftBottom),
            FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
    }
}
namedWindow("image", CV_WINDOW_NORMAL);
imshow("image", frame);
waitKey(10);

}
return 0;
}

I printed the bounding boxes produced by the TensorFlow Python code and by the OpenCV C++ code. The TensorFlow Python code outputs the following boxes:

boxes: [[[0.48616847 0.32456273 0.5719043 0.36506736]
[0.48817295 0.39137262 0.57161564 0.4317271 ]
[0.17184295 0.31928855 0.27021247 0.36755735]
[0.17288518 0.38449278 0.27023977 0.43242595]
[0.49070644 0.45021996 0.5716888 0.48942813]
[0.48250633 0.26584074 0.5715775 0 ...
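
When comparing boxes between the two frameworks, the coordinate order matters. The TensorFlow Object Detection API reports each box as normalized [ymin, xmin, ymax, xmax], while each row of OpenCV's detection output is [batchId, classId, confidence, left, top, right, bottom], which is how the C++ code above reads columns 3 to 6. The following is a minimal sketch, not part of the original code, that converts both layouts to pixel coordinates so they can be compared directly. The names PixelBox, fromTensorFlowBox and fromOpenCvRow are hypothetical helpers introduced only for illustration.

```cpp
#include <array>
#include <cassert>

// Pixel-space rectangle. A plain struct is used so the sketch does not depend on OpenCV.
struct PixelBox {
    int left;
    int top;
    int right;
    int bottom;
};

// TensorFlow Object Detection API order: [ymin, xmin, ymax, xmax], normalized to [0, 1].
PixelBox fromTensorFlowBox(const std::array<float, 4>& box, int imageWidth, int imageHeight) {
    assert(imageWidth > 0 && imageHeight > 0);
    return PixelBox{
        static_cast<int>(box[1] * imageWidth),   // xmin -> left
        static_cast<int>(box[0] * imageHeight),  // ymin -> top
        static_cast<int>(box[3] * imageWidth),   // xmax -> right
        static_cast<int>(box[2] * imageHeight)   // ymax -> bottom
    };
}

// OpenCV detection row order: [batchId, classId, confidence, left, top, right, bottom],
// with the four coordinates normalized to [0, 1].
PixelBox fromOpenCvRow(const std::array<float, 7>& row, int imageWidth, int imageHeight) {
    assert(imageWidth > 0 && imageHeight > 0);
    return PixelBox{
        static_cast<int>(row[3] * imageWidth),
        static_cast<int>(row[4] * imageHeight),
        static_cast<int>(row[5] * imageWidth),
        static_cast<int>(row[6] * imageHeight)
    };
}
```

Both functions should be given the size of the same image region. Note that the C++ code above scales by the cropped frame while the blob is built from the uncropped frame, so the two outputs are only comparable if the same region is used for both.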

Comments

Can you delete the duplicate post, please?

LBerger (2018-02-28 02:17:55 -0500)

@LBerger, lots of duplicates lately, no?

berak (2018-02-28 02:20:04 -0500)

@xbcreal, there is only C++ code in your question. Please add a Python script too.

dkurt (2018-02-28 03:06:49 -0500)

@dkurt Hi, I have added the Python code as an image. I don't know how to post code as text correctly. If you need the Python code as text, I can email it to you.

xbcreal (2018-02-28 03:24:42 -0500)

Insert it in your post (?). If that is not possible, you can put it in a gist (on GitHub) or on pastebin.

LBerger (2018-02-28 04:13:10 -0500)

@dkurt Hi, do you know what's wrong in my case? I'm really in a hurry; sorry to interrupt you.

xbcreal (2018-02-28 20:34:03 -0500)

@xbcreal, despite the "embedded" prefix, it must be a similar MobileNet-SSD model. Could you try the blobFromImage arguments from the original sample, with the correct mean and scale values?

dkurt (2018-02-28 23:08:16 -0500)
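
The effect of the mean and scale arguments mentioned above can be sketched without OpenCV. blobFromImage subtracts the mean from each pixel value and then multiplies the result by the scale factor. The question's call uses a scale of 1/255 and no mean, which maps pixels to [0, 1]. The mean of 127.5 and scale of 2/255 below are the values commonly used for MobileNet-SSD and are an assumption here; the thread does not confirm that they apply to this embedded model. The function and constant names are hypothetical.

```cpp
#include <cassert>

// Scale and mean as used in the question's blobFromImage call (no mean given).
const double kQuestionScale = 1.0 / 255.0;
const double kQuestionMean = 0.0;

// Assumed MobileNet-SSD preprocessing: maps [0, 255] to [-1, 1].
const double kAssumedScale = 2.0 / 255.0;
const double kAssumedMean = 127.5;

// Value the network receives for one 8-bit channel value:
// the mean is subtracted first, then the result is multiplied by the scale.
double preprocessedValue(int pixel, double scale, double mean) {
    assert(pixel >= 0 && pixel <= 255);
    return (pixel - mean) * scale;
}
```

With the question's arguments, a pixel value of 0 stays at 0 and 255 becomes about 1. With the assumed arguments, 0 becomes about -1 and 255 becomes about 1, so a network trained on the second range sees shifted input under the first. In OpenCV the assumed setup would correspond to a call along the lines of cv::dnn::blobFromImage(frame, 2.0 / 255.0, cv::Size(300, 300), cv::Scalar(127.5, 127.5, 127.5), true, false), where the last two arguments are swapRB and crop.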

@dkurt I tried this. It works well for my mobile-ssd model, but not for the embedded version. I suspect the .pbtxt text graph generated by the tools is wrong.

xbcreal (2018-02-28 23:14:38 -0500)

@xbcreal, in that case we need to know something about this .pbtxt, don't we? Please add a link to it, or at least the command used to generate it. Have you specified that the input images are 256x256 instead of the default 300x300, as the config file says?

dkurt (2018-03-01 02:09:09 -0500)