OpenCV DNN with SSD ResNet returns wrong face dimensions

Hello, I am experimenting with face detection using the DNN module, but I cannot figure out how to solve an issue.

I am processing 256x256 images, using deploy.prototxt and res10_300x300_ssd_iter_140000.caffemodel (the same ones found in the dnn directory).

Here is the code:

cv::Mat faceROI;
cv::Mat image;

image = cv::imread(imagePath[imageId], CV_LOAD_IMAGE_COLOR);
cv::Mat imageDNNBlob = cv::dnn::blobFromImage(image, 1.0, cv::Size(300, 300), 
    Scalar(104.0, 177.0, 123.0), false, false);
netOpenCVDNN.setInput(imageDNNBlob, "data");
cv::Mat detection = netOpenCVDNN.forward("detection_out");
cv::Mat faces(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());
for (int i = 0; i < faces.rows; i++)
{
    float confidence = faces.at<float>(i, 2);
    if (confidence > 0.99)
    {
        int xLeftBottom = static_cast<int>(faces.at<float>(i, 3) * image.cols);
        int yLeftBottom = static_cast<int>(faces.at<float>(i, 4) * image.rows);
        int xRightTop = static_cast<int>(faces.at<float>(i, 5) * image.cols);
        int yRightTop = static_cast<int>(faces.at<float>(i, 6) * image.rows);

        cv::Rect faceRect(xLeftBottom, yLeftBottom,
                          xRightTop - xLeftBottom, yRightTop - yLeftBottom);
        faceROI = cv::Mat(image, faceRect);
    }
}

Nothing too exotic; I just copied what I found in resnet_ssd_face.cpp. When I try to extract the ROI from the image with faceROI = cv::Mat(image, faceRect), I get an error about wrong dimensions for faceRect. With one particular image I get 257 as a dimension (height), and faces.at<float>(i, 6) returns a float greater than 1.

What am I missing? Can someone help me figure this out?

I also have some questions about this example:
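One possible guard against the out-of-range value described above is to clamp the corners to the image before building the rectangle. The following is only a minimal sketch, not the method used by resnet_ssd_face.cpp: Box and clampedBox are hypothetical names, and Box stands in for cv::Rect so that the snippet compiles without OpenCV.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for cv::Rect (an assumption, so no OpenCV is needed).
struct Box { int x, y, width, height; };

// Scale normalized corner coordinates to pixels and clamp them to the image.
// The inputs are nominally in [0, 1] but may fall slightly outside that range.
Box clampedBox(float x1, float y1, float x2, float y2, int cols, int rows)
{
    assert(cols > 0 && rows > 0);
    const int left   = std::min(std::max(static_cast<int>(x1 * cols), 0), cols);
    const int top    = std::min(std::max(static_cast<int>(y1 * rows), 0), rows);
    const int right  = std::min(std::max(static_cast<int>(x2 * cols), 0), cols);
    const int bottom = std::min(std::max(static_cast<int>(y2 * rows), 0), rows);
    // A degenerate detection collapses to an empty box instead of a negative size.
    return Box{left, top, std::max(0, right - left), std::max(0, bottom - top)};
}
```

With OpenCV types, intersecting the rectangle with the image bounds, for example faceRect & cv::Rect(0, 0, image.cols, image.rows), should have a similar effect. An empty result would still need to be skipped before taking the ROI.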

  1. netOpenCVDNN.forward returns a Mat where size[2] is the number of objects found and size[3] is the number of properties of each object. Am I right? Where can I find more information about what forward returns? (I already checked here and here. I think it is related to the "detection_out" layer of the prototxt, but I cannot figure it out.)
  2. Mat faces is a matrix with all the faces found, right? Each row is a detected face, and each row (face) has some properties (columns), right? So faces.at<float>(i, 2) is the confidence of the i-th face, and positions 3 to 6 are the coordinates of the face. What do positions 0 and 1 contain?
  3. Why does cv::Mat imageDNNBlob report -1 for its number of rows and cols?
  4. Last one: I am using 256x256 images, but the input layer of the network uses 300x300. What is the right solution? Resize the image? Change the input layer? Is cv::Size(300, 300) correct in blobFromImage?
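To make the row layout in questions 1 and 2 concrete, here is a rough sketch. It assumes the usual Caffe SSD DetectionOutput convention of seven floats per detection: image id, class label, confidence, x_min, y_min, x_max, y_max. That layout is an assumption, not something confirmed in this post. Detection and parseDetections are hypothetical names, and the function reads a plain float buffer rather than a cv::Mat.

```cpp
#include <cstddef>
#include <vector>

// Assumed layout of one detection row (7 floats); coordinates are normalized.
struct Detection {
    float imageId, classId, confidence, xMin, yMin, xMax, yMax;
};

// Collect the rows whose confidence exceeds minConfidence.
// data points at count * 7 contiguous floats, like detection.ptr<float>().
std::vector<Detection> parseDetections(const float* data, std::size_t count,
                                       float minConfidence)
{
    std::vector<Detection> kept;
    for (std::size_t i = 0; i < count; ++i) {
        const float* row = data + i * 7;
        if (row[2] > minConfidence) {
            kept.push_back(Detection{row[0], row[1], row[2],
                                     row[3], row[4], row[5], row[6]});
        }
    }
    return kept;
}
```

Under this assumption, positions 0 and 1 would be the index of the image within the batch and the class label, and positions 3 to 6 would be the two box corners as fractions of the image width and height.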

Thanks in advance.
