OpenCV 3.0, the performance of UMat

answered 2015-06-07 16:36:43 -0600

Eduardo

Hi,

To use OpenCL, in addition to using cv::UMat, I call:

cv::ocl::setUseOpenCL(true);

I also set an environment variable to select the correct GPU device (see the documentation), because I have both an integrated GPU (Intel HD Graphics) and a dedicated GPU. Name of the variable: OPENCV_OPENCL_DEVICE; value of the variable: :GPU:1
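
For readers unfamiliar with this variable: as far as I understand the OpenCV documentation, the value follows the pattern <Platform>:<Type>:<Device name or ID>, so :GPU:1 leaves the platform unspecified, asks for a GPU, and picks the device with ID 1. The sketch below only illustrates how such a value splits into its three fields. DeviceSpec and parseDeviceSpec are hypothetical names, and this is not OpenCV's own parser:

```cpp
#include <string>

// Hypothetical helper type: the three colon-separated fields of an
// OPENCV_OPENCL_DEVICE value such as ":GPU:1".
struct DeviceSpec {
    std::string platform;  // empty: no platform requested
    std::string type;      // e.g. "GPU" or "CPU"; empty: no type requested
    std::string device;    // device name or numeric ID, kept as text
};

// Splits "<platform>:<type>:<device>" at the first two colons.
// Missing trailing fields are left empty. Illustration only.
inline DeviceSpec parseDeviceSpec(const std::string& value) {
    DeviceSpec spec;
    const std::string::size_type first = value.find(':');
    if (first == std::string::npos) {
        spec.platform = value;
        return spec;
    }
    spec.platform = value.substr(0, first);
    const std::string::size_type second = value.find(':', first + 1);
    if (second == std::string::npos) {
        spec.type = value.substr(first + 1);
        return spec;
    }
    spec.type = value.substr(first + 1, second - first - 1);
    spec.device = value.substr(second + 1);
    return spec;
}
```

Calling parseDeviceSpec(":GPU:1") yields an empty platform, the type "GPU" and the device "1".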

I ran some tests of CascadeClassifier::detectMultiScale() using OpenCV-3.0.0-rc1, Windows 7 x64, VS2010 in release mode, image size=1280x720. The results are averaged over 1000 images:

Only the CPU (Intel Core i7): 12.46 FPS, CPU load: 65%
OpenCL + Intel HD Graphics: 7 FPS, CPU load: 8%, GPU load: 78%, (x0.56)
OpenCL + GPU (nVidia): 13 FPS, CPU load: 25%, GPU load: 70%, (x1.04)
CUDA + GPU: 30 FPS, CPU load: 12%, GPU load: 60%, (x2.4)

On my computer, the gain for OpenCL + GPU is negligible compared to using only the CPU. However, with CUDA + GPU the speed-up is about x2.4. I didn't check whether the results are the same for all the versions of detectMultiScale.

Here is the code I used for my tests. Feel free to add your own results to confirm or disprove mine:

#include <cstdlib>   // atoi, system
#include <iostream>
#include <sstream>   // std::stringstream
#include <vector>

#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaobjdetect.hpp>
#include <opencv2/cudaimgproc.hpp>


int main(int argc, char**argv) {
    // CV_VERSION is a string literal, so no std::hex formatting is needed
    std::cout << "OpenCV version=" << CV_VERSION << std::endl;

    cv::Mat frame;
    cv::UMat uframe, uFrameGray;
    cv::cuda::GpuMat image_gpu, image_gpu_gray;
    cv::VideoCapture capture("path_to_the_video");

    // Command line: argv[1] = use OpenCL (0/1), argv[2] = use CUDA (0/1)
    bool useOpenCL = (argc >= 2) ? (atoi(argv[1]) != 0) : false;
    std::cout << "Use OpenCL=" << useOpenCL << std::endl;
    cv::ocl::setUseOpenCL(useOpenCL);

    bool useCuda = (argc >= 3) ? (atoi(argv[2]) != 0) : false;
    std::cout << "Use CUDA=" << useCuda << std::endl;

    cv::Ptr<cv::CascadeClassifier> cascade = cv::makePtr<cv::CascadeClassifier>("data/lbpcascades/lbpcascade_frontalface.xml");
    cv::Ptr<cv::cuda::CascadeClassifier> cascade_gpu = cv::cuda::CascadeClassifier::create("data/lbpcascades/lbpcascade_frontalface.xml");

    double time = 0.0;
    int nb = 0;
    if(capture.isOpened()) {
        for(;;) {
            capture >> frame;
            if(frame.empty() || nb >= 1000) {
                break;
            }

            std::vector<cv::Rect> faces;
            double t = 0.0;
            if(!useCuda) {
                t = (double) cv::getTickCount();
                frame.copyTo(uframe);
                cv::cvtColor(uframe, uFrameGray, cv::COLOR_BGR2GRAY);
                cascade->detectMultiScale(uFrameGray, faces);
                t = ((double) cv::getTickCount() - t) / cv::getTickFrequency();
            } else {
                t = (double) cv::getTickCount();
                image_gpu.upload(frame);
                cv::cuda::cvtColor(image_gpu, image_gpu_gray, cv::COLOR_BGR2GRAY);
                cv::cuda::GpuMat objbuf;
                cascade_gpu->detectMultiScale(image_gpu_gray, objbuf);
                cascade_gpu->convert(objbuf, faces);
                t = ((double) cv::getTickCount() - t) / cv::getTickFrequency();
            }

            time += t;
            nb++;

            for(std::vector<cv::Rect>::const_iterator it = faces.begin(); it != faces.end(); ++it) {
                cv::rectangle(frame, *it, cv::Scalar(0,0,255));
            }
            std::stringstream ss;
            ss << "FPS=" << (nb / time);
            cv::putText(frame, ss.str(), cv::Point(30, 30), cv::FONT_HERSHEY_SIMPLEX, 1.0, cv::Scalar(0,0,255));

            cv::imshow("Frame", frame);
            char c = cv::waitKey(30);
            if(c == 27) {
                break;
            }
        }
    }

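    if(nb > 0) {
        std::cout << "Mean time=" << (time / nb) << " s" << " ; Mean FPS=" << (nb / time) << " ; nb=" << nb << std::endl;
    } else {
        // Avoid dividing by zero when the video could not be opened or read
        std::cout << "No frame was processed." << std::endl;
    }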
    system("pause");
    return 0;
}
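
One thing to watch when reproducing these numbers: with OpenCL, the first calls typically include one-time costs such as kernel compilation, so a mean over only a few frames can be skewed. A minimal sketch, assuming you want to discard the first few iterations before averaging. meanSecondsAfterWarmup is a hypothetical helper, and the callable stands in for the per-frame work of the loop above:

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `work` (warmup + measured) times and returns the
// mean wall-clock duration in seconds of the measured runs only.
// Returns 0.0 when measured == 0, which avoids a division by zero.
template <typename Work>
double meanSecondsAfterWarmup(Work work, std::size_t warmup, std::size_t measured) {
    for (std::size_t i = 0; i < warmup; ++i) {
        work();  // not timed: absorbs one-time initialization costs
    }
    if (measured == 0) {
        return 0.0;
    }
    double total = 0.0;
    for (std::size_t i = 0; i < measured; ++i) {
        const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
        total += elapsed.count();
    }
    return total / static_cast<double>(measured);
}
```

The mean FPS is then 1 divided by the returned value, provided it is not zero. This needs a compiler with std::chrono and lambda support, which is an assumption beyond the VS2010 setup I used.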


Comments

I've done some tests based on http://answers.opencv.org/question/58.... For the filtering and Sobel cases I got good results, but the results of the facedetect demo are still bad. The GPU does not seem to be doing any work: the GPU load (measured with GPU-Z) is only 1%~2%. I haven't found the reason yet.

Anna Lucia ( 2015-06-09 04:35:13 -0600 )

Did you check that setUseOpenCL is correctly set to true?

cv::ocl::setUseOpenCL(true);

Eduardo ( 2015-06-09 09:02:34 -0600 )

Yes, I've set the flag to true. But with the haarcascade, the face detection result is the same as with the flag set to false. When I switched the cascade file to lbpcascade, the FPS doubled.

Anna Lucia ( 2015-06-09 19:48:28 -0600 )

Actually, with the new T-API, the cv::ocl module should be completely gone, so that should not be the problem. Using a UMat should implicitly invoke setUseOpenCL(true) once an OpenCL-enabled device is detected.

StevenPuttemans ( 2015-06-12 04:52:05 -0600 )

@Anna Lucia, keep in mind that the progress report of the latest 3.0 release states that several hundred functions have been updated to use the T-API, but it is possible that face detection is not among them yet. In that case using a UMat will indeed be slower, because it triggers lots of unnecessary checks for OpenCL availability. I would open a bug report if I were you.

StevenPuttemans ( 2015-06-12 04:53:31 -0600 )

Hi, I have the same problem. I tested UMat with cv::ocl::goodFeaturesToTrack under three conditions: UMat with setUseOpenCL(true), UMat with setUseOpenCL(false), and plain cv::Mat. In debug mode the first case runs much faster, but in release mode all three have almost the same runtime. I checked the source code of cv::ocl::goodFeaturesToTrack and it does have an OpenCL kernel, so it should have been updated for the T-API. BTW, I also tested the same function with OpenCV 2.4.11's ocl module, and it runs a little faster than with UMat. I'm considering going back to OpenCV 2.4.11 :(

Lenoir.Tan ( 2015-06-13 04:07:28 -0600 )

Did you ever get an answer to how to get a speedup from using goodFeaturesToTrack with OpenCL? I'm trying to use it on a Mac.

mself ( 2016-06-06 12:24:14 -0600 )
