Software running significantly slower with OpenCL (OpenCV 3.0.0)

asked 2015-03-23 09:25:48 -0500

updated 2015-03-24 08:13:31 -0500

I've recently moved to OpenCV 3.0.0 and am looking into OpenCL's potential to speed up my software. To test it, I wrote a simple face detection program that detects faces and matches them between frames. It uses an XML cascade classifier I found online with the detectMultiScale method, and the Hungarian algorithm to match rectangles.


    while (cv::waitKey(1) != 27)  // loop until Esc is pressed
    {
        // Get frame from webcam

        // Resize from 640x480 to 320x240
        if (_width != _frame.cols)
            cv::resize(_frame, _frame, cv::Size(_width, _height));

        // Call detectMultiScale and the matching algorithm

        // Display image and face rectangles
        for (unsigned int n = 0; n < _faceVector.size(); n++)
        {
            if (_faceVector[n]->getConfidenceFactor() > 5)
            {
                std::stringstream ss, ss2;
                ss << _faceVector[n]->getID();
                ss2 << _faceVector[n]->getConfidenceFactor();
                // Face ID at the top-left corner of the bounding box
                cv::putText(_frame, ss.str(),
                            cv::Point(_faceVector[n]->getBoundingBox().x,
                                      _faceVector[n]->getBoundingBox().y),
                            cv::FONT_HERSHEY_SIMPLEX, 1, cv::Scalar(255, 0, 255), 2, 8, false);
                // Confidence factor at the bottom-left corner of the bounding box
                cv::putText(_frame, ss2.str(),
                            cv::Point(_faceVector[n]->getBoundingBox().x,
                                      _faceVector[n]->getBoundingBox().y + _faceVector[n]->getBoundingBox().height),
                            cv::FONT_HERSHEY_SIMPLEX, 1, cv::Scalar(255, 255, 0), 2, 8, false);
                cv::rectangle(_frame, _faceVector[n]->getBoundingBox(), CV_RGB(0, 255, 0), 1, 8, 0);
            }
        }

        // Calculate frames per second
        if (_timer.getElapsedTime() >= 1)
        {
            std::cout << "FPS: " << _counterFps << std::endl;
            _counterFps = 0;
        }
    }
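
The `_timer` and `_counterFps` members are not shown in the excerpt. As a rough sketch of what such a once-per-second frame counter might look like using only the standard library, consider the following. `FpsCounter` and `tick` are hypothetical names introduced for illustration, not the classes used in the question.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: stands in for the _timer/_counterFps pair in the loop above.
class FpsCounter
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FpsCounter(Clock::time_point now = Clock::now()) : windowStart_(now) {}

    // Call once per frame. Returns the number of frames counted in the window
    // that just closed once at least one second has elapsed, otherwise -1.
    int tick(Clock::time_point now = Clock::now())
    {
        assert(now >= windowStart_);
        ++frames_;
        if (now - windowStart_ < std::chrono::seconds(1))
            return -1;

        const int fps = frames_;
        frames_ = 0;          // start counting the next window
        windowStart_ = now;   // restart the one-second window
        return fps;
    }

private:
    Clock::time_point windowStart_;
    int frames_ = 0;
};
```

Inside the loop this would be called once per iteration, printing the result whenever it is not -1. `steady_clock` is used because it is monotonic, so the count is not disturbed by system clock adjustments.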

I turn OpenCL on and off with the following call at system initialization:


System characteristics:

Processor: Intel Core 2 Duo

GPU: GeForce GT220

Operating System: Linux (Gentoo)


If I set useOpenCL to false, this code runs at 30 fps and uses 98% of the CPU (as reported by top). If I set useOpenCL to true, the code runs at 24 fps and uses 198% of the CPU.

EDIT1: Let me clarify that this difference only appears once the detectMultiScale method is called. Capturing a frame from the webcam and resizing it cost the same either way, and so does the visualization block. I used a timer to verify that the detectMultiScale call itself takes longer with the OpenCL flag enabled than with it disabled.

I also monitor the GPU temperature, and it rises significantly when OpenCL is enabled, which suggests the GPU really is being used. If so, why is the CPU working twice as hard? It doesn't make any sense.

EDIT2: I tried to test this with other algorithms, such as Sobel (which is documented here to be up to 32x faster with OpenCL), but this time there is no change in CPU usage: it behaves the same with or without OpenCL. I also used a timer to verify that edge detection takes the same time in both cases.

I need help interpreting these results, since I was really hoping for a significant performance boost.

Best regards.



Did you comment out some functions for testing, for example cv::resize or cv::putText? Not all functions are supported by OpenCL. I can imagine that OpenCV copies the data back to CPU memory in that case, which would explain the CPU utilization. But that is only a guess.

You can also measure the time of a single function; perhaps that will help you locate the problem.
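
One way to time a single call is sketched below, using only the standard library. `meanMilliseconds` is a hypothetical helper introduced for illustration, not an OpenCV function. It runs a few untimed warm-up calls first, on the assumption that the first call may include one-time setup cost (with OpenCL, typically kernel compilation), and then averages over several timed runs.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of OpenCV): returns the mean wall-clock time,
// in milliseconds, of `runs` calls to `fn`, after `warmup` untimed calls.
double meanMilliseconds(const std::function<void()>& fn, int runs, int warmup = 1)
{
    assert(runs > 0 && warmup >= 0);

    // Untimed calls, so one-time setup costs do not skew the average.
    for (int i = 0; i < warmup; ++i)
        fn();

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        fn();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / runs;
}
```

It could be used as `meanMilliseconds([&] { classifier.detectMultiScale(gray, faces); }, 100)`, where `classifier`, `gray` and `faces` stand in for your own objects, once with OpenCL enabled and once with it disabled.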

matman (2015-03-23 13:22:28 -0500)

I updated my question.

Pedro Batista (2015-03-24 05:23:00 -0500)

@Pedro Batista, can you upload the code with the Sobel test? Then I will try to run it on my machine to double-check whether the same happens here as well ;-)

theodore (2015-03-24 12:03:14 -0500)

@theodore, please go to my newer question.

Pedro Batista (2015-03-25 09:33:17 -0500)