Software running significantly slower with OpenCL (OpenCV 3.0.0)
I've recently moved to OpenCV 3.0.0, and I am looking into OpenCL's potential to optimize my software. For that I wrote a simple face detection program that detects faces and matches them between frames. It uses an XML cascade classifier I found online with the detectMultiScale method, and the Hungarian algorithm to match rectangles.
Code:
_timer.start();
while (cv::waitKey(1) != 27)
{
    //Get frame from webcam
    _cam.getFrame(_frame);
    //Resize from 640x480 to 320x240
    if (_width != _frame.cols)
    {
        cv::resize(_frame, _frame, cv::Size(_width, _height));
    }
    //Calls detectMultiScale and matching algorithm
    _faceDetector.update(_frame);
    _faceDetector.getFaceVector(_faceVector);
    //Display image and face rectangles
    for (unsigned int n = 0; n < _faceVector.size(); n++)
    {
        if (_faceVector[n]->getConfidenceFactor() > 5)
        {
            std::stringstream ss, ss2;
            ss << _faceVector[n]->getID();
            ss2 << _faceVector[n]->getConfidenceFactor();
            cv::putText(_frame, ss.str(), cv::Point(_faceVector[n]->getBoundingBox().x, _faceVector[n]->getBoundingBox().y), cv::FONT_HERSHEY_SIMPLEX, 1, cv::Scalar(255, 0, 255), 2, 8, false);
            cv::putText(_frame, ss2.str(), cv::Point(_faceVector[n]->getBoundingBox().x, _faceVector[n]->getBoundingBox().y + _faceVector[n]->getBoundingBox().height), cv::FONT_HERSHEY_SIMPLEX, 1, cv::Scalar(255, 255, 0), 2, 8, false);
            cv::rectangle(_frame, _faceVector[n]->getBoundingBox(), CV_RGB(0, 255, 0), 1, 8, 0);
        }
    }
    //Calculate frames per second
    _counterFps++;
    if (_timer.getElapsedTime() >= 1)
    {
        _timer.reset();
        std::cout << "FPS: " << _counterFps << std::endl;
        _counterFps = 0;
    }
}
I turn OpenCL on and off with the following call at system initialization:
cv::ocl::setUseOpenCL(bool)
System characteristics:
Processor: Intel Core 2 Duo
GPU: GeForce GT220
Operating System: Linux (Gentoo)
Results:
If I set useOpenCL to false, this code runs at 30 fps and uses 98% of the CPU (according to top). If I set useOpenCL to true, the code runs at 24 fps and uses 198% of the CPU.
EDIT1: Let me clarify that this difference isn't noticeable until the detectMultiScale method is called. Capturing a frame from the webcam and resizing it cost the same either way, and so does the visualization block. I used a timer to verify that the detectMultiScale method takes longer with the OpenCL flag active than without it.
I also monitored the GPU temperature, and it rises significantly when OpenCL is being used, which confirms that it is working. If that is the case, how come the CPU is working twice as hard? It doesn't make any sense.
EDIT2: I tried to test this with other algorithms, such as Sobel (which is documented to be up to 32x faster with OpenCL here), but this time there is no change in CPU usage; it behaves the same with or without OpenCL. I also used a timer to verify that the edge detection takes the same time either way.
I need help interpreting these results, since I was really hoping for a good boost in performance.
Best regards.
Did you comment out some functions for testing, for example cv::resize or cv::putText? Not all functions are supported by OpenCL. I can imagine that OpenCV copies the data back to CPU memory in this case, which would explain the CPU utilization. But that's only a guess. You can also measure the time of a single function; perhaps that way you can locate the problem.
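A minimal sketch of timing a single call, using only the C++ standard library. elapsedMs is a hypothetical helper name, not something OpenCV provides.

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns how long it took, in milliseconds.
// Hypothetical helper for illustration; not part of OpenCV.
template <typename Callable>
double elapsedMs(Callable&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Callable>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, double ms = elapsedMs([&] { _faceDetector.update(_frame); }); would time the detection step of the loop in the question on its own. Wrapping cv::resize and the drawing block the same way shows which step accounts for the difference.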
I updated my question.
@Pedro Batista can you upload the code with the Sobel test? Then I will try to run it on my machine in order to double check and see if the same happens here as well ;-)
@theodore, please go to my newer question http://answers.opencv.org/question/58...