HAAR Cascade opencl and Nvidia GPU - Code question

Hi, I've been trying to get a performance lift from detector.detectMultiScale in OpenCV 3.2.

It appears that OpenCL HAAR cascade classifiers are only "supported" on AMD or Intel OpenCL devices.

With an NVIDIA GPU and OpenCL enabled, the code simply does not take the OpenCL path. Here is the offending block in cascadedetect.cpp,

at line 606. If your device is not Intel or AMD, the localSize of the featureEvaluator is left at Size(0, 0):

if (ocl::haveOpenCL())
{
    if (ocl::Device::getDefault().isAMD() || ocl::Device::getDefault().isIntel())
    {
        localSize = Size(8, 8);
        lbufSize = Size(origWinSize.width + localSize.width,
                        origWinSize.height + localSize.height);
        if (lbufSize.area() > 1024)
            lbufSize = Size(0, 0);
    }
}
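For reference, the change I tested is simply dropping the vendor check so the same local/buffer sizes are used on any OpenCL device. This is a sketch, not an official fix; it assumes an 8x8 work-group size is valid on NVIDIA hardware:

```cpp
if (ocl::haveOpenCL())
{
    // Vendor check removed: use the same work-group size on any
    // OpenCL device (assumes an 8x8 local size is supported, which
    // it is on current NVIDIA hardware).
    localSize = Size(8, 8);
    lbufSize = Size(origWinSize.width + localSize.width,
                    origWinSize.height + localSize.height);
    if (lbufSize.area() > 1024)
        lbufSize = Size(0, 0);
}
```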

Why is this?

Leaving it at Size(0, 0) means the area() call later on returns 0, rendering use_ocl == false:

    bool use_ocl = tryOpenCL && ocl::useOpenCL() &&
        OCL_FORCE_CHECK(_image.isUMat()) &&
        featureEvaluator->getLocalSize().area() > 0 &&
        (data.minNodesPerTree == data.maxNodesPerTree) &&
        !isOldFormatCascade() &&
        maskGenerator.empty() &&
        !outputRejectLevels;

I have tested removing the device vendor checks, and detection appears to function on my NVIDIA GPU.

My question boils down to: is there a genuine reason for not supporting OpenCL HAAR cascades on NVIDIA GPUs?

Since the CUDA cascade classifier package is out of date and doesn't really function, OpenCL seems a viable candidate for a lift in detection performance.
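For anyone wanting to reproduce this, here is a minimal sketch of how I exercise the OpenCL path via the transparent API. Note that detectMultiScale only considers OpenCL when the input is a UMat, and the cascade/image file names below are placeholders:

```cpp
#include <opencv2/core/ocl.hpp>
#include <opencv2/objdetect.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>
#include <vector>

int main()
{
    cv::ocl::setUseOpenCL(true);
    std::cout << "OpenCL device: "
              << cv::ocl::Device::getDefault().name() << std::endl;

    // Placeholder file names; substitute your own cascade and image.
    cv::CascadeClassifier detector("haarcascade_frontalface_default.xml");
    if (detector.empty())
        return 1;

    // Copying into a UMat is what makes the transparent API (and hence
    // the use_ocl check above) come into play.
    cv::UMat gray;
    cv::imread("input.jpg", cv::IMREAD_GRAYSCALE).copyTo(gray);
    cv::equalizeHist(gray, gray);

    std::vector<cv::Rect> detections;
    detector.detectMultiScale(gray, detections);
    std::cout << "detections: " << detections.size() << std::endl;
    return 0;
}
```

With a stock build on an NVIDIA device, this still falls back to the CPU path because of the vendor check; with the check removed it runs through OpenCL.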

There are no comments in the code explaining why only AMD and Intel OpenCL devices are allowed here.

Please advise if I should take this to GitHub.