Geoff McIver's profile - activity

2017-10-26 18:00:20 -0600 received badge  Enthusiast
2017-10-11 15:17:48 -0600 commented answer HAAR Cascade opencl and Nvidia GPU - Code question

I'm too new to answer my own question. :) It seems a pull request on GitHub was faster. :)

2017-10-11 12:27:16 -0600 commented question HAAR Cascade opencl and Nvidia GPU - Code question

I've pushed a fix and the OpenCV team has merged it. Anyone who wants this speed-up will have to build off master.

2017-10-11 07:20:49 -0600 marked best answer HAAR Cascade opencl and Nvidia GPU - Code question

Hi, I've been trying to get a performance lift from detector.detectMultiScale in OpenCV 3.2.

It appears that OpenCL HAAR cascade classifiers are only "supported" on AMD or Intel OpenCL devices.

With an Nvidia GPU and OpenCL enabled, the code simply does not work. Here is the offending block in cascadedetect.cpp, around line 606. If your device is not Intel or AMD, the localSize of the featureEvaluator is left at Size(0, 0):

if (ocl::haveOpenCL())
{
    if (ocl::Device::getDefault().isAMD() || ocl::Device::getDefault().isIntel())
    {
        localSize = Size(8, 8);
        lbufSize = Size(origWinSize.width + localSize.width,
                        origWinSize.height + localSize.height);
        if (lbufSize.area() > 1024)
            lbufSize = Size(0, 0);
    }
}

Why is this?

Setting this to Size(0, 0) means that the later area() call returns 0, which makes use_ocl evaluate to false:

    bool use_ocl = tryOpenCL && ocl::useOpenCL() &&
        OCL_FORCE_CHECK(_image.isUMat()) &&
        featureEvaluator->getLocalSize().area() > 0 &&
        (data.minNodesPerTree == data.maxNodesPerTree) &&
        !isOldFormatCascade() &&
        maskGenerator.empty() &&
        !outputRejectLevels;

I have tested removing the device vendor checks and it appears to function.
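For reference, here is roughly the change I tested: the vendor gate is dropped so any OpenCL device gets a local work size, while the existing 1024-element buffer guard is kept as-is.

    if (ocl::haveOpenCL())
    {
        // No vendor check: configure the local work size for any OpenCL device.
        localSize = Size(8, 8);
        lbufSize = Size(origWinSize.width + localSize.width,
                        origWinSize.height + localSize.height);
        // Keep the guard: fall back to the non-OpenCL path if the local
        // buffer would exceed 1024 elements.
        if (lbufSize.area() > 1024)
            lbufSize = Size(0, 0);
    }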

My question boils down to: is there a genuine reason for not supporting OpenCL HAAR cascades on Nvidia GPUs?

Since the CUDA cascade classifier package is out of date and doesn't really function, OpenCL seems like a viable candidate for a lift in detection performance.
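For anyone who wants to verify this: note that the use_ocl check above also requires the input to be a UMat, so the image has to be uploaded first. A minimal sketch of how I'm exercising the path (the cascade file and image names are just placeholders):

    #include <opencv2/core/ocl.hpp>
    #include <opencv2/imgcodecs.hpp>
    #include <opencv2/objdetect.hpp>
    #include <vector>

    int main()
    {
        cv::ocl::setUseOpenCL(true); // make sure OpenCL hasn't been disabled

        cv::CascadeClassifier detector("haarcascade_frontalface_default.xml");
        if (detector.empty())
            return 1;

        // detectMultiScale only takes the OpenCL path when the input is a
        // UMat (see the _image.isUMat() check above), so upload the frame.
        cv::UMat gray;
        cv::imread("frame.png", cv::IMREAD_GRAYSCALE).copyTo(gray);
        if (gray.empty())
            return 1;

        std::vector<cv::Rect> objects;
        detector.detectMultiScale(gray, objects);
        return 0;
    }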

There are no comments in the code explaining why only AMD and Intel OpenCL devices are allowed here.

Please advise if I should take this to GitHub.

2017-10-10 04:33:22 -0600 commented question HAAR Cascade opencl and Nvidia GPU - Code question

I've looked at the changes and they don't seem relevant. I'll build master and test.

2017-10-10 04:12:28 -0600 commented question HAAR Cascade opencl and Nvidia GPU - Code question

Do you mean update to 3.3 or pull master from the repo?

2017-10-10 03:41:41 -0600 received badge  Student
2017-10-10 03:40:18 -0600 asked a question HAAR Cascade opencl and Nvidia GPU - Code question
