HAAR Cascade OpenCL and Nvidia GPU - Code question
Hi, I've been trying to get a performance lift in detector.detectMultiScale
in OpenCV 3.2.
It appears that the OpenCL HAAR cascade classifiers are only "supported" on AMD or Intel OpenCL devices.
With an Nvidia GPU and OpenCL enabled, the code simply does not work. Here is the offending block in cascadedetect.cpp,
line 606. It appears that if your device is not Intel or AMD, the localSize of the featureEvaluator ends up as Size(0, 0):
    if (ocl::haveOpenCL())
    {
        if (ocl::Device::getDefault().isAMD() || ocl::Device::getDefault().isIntel())
        {
            localSize = Size(8, 8);
            lbufSize = Size(origWinSize.width + localSize.width,
                            origWinSize.height + localSize.height);
            if (lbufSize.area() > 1024)
                lbufSize = Size(0, 0);
        }
    }
Why is this?
Leaving localSize at Size(0, 0) means that the area() call later on returns 0, which makes use_ocl false:
    bool use_ocl = tryOpenCL && ocl::useOpenCL() &&
                   OCL_FORCE_CHECK(_image.isUMat()) &&
                   featureEvaluator->getLocalSize().area() > 0 &&
                   (data.minNodesPerTree == data.maxNodesPerTree) &&
                   !isOldFormatCascade() &&
                   maskGenerator.empty() &&
                   !outputRejectLevels;
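
To make the interaction between the two snippets concrete, here is a minimal stand-alone sketch of the gating logic. It does not use OpenCV: Size is a hypothetical stand-in for cv::Size, and pickLocalSize, useOcl and demonstrateGate are illustrative names that do not exist in the library. It also assumes localSize starts out as Size(0, 0) and folds every other term of the use_ocl expression into a single flag.

```cpp
#include <cassert>

// Hypothetical stand-in for cv::Size; only what this sketch needs.
struct Size {
    int width;
    int height;
    Size(int w = 0, int h = 0) : width(w), height(h) {}
    int area() const { return width * height; }
};

// Mirrors the quoted vendor gate. Assumption: localSize starts out as
// Size(0, 0), as described in the post.
Size pickLocalSize(bool haveOpenCL, bool isAMD, bool isIntel) {
    Size localSize(0, 0);
    if (haveOpenCL && (isAMD || isIntel))
        localSize = Size(8, 8);
    return localSize;
}

// Reduced form of the use_ocl expression: every term other than the
// localSize check is folded into one flag.
bool useOcl(bool otherConditionsHold, const Size& localSize) {
    return otherConditionsHold && localSize.area() > 0;
}

// Walks through the three vendor cases.
void demonstrateGate() {
    assert(useOcl(true, pickLocalSize(true, true, false)));   // AMD
    assert(useOcl(true, pickLocalSize(true, false, true)));   // Intel
    assert(!useOcl(true, pickLocalSize(true, false, false))); // other vendor, e.g. Nvidia
}
```

In this model, a non-AMD, non-Intel device never gets a non-zero localSize, so the OpenCL path is rejected even when every other condition holds.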
I have tested removing the device vendor checks, and it appears to function.
My question boils down to: is there a genuine reason for not supporting OpenCL HAAR cascades on Nvidia GPUs?
Since the CUDA cascade classifier package is out of date and really doesn't function, OpenCL seems to be a viable candidate for a lift in detection performance.
There appear to be no comments in the code explaining why only AMD and Intel OpenCL devices are allowed here.
Please advise if I should take this to GitHub.
Maybe you should update your code before posting an issue.
Do you mean update to 3.3 or pull master from the repo?
Pull master (some changes are recent).
I've looked at the changes and they don't seem relevant. I'll build master and test.
I've pushed a fix and the OpenCV team has merged it. Anyone who wants this speed-up will have to build from master.