Elliot_'s profile - activity

2017-05-26 13:30:54 -0600 asked a question How can I deal with cv::matchTemplate crashing for large images?

This problem only happens with UMat arguments, and only when OpenCL is switched on as well.

I am guessing that the program is running out of GPU memory, although that is apparently not supposed to be possible because the driver can provide virtual memory. An alternative explanation is TDR (Windows Timeout Detection and Recovery), but the computer doesn't freeze at any time, so I don't think that is it.
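To sanity-check the out-of-memory guess, here is a rough sketch of how much device memory one call might need. The helper name estimateMatchTemplateBytes is made up for illustration and is not part of OpenCV. It assumes single-channel 8-bit inputs and a 32-bit float result, and it does not count any scratch buffers the OpenCL path may allocate internally, so it is a floor rather than a measurement.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an OpenCV function): lower-bound estimate of the
// device memory used by one matchTemplate call on single-channel 8-bit data.
// Counts only three buffers: image, template, and the float result of size
// (W - w + 1) x (H - h + 1). Internal scratch buffers are NOT included.
std::uint64_t estimateMatchTemplateBytes(std::uint64_t imageW, std::uint64_t imageH,
                                         std::uint64_t templW, std::uint64_t templH)
{
    assert(templW >= 1 && templH >= 1);
    assert(templW <= imageW && templH <= imageH);

    const std::uint64_t kFloatBytes = 4;  // one CV_32F result element
    const std::uint64_t imageBytes = imageW * imageH;
    const std::uint64_t templBytes = templW * templH;
    const std::uint64_t resultBytes =
        (imageW - templW + 1) * (imageH - templH + 1) * kFloatBytes;

    return imageBytes + templBytes + resultBytes;
}
```

The estimate could then be compared against cv::ocl::Device::getDefault().globalMemSize() and maxMemAllocSize(). If the result buffer alone is larger than the maximum single allocation, that might explain a failure even when total memory looks sufficient.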

Of course, the real program will shrink large images before template matching. But because it processes batches, I hope to call matchTemplate from several Windows thread-pool threads to get higher throughput, and that also causes the crash. OpenCL gives us no way of querying the amount of free video memory, so I was thinking of making the number of parallel matchTemplate calls a function of the total GPU memory.
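A minimal sketch of that throttling idea, in plain C++ so it does not depend on any particular thread pool. All names here (maxParallelCalls, GpuGate, GpuSlot) are hypothetical, the per-call byte estimate has to come from somewhere else, and the cap of 64 is an arbitrary choice. The total device memory could be read from cv::ocl::Device::getDefault().globalMemSize().

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical: number of simultaneous calls allowed when only usablePercent
// of the device memory is budgeted and each call needs perCallBytes.
unsigned maxParallelCalls(std::uint64_t deviceBytes, std::uint64_t perCallBytes,
                          unsigned usablePercent)
{
    assert(perCallBytes > 0);
    assert(usablePercent >= 1 && usablePercent <= 100);
    const std::uint64_t budget = deviceBytes / 100 * usablePercent;
    const std::uint64_t calls = budget / perCallBytes;
    if (calls == 0) return 1;  // still allow one call; it may fail on its own
    return calls > 64 ? 64u : static_cast<unsigned>(calls);  // arbitrary cap
}

// Counting gate: at most `slots` threads hold a slot at the same time.
class GpuGate {
public:
    explicit GpuGate(unsigned slots) : free_(slots) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        freed_.wait(lock, [this] { return free_ > 0; });
        --free_;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++free_;
        }
        freed_.notify_one();
    }
    unsigned available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable freed_;
    unsigned free_;
};

// RAII guard: takes a slot on construction and returns it on destruction,
// including when the guarded call throws.
class GpuSlot {
public:
    explicit GpuSlot(GpuGate& gate) : gate_(gate) { gate_.acquire(); }
    ~GpuSlot() { gate_.release(); }
    GpuSlot(const GpuSlot&) = delete;
    GpuSlot& operator=(const GpuSlot&) = delete;

private:
    GpuGate& gate_;
};
```

Each worker thread would construct a GpuSlot just before calling matchTemplate and let it go out of scope afterwards. Since only the total memory is known, not the free amount, the percentage is a guess that probably needs tuning per machine.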

The error comes out of clEnqueueReadBuffer, which returns -4 (CL_MEM_OBJECT_ALLOCATION_FAILURE).
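For logging, a small lookup like the following can turn the raw status into a name. The numeric values are the ones defined in the Khronos CL/cl.h header; the function names clStatusName and isMemoryPressure are made up for this sketch, and only a handful of codes related to resource exhaustion are listed.

```cpp
#include <string>

// Hypothetical helper: name a few OpenCL status codes (values from CL/cl.h).
std::string clStatusName(int status)
{
    switch (status) {
    case 0:   return "CL_SUCCESS";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -61: return "CL_INVALID_BUFFER_SIZE";
    default:  return "unlisted status " + std::to_string(status);
    }
}

// Hypothetical grouping: statuses that suggest the device or host ran short
// of memory or other resources, as opposed to an invalid argument.
bool isMemoryPressure(int status)
{
    return status == -4 || status == -5 || status == -6;
}
```

A status that falls in the memory-pressure group would be a hint to retry with fewer parallel calls or a smaller image, rather than to look for a bad argument.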

Some code:

// OpenCvMyBuildTemplateMatchingTest.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/core/ocl.hpp>

#include <cassert>
#include <iostream>
#include <chrono>
#include <conio.h>



using namespace std::chrono;


milliseconds DoMatTemplateMatching();
milliseconds DoUMatTemplateMatching();
void ContinuousUMatTemplateMatching();

cv::String PatchUri = "patch.bmp";
cv::String PictureUri = "notpatch.bmp";

int _numberOfTimesToRun = 32;
bool _useOpenCl = true;


int main()
{

    milliseconds _totalMatElapsed = milliseconds::zero();
    milliseconds _totalUMatElapsed = milliseconds::zero();

    cv::ocl::setUseOpenCL(_useOpenCl);

    for (int i = 0; i<_numberOfTimesToRun; i++)
    {
        _totalUMatElapsed += DoUMatTemplateMatching();
    }   
    std::cout << "\n";
    std::cout << "\nUMAT template matching took " << _totalUMatElapsed.count() << " milliseconds for "
              << _numberOfTimesToRun << " runs.\n";
    std::cout << "\n";

    for (int i = 0; i < _numberOfTimesToRun; i++)
    {
        _totalMatElapsed += DoMatTemplateMatching();
    }
    std::cout << "\n";
    std::cout << "\nMAT template matching took " << _totalMatElapsed.count() << " milliseconds for "
              << _numberOfTimesToRun << " runs.\n";
    std::cout << "\n";


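    // Reset the accumulator so this run is not added on top of the first UMat total.
    _totalUMatElapsed = milliseconds::zero();
    cv::ocl::setUseOpenCL(false);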

    for (int i = 0; i<_numberOfTimesToRun; i++)
    {
        _totalUMatElapsed += DoUMatTemplateMatching();
    }
    std::cout << "\n";
    std::cout << "\nUMAT without OpenCl template matching took " << _totalUMatElapsed.count() << " milliseconds for "
              << _numberOfTimesToRun << " runs.\n";
    std::cout << "\n";

    getch();
    return 0;
}



milliseconds DoMatTemplateMatching()
{
    cv::Mat _picture = cv::imread(PictureUri);
    cv::Mat _patch = cv::imread(PatchUri);

    cv::Mat _result;

    milliseconds _startTime = duration_cast< milliseconds >(system_clock::now().time_since_epoch());

    cv::matchTemplate(_picture, _patch, _result, CV_TM_SQDIFF); // same method as the UMat version (value 0)

    milliseconds _endTime = duration_cast< milliseconds >(system_clock::now().time_since_epoch());

    milliseconds _deltaTime = _endTime - _startTime;

    return _deltaTime;
}


milliseconds DoUMatTemplateMatching()
{


    cv::Mat _picture = cv::imread(PictureUri);
    cv::Mat _patch = cv::imread(PatchUri);


    milliseconds _startTime = duration_cast< milliseconds >(system_clock::now().time_since_epoch());


    /// need to convert to greyscale or else get that error:
    cv::cvtColor(_picture, _picture, CV_BGR2GRAY);
    cv::cvtColor(_patch, _patch, CV_BGR2GRAY);


    cv::UMat _uPicture = _picture.getUMat(cv::ACCESS_READ);
    cv::UMat _uPatch = _patch.getUMat(cv::ACCESS_READ);

    assert(_uPatch.type() == _uPicture.type());
    std::cout << _uPatch.type();


    cv::UMat _result;


    cv::matchTemplate(_uPicture, _uPatch, _result, CV_TM_SQDIFF);


    milliseconds _endTime = duration_cast< milliseconds >(system_clock::now().time_since_epoch());

    milliseconds _deltaTime = _endTime - _startTime;

    _picture.release();
    _patch.release();


    return _deltaTime;

}

void ContinuousUMatTemplateMatching()
{
    while (true)
    {
        std::cout << "\n" << DoUMatTemplateMatching().count() << "\n";
    }
}
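A side note on the timing in the code above: system_clock can be adjusted while the program runs, so a monotonic clock is usually the safer choice for benchmarks. Below is a minimal sketch of a timing wrapper; the name timeMs is made up. For the UMat case the callable would probably also need to force the queued OpenCL work to finish (for example with cv::ocl::finish() or by copying the result into a cv::Mat), otherwise the measurement may only cover enqueueing. That part is my assumption and is not shown here.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: run a callable once and return the elapsed time in
// milliseconds, measured with steady_clock, which never jumps backwards.
template <typename F>
std::chrono::milliseconds timeMs(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
}
```

With this, image loading and colour conversion can stay outside the lambda so that the Mat and UMat runs time the same amount of work.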

The images I used in the above test: https://drive.google.com/file/d/0B_Ls... https://drive.google.com/file/d/0B_Ls...

Some system info:

Number of platforms ...
2017-05-25 09:33:12 -0600 answered a question GaussianBlur and Canny execution times are much longer on T-API

While investigating the same thing, I looked at the code of cv::GaussianBlur(...), and it looks as though it only uses OpenCL when the kernel size is either 3x3 or 5x5. Can this be right?

2017-05-15 11:10:52 -0600 received badge  Enthusiast