Writing custom OpenCL kernel for CV_32F matrices [closed]

asked 2017-03-15 09:00:31 -0500

iko79

updated 2017-03-15 09:28:49 -0500


I'm a beginner with OpenCV's TAPI and right now I'm trying to write a custom OpenCL kernel that writes to output matrices of type CV_32F. I tried to isolate the problem in a small program, but for some reason I get a very different assert message there. Anyway, something is seriously wrong with my approach. This is my C++ code:

std::string code;

std::ifstream fileStream( "test.cl" );

// read the whole kernel source file into 'code'
fileStream.seekg( 0, std::ios::end );
size_t size = fileStream.tellg();
fileStream.seekg( 0, std::ios::beg );

code.resize( size );
fileStream.read( &code[0], size );


::cv::String errorMsg;

::cv::ocl::ProgramSource source( code );
::cv::ocl::Kernel kernel( "test", source, "", &errorMsg );

if( errorMsg.size() )
    throw std::runtime_error( errorMsg );

cv::UMat src( cv::Size( 640, 480 ), CV_8UC1 );
cv::UMat dst( cv::Size( 640, 480 ), CV_32FC1 );

//fill image...

cv::ocl::KernelArg srcarg = cv::ocl::KernelArg::ReadOnlyNoSize( src, src.channels() );
cv::ocl::KernelArg dstarg = cv::ocl::KernelArg::WriteOnly( dst, dst.channels() );

// pass 2 arrays to the kernel and run it
size_t globalThreads[3] = { (size_t) src.cols, (size_t) src.rows, 1 };
if( !kernel.args( srcarg, dstarg ).run( 2, globalThreads, NULL, false ) )
    std::cerr << "executing kernel failed" << std::endl;
else
    std::cout << "SUCCESS!" << std::endl;

cv::imshow( "src", src );
cv::imshow( "dst", dst );


My cl file looks like this:

__kernel void test(
        __global const uchar *srcptr, int src_step, int src_offset,
        __global float *dstptr, int dst_step, int dst_offset, int dst_rows, int dst_cols )
{
    int x = get_global_id(0);
    int y = get_global_id(1);

    if (x < dst_cols && y < dst_rows)
    {
        int src_index = mad24(y, src_step, mad24(x, sizeof(uchar), src_offset));
        int dst_index = mad24(y, dst_step, mad24(x, sizeof(float), dst_offset));

        __global const uchar *src = (__global const uchar*)(srcptr + src_index);
        __global float *dst = (__global float*)(dstptr + dst_index);

        dst[0] = 1.0f - src[0] / 255.0f;
    }
}

The output this gives me is the following:


OpenCV Error: Assertion failed (clEnqueueReadBuffer(q, (cl_mem)u->handle, CL_TRUE, 0, u->size, alignedPtr.getAlignedPtr(), 0, 0, 0) == CL_SUCCESS) in cv::ocl::OpenCLAllocator::map, file C:\build\master_winpack-build-win64-vc14\opencv\modules\core\src\ocl.cpp, line 4773

OpenCV Error: Assertion failed (u->refcount == 0 && "UMat deallocation error: some derived Mat is still alive") in cv::ocl::OpenCLAllocator::deallocate, file C:\build\master_winpack-build-win64-vc14\opencv\modules\core\src\ocl.cpp, line 4528

When I change float to uchar in the CL code and CV_32FC1 to CV_8UC1, it works. I tried to base my kernel on the CL code in the OpenCV core module, but I could not find any samples that write float data in the OpenCL source code. Which is weird, since you do have to deal with floating-point data at some point, don't you? But all your kernels, e.g. in opencl_kernels_core.cpp, take uchar *dstptr as an argument.

1) Why are there no float implementations?

2) How would I write a kernel that outputs float data?



Closed for the following reason: duplicate question, by iko79
close date 2017-03-15 10:16:48.979497