Writing custom OpenCL kernel for CV_32F matrices

Hi,

I'm a beginner with OpenCV's T-API and I'm currently trying to write a custom OpenCL kernel whose output matrix has type CV_32F. I tried to isolate the problem in a small program, but for some reason I get a very different assert message there. In any case, something is seriously wrong with my approach. This is my C++ code:

#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>

#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/highgui.hpp>

std::string code;

std::ifstream fileStream( "test.cl" );

size_t size = 0;

fileStream.seekg( 0, std::ios::end );
size = fileStream.tellg();
fileStream.seekg( 0, std::ios::beg );

code.resize( size );
fileStream.read( &code[0], size );

fileStream.close();

::cv::String errorMsg;

::cv::ocl::ProgramSource source( code );
::cv::ocl::Kernel kernel( "test", source, "", &errorMsg );

if( errorMsg.size() )
    throw std::runtime_error( errorMsg );

cv::UMat src( cv::Size( 640, 480 ), CV_8UC1 );
cv::UMat dst( cv::Size( 640, 480 ), CV_32FC1 );

//fill image...

cv::ocl::KernelArg srcarg = cv::ocl::KernelArg::ReadOnlyNoSize( src, src.channels() );
cv::ocl::KernelArg dstarg = cv::ocl::KernelArg::WriteOnly( dst, dst.channels() );

// pass 2 arrays to the kernel and run it
size_t globalThreads[3] = { (size_t) src.cols, (size_t) src.rows, 1 };
if( !kernel.args( srcarg, dstarg ).run( 2, globalThreads, NULL, false ) )
    std::cerr << "executing kernel failed" << std::endl;
else
{
    std::cout << "SUCCESS!" << std::endl;

    cv::imshow( "src", src );
    cv::imshow( "dst", dst );

    cv::waitKey();
}

My cl file looks like this:

__kernel void test(
        __global const uchar *srcptr, int src_step, int src_offset,
        __global float *dstptr, int dst_step, int dst_offset, int dst_rows, int dst_cols
    )
{
    int x = get_global_id(0);
    int y = get_global_id(1);

    if (x < dst_cols && y < dst_rows)
    {
        int src_index = mad24(y, src_step, mad24(x, sizeof(uchar), src_offset));
        int dst_index = mad24(y, dst_step, mad24(x, sizeof(float), dst_offset));

        __global const uchar *src = (__global const uchar*)(srcptr + src_index);
        __global float* dst = (__global float*)(dstptr + dst_index);

        dst[0] = 1.0f - src[0] / 255.0f;
    }
}

The output this gives me is the following:

SUCCESS!

OpenCV Error: Assertion failed (clEnqueueReadBuffer(q, (cl_mem)u->handle, CL_TRUE, 0, u->size, alignedPtr.getAlignedPtr(), 0, 0, 0) == CL_SUCCESS) in cv::ocl::OpenCLAllocator::map, file C:\build\master_winpack-build-win64-vc14\opencv\modules\core\src\ocl.cpp, line 4773

OpenCV Error: Assertion failed (u->refcount == 0 && "UMat deallocation error: some derived Mat is still alive") in cv::ocl::OpenCLAllocator::deallocate, file C:\build\master_winpack-build-win64-vc14\opencv\modules\core\src\ocl.cpp, line 4528

When I change float to uchar in the CL code and CV_32FC1 to CV_8UC1, it works. I tried to model my kernel on the OpenCL code in the OpenCV core module, but I could not find any samples there that write float data from a kernel. That seems odd, since the library has to deal with floating-point data somewhere, doesn't it? Yet all the core kernels, e.g. in opencl_kernels_core.cpp, take uchar* dstptr as an argument.

1) Why are there no float implementations?

2) How would I create a kernel writing float data?

Thanks!