
iko79's profile - activity

2019-07-17 15:42:53 -0500 received badge  Popular Question (source)
2018-08-08 03:09:31 -0500 received badge  Famous Question (source)
2018-07-17 06:31:25 -0500 received badge  Nice Answer (source)
2017-08-25 14:26:00 -0500 received badge  Notable Question (source)
2017-03-29 02:47:09 -0500 asked a question OpenCL/OpenGL interop using UMat


I'm currently facing the challenge of transferring images from OpenCV with OpenCL to a CUDA surface. Since there seems to be no way of direct OpenCL/CUDA interoperability, my idea is to use OpenGL as a detour, as there are several ways of OpenCL/OpenGL interop as well as CUDA/OpenGL interop.

As my starting point is the image data of a cv::UMat, I am a bit stuck, since all three ways of OpenCL/OpenGL interop I could find seem to require a preexisting OpenGL texture or buffer from which the OpenCL buffers are created. Using OpenCV, I've got things the other way around: the cv::UMat creates the OpenCL objects for me and I don't seem to have any influence on that. It seems I cannot even create a cv::UMat as a wrapper for a cl_mem handle I manage myself (similar to using user-managed data in a cv::Mat).

Do you see any reasonable way of doing what I'm trying to do (short of touching the OpenCV core code)?

2017-03-29 02:30:24 -0500 commented question OpenCL Kernels not cached in OpenCV 3.2?

You're right, looking at the timing it seems to do what it's supposed to. I am still confused about 1) what a Kernel ID is, 2) why the kernel gets assigned a new one in the OpenCV implementation, and 3) whether or not this could be a potential problem.

2017-03-23 04:46:53 -0500 asked a question OpenCL Kernels not cached in OpenCV 3.2?


I noticed something that's ringing my alarm bells: in the OpenCV 3.0 overview slides you state that the OpenCL "kernel is compiled only once and cached". However, when I use the suggested code and profile with Nsight for Visual Studio, I can see that every call of a processing method/function with OpenCL support seems to create a new OpenCL kernel object, since every call is associated with a new Kernel ID. E.g. if I run this code here...

cv::UMat m1;
cv::UMat m2;
cv::Mat kernel = cv::getStructuringElement( cv::MORPH_ELLIPSE, cv::Size( 5, 5 ), cv::Point( 3, 3 ) );

cv::imread( "frame.tiff" ).copyTo( m1 );

for( int i = 0; i < 10; i++ )
    cv::morphologyEx( m1, m1, cv::MORPH_DILATE, kernel );

...I can see ten calls of "morph" like this:

Kernel ID   Creation Time (μs)  Lifetime (μs)   Kernel Name Program ID  Count   
1           144,977.128         2,639,412.861   morph       1           1       
2           149,341.788         2,635,090.195   morph       1           1       
3           150,546.564         2,633,925.752   morph       1           1       
4           151,932.220         2,632,580.169   morph       1           1       
5           153,527.674         2,631,024.282   morph       1           1       
6           155,105.123         2,629,485.587   morph       1           1       
7           156,673.963         2,627,955.218   morph       1           1       
8           158,268.724         2,626,399.276   morph       1           1       
9           160,592.615         2,624,114.423   morph       1           1       
10          161,995.603         2,622,750.544   morph       1           1

That said, I didn't step into the OpenCV code with the debugger, because right now I don't have time for this, and I also didn't compare against the OpenCV 3.0 source code. I'm also by no means an expert in OpenCL and not entirely sure what a Kernel ID is or at what point it's issued, so I can only speculate about what's going on, but to me it looks like kernel objects are created over and over again, and I'm pretty sure that isn't good.

Are the slides outdated? Is this a bug? It certainly doesn't seem like it's meant that way.

2017-03-17 04:11:12 -0500 commented question TAPI -> no performance gain

idk, like I said, cv::ocl::haveOpenCL() tells me it was built with OpenCL support. Also, I'm using some custom CL code with cv::ocl::Kernel which definitely is invoked. In the OpenCV source code I saw that, if built with HAVE_OPENCL, the functions try to run CL code via the CV_OCL_RUN macro; however, a few conditions are checked beforehand, otherwise it falls back to the CPU. It seems I have no possibility to figure out whether the GPU or the CPU was actually used, other than stepping into each and every OpenCL function with the debugger, am I right?

2017-03-16 11:11:26 -0500 asked a question TAPI -> no performance gain

TAPI absolute beginner here. I ported my CV code to use UMat instead of Mat since my CPU was at its limit; especially the morphological operations seemed to consume quite some computing power.

I cannot see any change in my framerate; it is exactly the same whether I use TAPI or not, and Process Explorer reports no GPU usage whatsoever. I did a small test with a few calls of dilation and closing on a full-HD image -- no effect.

Did I miss something I have to do in order to enable TAPI? I'm using the latest OpenCV 3.2 build for Windows and a GTX 980 with driver 378.49. cv::ocl::haveOpenCL() and cv::ocl::useOpenCL() both return true, and cv::ocl::Context::getDefault().device( 0 ) also gives me the correct device; everything looks good.

Any ideas? Are there any how-tos, best practices, or common pitfalls for using TAPI?

2017-03-15 09:00:31 -0500 asked a question Writing custom OpenCL kernel for CV_32F matrices


I'm a beginner in using TAPI for OpenCV and right now I'm trying to write a custom OpenCL kernel for output matrices of type CV_32F. I tried to isolate the problem in a small program, but for some reason I get a very different assert message there. Anyway, something is seriously wrong with my approach. This is my C++ code:

std::string code;

std::ifstream fileStream( "" );

size_t size = 0;

fileStream.seekg( 0, std::ios::end );
size = fileStream.tellg();
fileStream.seekg( 0, std::ios::beg );

code.resize( size );
 &code[0], size );


::cv::String errorMsg;

::cv::ocl::ProgramSource source( code );
::cv::ocl::Kernel kernel( "test", source, "", &errorMsg );

if( errorMsg.size() )
    throw std::runtime_error( errorMsg );

cv::UMat src( cv::Size( 640, 480 ), CV_8UC1 );
cv::UMat dst( cv::Size( 640, 480 ), CV_32FC1 );

//fill image...

cv::ocl::KernelArg srcarg = cv::ocl::KernelArg::ReadOnlyNoSize( src, src.channels() );
cv::ocl::KernelArg dstarg = cv::ocl::KernelArg::WriteOnly( dst, dst.channels() );

// pass 2 arrays to the kernel and run it
size_t globalThreads[3] = { (size_t) src.cols, (size_t) src.rows, 1 };
if( !kernel.args( srcarg, dstarg ).run( 2, globalThreads, NULL, false ) )
    std::cerr << "executing kernel failed" << std::endl;
else
{
    std::cout << "SUCCESS!" << std::endl;

    cv::imshow( "src", src );
    cv::imshow( "dst", dst );
}


My cl file looks like this:

__kernel void test(
        __global const uchar *srcptr, int src_step, int src_offset,
        __global float *dstptr, int dst_step, int dst_offset, int dst_rows, int dst_cols )
{
    int x = get_global_id(0);
    int y = get_global_id(1);

    if (x < dst_cols && y < dst_rows)
    {
        int src_index = mad24(y, src_step, mad24(x, sizeof(uchar), src_offset));
        int dst_index = mad24(y, dst_step, mad24(x, sizeof(float), dst_offset));

        __global const uchar *src = (__global const uchar*)(srcptr + src_index);
        __global float* dst = (__global float*)(dstptr + dst_index);

        dst[0] = 1.0f - src[0] / 255.0f;
    }
}

The output this gives me is the following:


OpenCV Error: Assertion failed (clEnqueueReadBuffer(q, (cl_mem)u->handle, CL_TRUE, 0, u->size, alignedPtr.getAlignedPtr(), 0, 0, 0) == CL_SUCCESS) in cv::ocl::OpenCLAllocator::map, file C:\build\master_winpack-build-win64-vc14\opencv\modules\core\src\ocl.cpp, line 4773

OpenCV Error: Assertion failed (u->refcount == 0 && "UMat deallocation error: some derived Mat is still alive") in cv::ocl::OpenCLAllocator::deallocate, file C:\build\master_winpack-build-win64-vc14\opencv\modules\core\src\ocl.cpp, line 4528

When I change float to uchar in the CL code and CV_32FC1 to CV_8UC1, it works. I tried to model my code on the CL kernels of the OpenCV core, but I could not find any samples writing float data in the OpenCL sources. Which is weird, since you do have to deal with floating-point data, don't you? But all your kernels, e.g. in opencl_kernels_core.cpp, take uchar* dstptr as an argument.

1) Why are there no float implementations?

2) How would I create a kernel writing float data?


2017-03-15 06:51:46 -0500 commented question UMat/Mat lifetime warning

I'll try to find it and extract it somehow; the codebase is pretty huge, so I'm afraid this will be difficult. I was hoping there are some general guidelines for using the ocl implementation of OpenCV that could point me to a possible problem... Is there something like a how-to/troubleshooting/common-mistakes guide for porting to ocl?

2017-03-15 05:56:25 -0500 commented answer Operators for cv::UMat

Thanks for your input. I get correct results using the following: x = x.mul( 2.1 );. I don't know, however, if this adds overhead. Does extra copying take place due to the assignment operator?

2017-03-15 05:44:08 -0500 asked a question UMat/Mat lifetime warning

Hi, I'm currently trying to port my OpenCV 3.1 code to use OpenCL by replacing Mat objects with UMat and making adjustments accordingly. Now I see the following warning:

! OPENCV warning: getUMat()/getMat() call chain possible problem.
!                 Base object is dead, while nested/derived object is still alive or processed.
!                 Please check lifetime of UMat/Mat objects!

What does this actually mean? I had a look into the source code of umatrix.cpp and saw that the warning is shown in the destructor when u->refcount is zero and u->urefcount is greater than zero, but also when both are zero.

I'm not at all familiar with the in-depth implementation of OpenCV, let alone its ocl part, so I have trouble understanding what I'm doing wrong. I would highly appreciate any related hints, thanks!

2017-03-14 12:55:45 -0500 received badge  Popular Question (source)
2017-03-14 10:56:17 -0500 asked a question Operators for cv::UMat

I noticed that there are some code incompatibilities between cv::Mat and cv::UMat; right now I'm stuck on some missing operators. I found functions for element-wise operations with other matrices, but what is the best way to, e.g., scale the values of a UMat by a float/double, or to add a fixed value to all elements?

2017-02-17 06:00:59 -0500 commented question Setting VideoCapture resolution fails

Am I understanding you correctly that you suggest to capture 640x480 and then resize to Full HD?

2017-02-17 02:55:03 -0500 received badge  Enthusiast
2016-03-10 04:22:42 -0500 commented question Setting VideoCapture resolution fails

It does support full HD; I tried with AMCap, VLC, and also with my DirectShow code - no problems whatsoever. So you're saying this works for you? I'm asking because I can't remember a single time I succeeded in doing this, also with other cameras, so I thought this is not properly supported in OpenCV, just like enumerating cameras. But I was still using the old C interface until recently, so I thought that was the reason.

2016-03-09 05:18:38 -0500 asked a question Setting VideoCapture resolution fails


I'm trying to set the VideoCapture resolution as follows, but it doesn't seem to do anything. The camera does support the requested format, and I also tried different resolutions. No matter what I do, my VideoCapture is created with VGA resolution. Also, some other settings I'm trying to query do not seem to be supported.

VideoCapture *capture = new VideoCapture(0);

if( !capture->set( CAP_PROP_FRAME_WIDTH, 1920 ) || !capture->set( CAP_PROP_FRAME_HEIGHT, 1080 ) )
    // does not print - both set calls return 'true'
    cerr << "setting resolution to full HD failed!" << std::endl;

double fps = capture->get( CAP_PROP_POS_FRAMES );
int width = capture->get( CAP_PROP_FRAME_WIDTH ); // returns 640
int height = capture->get( CAP_PROP_FRAME_HEIGHT ); // returns 480
int format = capture->get( CAP_PROP_FORMAT );

I remember these things never worked for me in the past; I'm wondering if this is implemented at all or if I'm making a mistake. OS: Windows 8.1, camera: MS LifeCam Studio, OpenCV: 3.1

Thanks, iko

2016-02-24 09:26:22 -0500 received badge  Teacher (source)
2016-02-24 07:41:43 -0500 received badge  Self-Learner (source)
2016-02-24 07:40:24 -0500 answered a question Most efficient way to clear an image with C++ interface?

Okay, in case anybody else is wondering about the same, here are my findings:

As pointed out in the comments already (thanks), cv::Mat::zeros seems to replace cvZero, and cv::Mat::setTo replaces cvSet. What I have now learned about memory management in the C++ interface clarifies some of the obscurity. From diving into the source code, I learned that e.g. myMat = cv::Mat::zeros( ... );, while looking like an inefficient assignment, in reality isn't one. It rather creates an expression object which, in combination with the overloaded assignment operator for Mat = MatExpr, just applies the operation to the existing matrix, in case size and format match.

I must say that, while convenient, the C++ interface is not really intuitive. Things like this aren't self-explanatory, and I think they should be pointed out much more clearly to OpenCV beginners, especially those with a strong C++ background. To a general programming beginner it might be fine, but for experienced C++ programmers, writing some of these expressions surely sets off alarm bells. In case I missed a detailed explanation of these mechanisms in the documentation, I would be glad if somebody could point me to it.

2016-02-24 03:22:29 -0500 commented question Most efficient way to clear an image with C++ interface?

Frankly, I don't understand what you're asking - I use cvZero exactly as I wrote above: cvZero( im );. How else would I use it? The reason I'm worried about unnecessary memory allocation and deallocation is of course performance, which is traditionally a consideration in most computer vision applications. I don't want to allocate and deallocate hundreds of matrices each frame. When I use a computer vision library, I really try to use it in the most efficient way and avoid unnecessary operations.

2016-02-23 08:50:16 -0500 commented question Most efficient way to clear an image with C++ interface?

Thanks, I misread. Not sure what blocks are for, though. As a matter of fact, setTo is a lot slower compared to cvZero. Here is the output of a very basic benchmark I did (image size is full HD, 8-bit single channel):

1000 x testMat.setTo( cv::Scalar( 0x00 ) ); took 99.3829ms
1000 x testMat = cv::Mat::zeros( testMat.size(), testMat.type() ); took 58.3239ms
1000 x cvZero( im ); took 60.2803ms

Mat::zeros is, however, slightly faster than cvZero. Mat::setTo, on the other hand, is a lot faster than the old C API's cvSet:

1000 x testMat.setTo( cv::Scalar( 0xff ) ); took 100.06ms
1000 x cvSet( im, cvScalar( 0xff ) ); took 470.597ms

2016-02-22 06:06:25 -0500 asked a question Most efficient way to clear an image with C++ interface?

I'm currently porting my old OpenCV C code to the C++ interface of OpenCV 2/3 and I'm not quite sure about the equivalents of some old functions. Pretty early on I ran into an issue with cvZero. The only possibility I found was to set the matrix content via Mat::setTo. Since it has to handle multi-channel scalars and different data types, setTo iterates through all elements of the matrix and sets them one after another, while cvZero basically did a memset. Also, I read that using setTo is a lot slower compared to simply doing this:

myMat = cv::Mat::zeros( myMat.size(), myMat.type() );

Still, I'm not sure whether this allocates a new matrix and frees the old one, which I also wouldn't want. Of course I could always write

memset(, 0, myMat.size().width * myMat.size().height * myMat.depth() );

but isn't there a proper convenience function for this?

So I'm wondering what the recommended way is with the C++ interface, in case I just want to clear my image to black.


2016-02-22 05:58:25 -0500 received badge  Critic (source)
2016-01-22 06:03:55 -0500 received badge  Student (source)
2016-01-22 05:04:55 -0500 asked a question OpenGL matrix from OpenCV matrix

Hi folks,

I desperately need some guidance here. This is actually supposed to be very simple as long as one has the necessary information at hand, but I can't find it in the documentation:

What I'm trying to do is to get a valid OpenGL matrix from an OpenCV matrix. I've tried a lot, but somehow I can't seem to get it right since there are too many unknowns. In detail, I'm using solvePnP to figure out the pose of a checker pattern, and I'm trying to calculate the camera-to-object transformation matrix.

I would really appreciate somebody posting just the few lines of code it takes to do this correctly.

Bonus questions (but really, I'm fine with just a few lines of code or a link to a working sample or detailed explanation; I'm confident that I can figure this out on my own as soon as I've got the code up and running):

  1. Are you using row-major or column-major layout?
  2. Are you using right handed or left handed coordinate frames?
  3. Which axis directions are you commonly using, i.e. what's up, what's right, what's forward?

Thanks a ton, iko

2016-01-13 03:09:11 -0500 received badge  Supporter (source)
2014-04-03 12:01:54 -0500 received badge  Editor (source)
2014-04-03 12:00:48 -0500 asked a question cvCalibrateCamera2 producing varying results

Hi folks,

I have a question that has been asked several times before (although not here) but doesn't seem to have received an answer so far; at least I couldn't find one.

I've been trying to get accurate intrinsic parameters for my cameras (IR sensors of Kinects and Asus Xtions) for some time now, and I always seem to fail because cvCalibrateCamera2 reports different results every time. I use a checkerboard of 8x6 30mm squares and capture 20 VGA frames for the calibration. In the past I tried calculating the intrinsics from scratch (flags=0) and found the results not satisfying at all. Then I tried CV_CALIB_USE_INTRINSIC_GUESS, using initial parameters of f=570.34, cx=320, cy=240. (Aside from that, I use the flags for fixing the aspect ratio, zero tangential distortion, and k1 through k3.) From this I get strongly varying results again. I did several iterations, each building on the result of the previous one (I realize that the matrices here are transposed):

initial matrix: [ 570, 0, 0] [ 0, 570, 0] [ 320, 240, 1]

after iteration 1: [ 595, 0, 0] [ 0, 595, 0] [ 320, 239, 1]

after iteration 2: [ 561, 0, 0] [ 0, 561, 0] [ 317, 230, 1]

after iteration 3: [ 609, 0, 0] [ 0, 609, 0] [ 325, 239, 1]

after iteration 4: [ 559, 0, 0] [ 0, 559, 0] [ 326, 226, 1]

after iteration 5: [ 564, 0, 0] [ 0, 564, 0] [ 313, 237, 1]

after iteration 6: [ 569, 0, 0] [ 0, 569, 0] [ 319, 228, 1]

Does this make any sense to you? I would highly appreciate somebody giving me some advice or even a guess at what I might be doing wrong, since I've been struggling with this for some time.