OpenCV 3.0, cv::Canny (TBB version) fails while running in parallel threads

asked 2015-09-04 09:18:24 -0500

MFmaniak

updated 2015-09-04 09:32:51 -0500

Hello, everyone. I have a project using Qt 5.5, TBB 4.4 and OpenCV 3.0.0 (compiled with IPP and TBB), built with MSVS 2012 Update 5. I run 3 identical parallel threads (the issue also occurs with 2). Each of them processes a 100 Hz video stream (1280x1024, 8-bit) coming from my device. During frame preprocessing I call cv::Canny(src, dst, 50, 200, 3, true) in each of the 3 parallel processing loops. An exception is raised in canny.cpp, in the following section of code:

#define CANNY_PUSH_SERIAL(d)    *(d) = uchar(2), borderPeaks.push(d)

// now track the edges (hysteresis thresholding)
uchar* m;
while (borderPeaks.try_pop(m))
{
    // push every still-unvisited 8-connected neighbour of m
    if (!m[-1])         CANNY_PUSH_SERIAL(m - 1);
    if (!m[1])          CANNY_PUSH_SERIAL(m + 1);
    if (!m[-mapstep-1]) CANNY_PUSH_SERIAL(m - mapstep - 1);
    if (!m[-mapstep])   CANNY_PUSH_SERIAL(m - mapstep);
    if (!m[-mapstep+1]) CANNY_PUSH_SERIAL(m - mapstep + 1);
    if (!m[mapstep-1])  CANNY_PUSH_SERIAL(m + mapstep - 1);
    if (!m[mapstep])    CANNY_PUSH_SERIAL(m + mapstep);
    if (!m[mapstep+1])  CANNY_PUSH_SERIAL(m + mapstep + 1);
}

I figured out that when it happens, two different threads are trying to access the object pointed to by m, and m has the same value in both threads. My guess is that this happens because the borderPeaks queue is defined as a static global object:

#ifdef HAVE_TBB

// Queue with peaks that will processed serially.
static tbb::concurrent_queue<uchar*> borderPeaks;

So it appears that I can't run the TBB version of cv::Canny in multiple parallel threads. Compiling without TBB support fixes the issue, and so does compiling with the TBB-dependent section of cv::Canny commented out. In both cases, however, cv::Canny becomes slower.

Is this a bug, and is there a workaround that preserves performance?

PS. I forgot to say that I get this issue only when running the Release build. The Debug build works fine.
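To illustrate the suspected cause, here is a minimal sketch of how a single file-scope queue ends up shared by every caller. It is not OpenCV code: LockedQueue is a hypothetical mutex-guarded stand-in for tbb::concurrent_queue, and pushPeaks, drainAndCountForeign and replayInterleaving are names invented for the example. The interleaving of two calls is replayed sequentially so that the outcome is reproducible; with real threads the same mixing would presumably happen at unpredictable moments.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical stand-in for tbb::concurrent_queue: a mutex-guarded std::queue.
template <typename T>
class LockedQueue {
public:
    void push(const T& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(value);
    }

    // Returns false when the queue is empty, otherwise pops the front into `out`.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<T> items_;
};

// One queue for the whole translation unit, like the static borderPeaks above.
static LockedQueue<int> sharedPeaks;

// Phase 1 of a hypothetical call: queue `count` items tagged with the caller's id.
void pushPeaks(int callerId, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) sharedPeaks.push(callerId);
}

// Phase 2: drain the queue and count the items that belong to another caller.
std::size_t drainAndCountForeign(int callerId) {
    std::size_t foreign = 0;
    int tag = 0;
    while (sharedPeaks.try_pop(tag)) {
        if (tag != callerId) ++foreign;
    }
    return foreign;
}

// Replays one problematic interleaving: call 2 queues its items before call 1 drains.
std::size_t replayInterleaving() {
    pushPeaks(1, 4);
    pushPeaks(2, 4);
    return drainAndCountForeign(1);  // call 1 also consumes the items of call 2
}
```

Each individual push and pop is thread-safe here, so the queue itself is never corrupted. The problem in this sketch is that the items of one call are consumed by another call, which matches the symptom of the same m showing up in two threads.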



I'm not sure I understand, but two threads accessing m without a mutex is not very usual.

Have you tried the same thing after cloning your data, just to test whether it solves your problem?

LBerger ( 2015-09-04 12:48:13 -0500 )

Just to be clear: the code in the post is part of the cv::Canny source code from "canny.cpp" (OpenCV 3.0.0). I've found a workaround that suits my needs. I've changed the cv::Canny declaration to accept a custom parameter:

CV_EXPORTS_W void Canny( InputArray image, OutputArray edges,
                         double threshold1, double threshold2,
                         int apertureSize = 3, bool L2gradient = false, void* bp = NULL);

And in my worker object (one for each parallel thread) I create an instance of

tbb::concurrent_queue<uchar*> borderPeaks

and pass it (through the new bp parameter) in the cv::Canny() call. This way the data pushed into the borderPeaks queue inside cv::Canny is not mixed between threads.
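The idea behind this workaround can be sketched as follows. This is not the actual patch to canny.cpp: LockedQueue is again a hypothetical mutex-guarded stand-in for tbb::concurrent_queue, and processWithOwnQueue and runWorkers are illustrative names. The only point is that each worker owns its queue and hands a pointer to it to the processing routine, so no item can reach another worker.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for tbb::concurrent_queue: a mutex-guarded std::queue.
template <typename T>
class LockedQueue {
public:
    void push(const T& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(value);
    }

    // Returns false when the queue is empty, otherwise pops the front into `out`.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<T> items_;
};

// Hypothetical processing routine: the queue is supplied by the caller
// instead of being a file-scope static. Returns the number of foreign items seen.
std::size_t processWithOwnQueue(int workerId, std::size_t count, LockedQueue<int>* peaks) {
    for (std::size_t i = 0; i < count; ++i) peaks->push(workerId);

    std::size_t foreign = 0;
    int tag = 0;
    while (peaks->try_pop(tag)) {
        if (tag != workerId) ++foreign;
    }
    return foreign;
}

// Runs `workers` threads, each with its own queue, and returns the total
// number of foreign items observed across all of them.
std::size_t runWorkers(std::size_t workers, std::size_t count) {
    std::vector<LockedQueue<int>> queues(workers);  // one queue per worker
    std::vector<std::size_t> foreign(workers, 0);
    std::vector<std::thread> threads;

    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&, w] {
            foreign[w] = processWithOwnQueue(static_cast<int>(w), count, &queues[w]);
        });
    }
    for (std::thread& t : threads) t.join();

    std::size_t total = 0;
    for (std::size_t f : foreign) total += f;
    return total;
}
```

Calling runWorkers(3, 1000) mirrors the three-stream setup described in the question. Since every queue is private to its worker, the returned total stays at zero regardless of how the threads are scheduled.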

MFmaniak ( 2015-09-07 00:40:03 -0500 )

I'm still wondering whether this is a bug in OpenCV. Can someone clarify this for me? Thanks in advance.

MFmaniak ( 2015-09-07 00:41:14 -0500 )

There is this post and this one (but since you have three separate Mats, that is not your case).

I have tested Canny with VS2013 (with and without OpenCL) in release mode, and there is no problem with one thread.

Now, if you think something is wrong in the code, you can post an issue (preferably with a small sample).

LBerger ( 2015-09-07 01:22:51 -0500 )

Thanks for the quick reply. I'll post this as an issue. The problem isn't the thread safety of cv::Mat itself. The data processed in my threads doesn't overlap at all: I process 3 different video streams.

MFmaniak ( 2015-09-07 02:07:53 -0500 )

Issue fixed in OpenCV 3.2.0

MFmaniak ( 2017-01-10 00:25:31 -0500 )