OpenCV 3.0, cv::Canny (TBB version) fails while running in parallel threads

Hello, everyone. I have a project built with Qt 5.5, TBB 4.4, and OpenCV 3.0.0 (compiled with IPP and TBB), using MSVS 2012 Update 5. I run 3 identical parallel threads (the issue also occurs with 2). Each of them processes a 100 Hz video stream (1280x1024, 8-bit) coming from my device. During frame preprocessing I call cv::Canny(src, dst, 50, 200, 3, true) in each of the 3 parallel processing loops. An exception is thrown in canny.cpp in the following section of code:

#define CANNY_PUSH_SERIAL(d)    *(d) = uchar(2), borderPeaks.push(d)

// now track the edges (hysteresis thresholding)
uchar* m;
while (borderPeaks.try_pop(m))
{
    if (!m[-1])         CANNY_PUSH_SERIAL(m - 1);
    if (!m[1])          CANNY_PUSH_SERIAL(m + 1);
    if (!m[-mapstep-1]) CANNY_PUSH_SERIAL(m - mapstep - 1);
    if (!m[-mapstep])   CANNY_PUSH_SERIAL(m - mapstep);
    if (!m[-mapstep+1]) CANNY_PUSH_SERIAL(m - mapstep + 1);
    if (!m[mapstep-1])  CANNY_PUSH_SERIAL(m + mapstep - 1);
    if (!m[mapstep])    CANNY_PUSH_SERIAL(m + mapstep);
    if (!m[mapstep+1])  CANNY_PUSH_SERIAL(m + mapstep + 1);
}

I figured out that when this happens, 2 different threads are trying to access the object referenced by m, and m has the same value in both threads. My guess is that this happens because the borderPeaks queue is defined as a static global object:

#ifdef HAVE_TBB

// Queue with peaks that will processed serially.
static tbb::concurrent_queue<uchar*> borderPeaks;
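
To illustrate the suspicion, here is a minimal sketch using only the standard library. It is not the OpenCV code: sharedPeaks is a plain std::queue standing in for the static tbb::concurrent_queue, and mapA, mapB and all function names are hypothetical. It simulates, in a deterministic single-threaded way, two callers pushing into the same file-scope queue before one of them drains it, and compares that with each caller owning its queue.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

typedef unsigned char uchar;

// Stand-in for the file-scope queue (assumption: std::queue instead of
// tbb::concurrent_queue, because the point here is sharing, not locking).
static std::queue<uchar*> sharedPeaks;

// True if p points inside the given edge map.
static bool belongsTo(const std::vector<uchar>& map, const uchar* p)
{
    std::less<const uchar*> before;  // well-defined order for unrelated pointers
    const uchar* first = map.data();
    const uchar* last = first + map.size();
    return !before(p, first) && before(p, last);
}

// Drains the queue and counts the pointers that do not belong to `own`.
static std::size_t drainAndCountForeign(std::queue<uchar*>& q,
                                        const std::vector<uchar>& own)
{
    std::size_t foreign = 0;
    while (!q.empty())
    {
        if (!belongsTo(own, q.front()))
            ++foreign;
        q.pop();
    }
    return foreign;
}

// Caller B pushes into the shared queue while caller A is still filling it;
// caller A then drains everything, including B's pointer.
static std::size_t foreignWithSharedQueue()
{
    std::vector<uchar> mapA(16, 0), mapB(16, 0);
    sharedPeaks.push(&mapA[5]);
    sharedPeaks.push(&mapB[7]);  // interleaved push from the other caller
    sharedPeaks.push(&mapA[9]);
    return drainAndCountForeign(sharedPeaks, mapA);
}

// Same pushes, but each caller owns its queue, so nothing gets mixed.
static std::size_t foreignWithLocalQueues()
{
    std::vector<uchar> mapA(16, 0), mapB(16, 0);
    std::queue<uchar*> peaksA, peaksB;
    peaksA.push(&mapA[5]);
    peaksB.push(&mapB[7]);
    peaksA.push(&mapA[9]);
    return drainAndCountForeign(peaksA, mapA) +
           drainAndCountForeign(peaksB, mapB);
}
```

With the shared queue, caller A ends up holding one pointer into caller B's map; with per-caller queues it holds none. In the real code the consequence would be worse than a count, because the hysteresis loop writes through the popped pointer and reads its neighbours using its own mapstep.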

So it appears that I can't run the TBB version of cv::Canny in multiple parallel threads. Compiling OpenCV without TBB support fixes the issue, and so does compiling with the TBB-dependent section of cv::Canny commented out. But in both cases the performance of cv::Canny decreases.

Is this a bug, and is there a workaround that preserves performance?
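
One workaround I am considering, sketched below under assumptions: serialize the calls to cv::Canny with a single mutex, so that only one thread is inside the function at a time. I have not verified this against OpenCV itself. The names cannyMutex, callSerialized and maxConcurrentEntries are hypothetical, and the sketch uses std::mutex, although QMutex would serve the same purpose in a Qt project. The helper maxConcurrentEntries only exists to check that the guard really admits one caller at a time.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// One process-wide mutex guarding every call to the non-reentrant function.
static std::mutex cannyMutex;

// Runs fn while holding the mutex; the lock is released when fn returns
// or throws.
template <typename Fn>
void callSerialized(Fn fn)
{
    std::lock_guard<std::mutex> lock(cannyMutex);
    fn();
}

// Starts threadCount workers that each enter the guarded section
// `iterations` times, and returns the largest number of workers that were
// ever inside it simultaneously.
static int maxConcurrentEntries(int threadCount, int iterations)
{
    std::atomic<int> inside(0);
    int maxSeen = 0;  // only written while cannyMutex is held
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t)
    {
        workers.emplace_back([&inside, &maxSeen, iterations]() {
            for (int i = 0; i < iterations; ++i)
            {
                callSerialized([&inside, &maxSeen]() {
                    int now = ++inside;
                    if (now > maxSeen)
                        maxSeen = now;
                    --inside;
                });
            }
        });
    }
    for (std::thread& w : workers)
        w.join();
    return maxSeen;
}
```

In a processing loop the call would look like callSerialized([&]() { cv::Canny(src, dst, 50, 200, 3, true); });. The three threads would then take turns in cv::Canny, while the rest of each loop still runs in parallel. Whether this costs less than building without TBB would need to be measured.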

PS. I forgot to mention that I get this issue only when running the Release build. The Debug build works fine.