Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

Volunteer for fixing bug 1337 (rotate, remap crash when widthStep exceeds 15 bits)

Dear all,

I have been using OpenCV for general image processing (rotate, resize, etc.) at my workplace for 4 years. I am good at C++ and x86 SIMD programming (SSE2 and up).

Some of the images handled at my workplace are very large. As we try to rotate them, it is sometimes affected by Bug 1337, where remap will cause an invalid memory access because its widthStep exceeds the 15-bit integer (signed short int) limit.

http://code.opencv.org/issues/1337

I'm able to trace down to the cause of the error, and I have several approaches (resolutions) in mind. In fact, I have coded one of these resolutions for use at my workplace.

I would like to get contact with one of the maintainers of OpenCV, in the hope of either incorporating the fix, or to discuss and implement a better fix.

(Because of non-disclosure and copyright issues, I cannot directly contribute the code to the OpenCV code base. However, I can contribute my knowledge and free-time to help implement a fix by taking a similar approach.)

click to hide/show revision 2
added link to gist, (hoping won't be discovered by spambots.)

Volunteer for fixing bug 1337 (rotate, remap crash when widthStep exceeds 15 bits)

Dear all,

I have been using OpenCV for general image processing (rotate, resize, etc.) at my workplace for 4 years. I am good at C++ and x86 SIMD programming (SSE2 and up).

Some of the images handled at my workplace are very large. As we try to rotate them, it is sometimes affected by Bug 1337, where remap will cause an invalid memory access because its widthStep exceeds the 15-bit integer (signed short int) limit.

http://code.opencv.org/issues/1337

I'm able to trace down to the cause of the error, and I have several approaches (resolutions) in mind. In fact, I have coded one of these resolutions for use at my workplace.

I would like to get contact with one of the maintainers of OpenCV, in the hope of either incorporating the fix, or to discuss and implement a better fix.

(Because of non-disclosure and copyright issues, I legal issues (sign-off), the code cannot be used directly contribute the code to on the OpenCV code base. However, I can contribute my knowledge and free-time to help implement a fix by taking a similar approach.)approach. A sample of the code can be found on my Gist. )

Volunteer for fixing bug 1337 (rotate, remap crash when widthStep exceeds 15 bits)

Dear all,

I have been using OpenCV for general image processing (rotate, resize, etc.) at my workplace for 4 years. I am good at C++ and x86 SIMD programming (SSE2 and up).

Some of the images handled at my workplace are very large. As we try to rotate them, it is sometimes affected by Bug 1337, where remap will cause an invalid memory access because its widthStep exceeds the 15-bit integer (signed short int) limit.

http://code.opencv.org/issues/1337

I'm able to trace down to the cause of the error, and I have several approaches (resolutions) in mind. In fact, I have coded one of these resolutions for use at my workplace.

I would like to get contact with one of the maintainers of OpenCV, in the hope of either incorporating the fix, or to discuss and implement a better fix.

(Because of legal issues (sign-off), the code cannot be used directly on the OpenCV code base. However, I can contribute my knowledge and free-time to help implement a fix by taking a similar approach. A sample of the code can be found on my Gist. )


Solution 1 - scale down image before rotation, then scale up after.

  • Advantage
    • Easiest to implement
  • Disadvantage
    • Ugliest
    • Not even considered a fix - A future potential fix could generate a sharp image, which then won't be pixel-wise compatible (comparable) to the output from this ugly hack.

Solution 2 - Create tile crops from the input image, then compute the corresponding output pixels and stitch them together.

  • Technical clarification
    • The title is a misnomer. Actually, the output image dimensions is tiled, the corresponding input area (bounding rectangle) is computed and cropped, and the output pixels in each output tile is computed from each cropped input tile.
  • Advantage
    • Output is pixelwise correct, up to the last bit of rounding error.
    • Code is ready
      • See my Gist
      • Note: As stated on the file - My employer retains copyright (and other legal rights as well), therefore we must work together to create a public version that's legally clean. I'm here to (and have permissions) to offer help.
    • The code should be able to work with images over 65536 pixels, which isn't the case for solution #3 (SIMD-based minor fix, see below).
  • Disadvantage
    • Slow - a lot of data copying
    • Slow - did not make use of SIMD (SSE2)

Solution 3 - Implement a pseudo-instruction to make the SIMD code work on 16-bit unsigned integers, instead of 15-bit signed integer.

  • Technical clarification
    • The bug #1337 is caused by signed-extension of signed 16-bit values that occurs as part of the PMULLW instruction (_mm_mullo_epi16 intrinsic).
    • If the input values are treated as unsigned 16-bit, then the bug can be "stretched" to double its size. That is, it wouldn't happen unless the widthStep reaches 65536 bytes.
  • Advantage
    • Minimal impact on the speed of code (cv::remap).
  • Disadvantage
    • This doesn't fix the bug; it merely stretches (doubles) the range of images it could handle.

My current situation

I have coded up for solution 2, but at the same time I do not have time to make it as fast as the SIMD-optimized version in OpenCV. Therefore I am seeking help from the OpenCV community.

click to hide/show revision 4
Referred to the wrong SSE2 instruction. It might be PMULLW or PMADDWD.

Volunteer for fixing bug 1337 (rotate, remap crash when widthStep exceeds 15 bits)

Dear all,

I have been using OpenCV for general image processing (rotate, resize, etc.) at my workplace for 4 years. I am good at C++ and x86 SIMD programming (SSE2 and up).

Some of the images handled at my workplace are very large. As we try to rotate them, it is sometimes affected by Bug 1337, where remap will cause an invalid memory access because its widthStep exceeds the 15-bit integer (signed short int) limit.

http://code.opencv.org/issues/1337

I'm able to trace down to the cause of the error, and I have several approaches (resolutions) in mind. In fact, I have coded one of these resolutions for use at my workplace.

I would like to get contact with one of the maintainers of OpenCV, in the hope of either incorporating the fix, or to discuss and implement a better fix.

(Because of legal issues (sign-off), the code cannot be used directly on the OpenCV code base. However, I can contribute my knowledge and free-time to help implement a fix by taking a similar approach. A sample of the code can be found on my Gist. )


Solution 1 - scale down image before rotation, then scale up after.

  • Advantage
    • Easiest to implement
  • Disadvantage
    • Ugliest
    • Not even considered a fix - A future potential fix could generate a sharp image, which then won't be pixel-wise compatible (comparable) to the output from this ugly hack.

Solution 2 - Create tile crops from the input image, then compute the corresponding output pixels and stitch them together.

  • Technical clarification
    • The title is a misnomer. Actually, the output image dimensions is tiled, the corresponding input area (bounding rectangle) is computed and cropped, and the output pixels in each output tile is computed from each cropped input tile.
  • Advantage
    • Output is pixelwise correct, up to the last bit of rounding error.
    • Code is ready
      • See my Gist
      • Note: As stated on the file - My employer retains copyright (and other legal rights as well), therefore we must work together to create a public version that's legally clean. I'm here to (and have permissions) to offer help.
    • The code should be able to work with images over 65536 pixels, which isn't the case for solution #3 (SIMD-based minor fix, see below).
  • Disadvantage
    • Slow - a lot of data copying
    • Slow - did not make use of SIMD (SSE2)

Solution 3 - Implement a pseudo-instruction to make the SIMD code work on 16-bit unsigned integers, instead of 15-bit signed integer.

  • Technical clarification
    • The bug #1337 is caused by signed-extension of signed 16-bit values that occurs as part of the PMULLW instruction (or _mm_mullo_epi16PMADDWD intrinsic).instruction.
    • If the input values are treated as unsigned 16-bit, then the bug can be "stretched" to double its size. That is, it wouldn't happen unless the widthStep reaches 65536 bytes.
  • Advantage
    • Minimal impact on the speed of code (cv::remap).
  • Disadvantage
    • This doesn't fix the bug; it merely stretches (doubles) the range of images it could handle.

My current situation

I have coded up for solution 2, but at the same time I do not have time to make it as fast as the SIMD-optimized version in OpenCV. Therefore I am seeking help from the OpenCV community.