trie's profile - activity


2012-10-30 18:36:54 -0600	commented answer	test NEON-optimized cv::threshold() on mobile device That is, I happen to have a program running on an ARM board (which I am currently trying to optimize with NEON) that uses threshold, at least for preprocessing. I came to this thread while searching the internet for "arm neon opencv". Regarding gaining "only x2": on http://hilbert-space.de/?p=22 I read that using assembler instead of intrinsics might bring another performance boost, since the compiler did not optimize the register usage very well. I haven't looked at the assembler output yet (I will probably do that in the next few days), but maybe it's a similar case here. However, I have very little knowledge of assembler (neither ARM/NEON nor the PC world), so that might not give much insight ;-)
2012-10-30 18:07:42 -0600	commented answer	test NEON-optimized cv::threshold() on mobile device Normally the program processes images from a webcam. For this test (and other tests of my own) I fed it a video file with a resolution of 800x600. (The file was written with the OpenCV video writer as MJPEG.) To limit the actual processing to "interesting" regions, the first step is a square detector, loosely based on squares.cpp from the samples, but with adaptiveThreshold instead of Canny (to cope with differing light conditions). That is, the steps are: pyrDown; pyrUp; adaptiveThreshold(gray0, gray, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, kernel, athresh); dilate(gray, gray, Mat(), Point(-1, -1)); findContours(gray, contours...
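To make the adaptiveThreshold step in the comment above more concrete, here is a minimal plain C++ sketch of a mean-based adaptive threshold. It is not OpenCV's implementation and not the code from the program described: `adaptive_mean_threshold` and its parameters are hypothetical names, and clamping coordinates at the border is an assumption chosen for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not OpenCV code): mean-based adaptive threshold on a
// row-major 8-bit grayscale image. A pixel becomes maxval when it is brighter
// than the mean of its block x block neighbourhood minus c, otherwise 0.
// Border pixels are handled by clamping coordinates to the image (assumption).
std::vector<uint8_t> adaptive_mean_threshold(const std::vector<uint8_t>& src,
                                             int width, int height,
                                             int block, double c,
                                             uint8_t maxval = 255)
{
    assert(block > 0 && block % 2 == 1);  // neighbourhood needs a centre pixel
    assert(src.size() == static_cast<size_t>(width) * height);

    std::vector<uint8_t> dst(src.size(), 0);
    const int r = block / 2;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            long sum = 0;
            for (int dy = -r; dy <= r; ++dy) {
                const int yy = std::min(std::max(y + dy, 0), height - 1);
                for (int dx = -r; dx <= r; ++dx) {
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    sum += src[yy * width + xx];
                }
            }
            const double mean = static_cast<double>(sum) / (block * block);
            dst[y * width + x] = (src[y * width + x] > mean - c) ? maxval : 0;
        }
    }
    return dst;
}
```

Because the threshold follows the local mean, a uniformly darker or brighter frame yields the same binary result, which is the property that makes this kind of threshold attractive under changing light. The naive triple loop is only for illustration; a real implementation would use a running box filter.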
2012-10-30 03:47:05 -0600	received badge	● Teacher
2012-10-30 01:15:57 -0600	received badge	● Necromancer
2012-10-29 16:24:23 -0600	answered a question	test NEON-optimized cv::threshold() on mobile device I have tried the patch on a BeagleBoard running Debian testing hardfloat (armhf), based on OpenCV git commit 5777598. At first I got compile errors caused by mixing signed and unsigned vector types:

/root/src/opencv/modules/imgproc/src/thresh.cpp: In function ‘void cv::thresh_8u(const cv::Mat&, cv::Mat&, uchar, uchar, int)’:
/root/src/opencv/modules/imgproc/src/thresh.cpp:269:62: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
/root/src/opencv/modules/imgproc/src/thresh.cpp:269:62: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:270:61: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:294:62: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:295:61: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:317:62: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:339:69: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:339:108: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:361:69: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
/root/src/opencv/modules/imgproc/src/thresh.cpp:361:108: error: cannot convert ‘uint8x16_t {aka __vector(16) __builtin_neon_uqi}’ to ‘int8x16_t {aka __vector(16) __builtin_neon_qi}’ for argument ‘1’ to ‘int8x8_t vget_low_s8(int8x16_t)’
make[2]: *** [modules/imgproc/CMakeFiles/opencv_imgproc.dir/src/thresh.cpp.o] Fehler 1
make[2]: Leaving directory `/root/src/opencv/build'
make[1]: *** [modules/imgproc/CMakeFiles/opencv_imgproc.dir/all] Fehler 2
make[1]: Leaving directory `/root/src/opencv/build'
make: *** [all] Fehler 2

I replaced "vget_low_s8" with "vget_low_u8" in those lines, after which it compiled.

I then tested with a program that uses threshold for part of its work (the main work is done in other functions) and ran oprofile on it:

"opreport -l -g -D smart ../build/src/imgproc | grep -i thresh"

Without the patch:

`1054 3.5127 thresh.cpp:794 imgproc cv::adaptiveThreshold(cv::_InputArray const&, cv::_OutputArray const&, double, int, int, int, double)`
`456 1.5197 thresh.cpp:677 imgproc cv::ThresholdRunner::operator()(cv::Range const&) const`
`3 0.0100 thresh.cpp:712 imgproc cv::threshold(cv::_InputArray const&, cv::_OutputArray const&, double ...`
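The signed/unsigned mismatch reported in the answer above can be illustrated with a small sketch. This is not the patch under test: `threshold_binary_u8` is a hypothetical helper that mimics a THRESH_BINARY-style operation, and splitting the result into two 8-lane halves is done only to show the `vget_low_u8`/`vget_high_u8` call on a `uint8x16_t`. On non-ARM builds the NEON branch is skipped and only the scalar loop runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper: dst[i] = src[i] > thresh ? maxval : 0
void threshold_binary_u8(const uint8_t* src, uint8_t* dst, size_t n,
                         uint8_t thresh, uint8_t maxval)
{
    assert(n == 0 || (src != nullptr && dst != nullptr));
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const uint8x16_t vthresh = vdupq_n_u8(thresh);
    const uint8x16_t vmax = vdupq_n_u8(maxval);
    for (; i + 16 <= n; i += 16) {
        const uint8x16_t v = vld1q_u8(src + i);
        const uint8x16_t mask = vcgtq_u8(v, vthresh);  // 0xFF where v > thresh
        const uint8x16_t r = vandq_u8(mask, vmax);     // maxval or 0 per lane
        // r is unsigned, so the unsigned variants are required here.
        // vget_low_s8(r) fails to compile unless -flax-vector-conversions
        // is passed, which is the error shown in the log.
        const uint8x8_t lo = vget_low_u8(r);
        const uint8x8_t hi = vget_high_u8(r);
        vst1_u8(dst + i, lo);
        vst1_u8(dst + i + 8, hi);
    }
#endif
    // Scalar tail, and the complete fallback when NEON is unavailable.
    for (; i < n; ++i)
        dst[i] = src[i] > thresh ? maxval : 0;
}
```

Switching to the `_u8` variants keeps the types consistent without relaxing the compiler's vector conversion checks; an explicit `vreinterpretq_s8_u8` would be the alternative if signed lanes were actually wanted.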