2012-10-30 18:36:54 -0600 | commented answer | test NEON-optimized cv::threshold() on mobile device That is, I happen to have a program running on an ARM board that uses threshold (at least for preprocessing), and I'm currently trying to optimize it with NEON. I came to this thread while searching the internet for "arm neon opencv". Regarding gaining "only x2": on http://hilbert-space.de/?p=22 I read that using assembler instead of intrinsics might bring another performance boost, since the compiler didn't optimize the register usage very well. I haven't looked at the assembler output yet (I will probably do that in the next few days), but maybe it's a similar case here. However, I have very little knowledge of assembler (neither ARM/NEON nor the PC world), so that might not give much insight ;-) |
2012-10-30 18:07:42 -0600 | commented answer | test NEON-optimized cv::threshold() on mobile device Normally the program processes images from a webcam. For this test (and other tests of my own) I fed a video file with a resolution of 800x600 into it. (The file was written with the OpenCV video writer as MJPEG.) To limit the actual processing to "interesting" regions, the first step is a square detector, loosely based on squares.cpp from the samples, but with adaptiveThreshold instead of Canny (to work with differing light conditions). That is, the steps are: pyrDown, pyrUp, adaptiveThreshold(gray0, gray, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, kernel, athresh); dilate(gray, gray, Mat(), Point(-1, -1)); findContours(gray, contours... |
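To show why a mean-based adaptive threshold copes with differing light conditions, here is a minimal sketch in plain C++. It is not OpenCV's implementation: the function name `adaptive_mean_threshold`, the flat row-major image layout and the clamped border handling are assumptions made for illustration. Each pixel is compared against the mean of its own neighbourhood minus a constant, so a uniform brightness shift leaves the result unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of OpenCV.
// src is a rows x cols 8-bit image stored row by row.
// A pixel becomes maxval if it is brighter than (local mean - c), else 0.
// Pixels outside the image are replaced by the nearest border pixel.
std::vector<std::uint8_t> adaptive_mean_threshold(
    const std::vector<std::uint8_t>& src, int rows, int cols,
    int block, double c, std::uint8_t maxval)
{
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    assert(block >= 3 && block % 2 == 1);  // odd window size

    const int r = block / 2;
    std::vector<std::uint8_t> dst(src.size(), 0);

    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            long sum = 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    // Clamp coordinates to the image (replicated border).
                    const int yy = std::min(std::max(y + dy, 0), rows - 1);
                    const int xx = std::min(std::max(x + dx, 0), cols - 1);
                    sum += src[static_cast<std::size_t>(yy) * cols + xx];
                }
            }
            const double mean = static_cast<double>(sum) / (block * block);
            const std::size_t i = static_cast<std::size_t>(y) * cols + x;
            if (src[i] > mean - c) {
                dst[i] = maxval;
            }
        }
    }
    return dst;
}
```

For a 3x3 image with one bright centre pixel, the centre is the only pixel set in the output, and adding the same offset to every input pixel gives the same mask. A fixed global threshold would not have that property.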
2012-10-30 03:47:05 -0600 | received badge | ● Teacher |
2012-10-30 01:15:57 -0600 | received badge | ● Necromancer |
2012-10-29 16:24:23 -0600 | answered a question | test NEON-optimized cv::threshold() on mobile device I tried the patch on a BeagleBoard running Debian testing hard-float (armhf), based on OpenCV git commit 5777598. At first I got some errors about mixing signed and unsigned data. I then replaced "vget_low_s8" in those lines with "vget_low_u8", and then it compiled. I then tested with a program that uses threshold for some of its work (the main work is in other functions) and ran oprofile on it: "opreport -l -g -D smart ../build/src/imgproc|grep -i thresh" without the patch: ... |
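The errors above were compile-time type mismatches, but the signed/unsigned distinction also matters for the result, because 8-bit image pixels are unsigned values in 0..255. The following is a minimal scalar sketch in plain C++ (no NEON, and not the code from the patch; both function names are hypothetical) that shows how the same comparison gives a different answer when the bytes are treated as signed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference for a binary threshold on 8-bit pixels:
// dst = maxval if src > thresh, else 0. Pixels are compared as unsigned.
std::vector<std::uint8_t> threshold_binary_u8(
    const std::vector<std::uint8_t>& src,
    std::uint8_t thresh, std::uint8_t maxval)
{
    std::vector<std::uint8_t> dst(src.size(), 0);
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] > thresh) {
            dst[i] = maxval;
        }
    }
    return dst;
}

// Deliberately wrong variant for comparison: the same bytes are compared
// as signed 8-bit values, so a pixel of 200 is seen as -56.
// (The wrap-around of the cast is guaranteed from C++20 on and is what
// common compilers do for earlier standards as well.)
std::vector<std::uint8_t> threshold_binary_as_s8(
    const std::vector<std::uint8_t>& src,
    std::uint8_t thresh, std::uint8_t maxval)
{
    std::vector<std::uint8_t> dst(src.size(), 0);
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (static_cast<std::int8_t>(src[i]) > static_cast<std::int8_t>(thresh)) {
            dst[i] = maxval;
        }
    }
    return dst;
}
```

With pixels {50, 100, 200} and a threshold of 100, the unsigned version marks only the 200 pixel, while the signed version marks nothing. So in a NEON version the lane type chosen for the intrinsics (for example `vget_low_u8` rather than `vget_low_s8`) should match the unsigned pixel data.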