
How can I speed up filter2D?

asked 2013-03-29 17:00:51 -0500

rgadde

Hi,

My algorithm calls filter2D around 256 times, so the overall processing time is high. I have already tried the usual ways to minimize the number of convolutions (scaling the image, etc.). Is there an alternative function, or some other way to reduce the time spent in filter2D (without using a GPU)?


Comments

just adding a comment here, to get notified of the outcome ;)

(I'm only using 8 Gabor filters here, but the time they take is still considerable.)

hmm, sidenote: don't you want to upgrade your OpenCV 2.1?

berak ( 2013-03-29 17:32:06 -0500 )
@rgadde Are you applying the filters to a single source image? In the case of multiple source images you can try to parallelize the work, with TBB for example.

Daniil Osokin ( 2013-03-30 02:15:00 -0500 )

@berak I am using 16 Gabor filters. I wish I could use the latest version of OpenCV. The problem is that I am not able to cross-compile OpenCV 2.4 for ARM6, although I tried my best. I already have the OpenCV 2.1 libraries, so I downgraded my code too :(.

rgadde ( 2013-04-01 07:40:24 -0500 )

@Daniil Osokin Thanks for the comment. I don't think I can use TBB in my case: I am targeting an ARM11 processor, which has a single core. Please correct me if I am wrong.

rgadde ( 2013-04-01 08:00:59 -0500 )
Please share your results if you use TBB together with the OpenCV functions. I think there can be problems when they are mixed, because the OpenCV core already uses TBB.

But the best approach is to separate the kernel matrix into its column and row components.

wuling ( 2013-04-01 12:31:19 -0500 )

2 answers


answered 2013-04-02 01:15:57 -0500

sammy

updated 2013-04-02 10:22:53 -0500

There are a number of tricks to speed up filtering. However, none of them is an off-the-shelf solution.

  • Separable filters. Some 2D filters have the mathematical property that there are two one-dimensional filters that, applied consecutively, have the same effect on the image as the original 2D one. For example, a Gaussian blur with a window size of 9x9 (81 elements, so ~81 multiplications per pixel) can be reduced to a vertical 9x1 filter followed by a horizontal 1x9 filter, which means only 18 multiplications per pixel. This applies to many symmetric filters. However, it does not apply to more complex ones, like rotated Gabor kernels.

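The separability claim is easy to check numerically. A minimal NumPy sketch (plain Python loops for clarity, not speed): one 9x9 pass and the 9x1-then-1x9 pair produce the same image.

```python
import numpy as np

def convolve2d(img, kernel):
    """Reference 2D correlation with zero padding (what filter2D computes)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out

# A 9x9 Gaussian kernel is the outer product of two 1D Gaussians, so one
# 81-multiply pass equals a 9x1 pass plus a 1x9 pass (18 multiplies per pixel).
g = np.exp(-0.5 * ((np.arange(9) - 4) / 2.0) ** 2)
g /= g.sum()
img = np.random.default_rng(0).random((32, 32))

full = convolve2d(img, np.outer(g, g))                               # one 9x9 pass
sep = convolve2d(convolve2d(img, g.reshape(9, 1)), g.reshape(1, 9))  # two 1D passes
assert np.allclose(full, sep)
```

In OpenCV this is what sepFilter2D does for you, given the two 1D kernels.
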
  • Convolve the filters. If you apply multiple filters to the same image consecutively, such as a Gaussian blur followed by a Gabor filter, you can combine them: make the filters the same size, convolve them with each other, then apply the result to the image once. The math guarantees the effect is identical to applying them in sequence.

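As a sanity check on combining kernels, here is a small NumPy sketch (the blur and Gabor kernels are stand-in values, and it uses circular FFT-based convolution for border handling, unlike filter2D's border replication, so the two results match exactly everywhere):

```python
import numpy as np

def conv_full(a, b):
    """Full linear 2D convolution of two small kernels."""
    out = np.zeros((a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1))
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            out[i:i + b.shape[0], j:j + b.shape[1]] += a[i, j] * b
    return out

def conv2d_fft(img, kernel):
    """Same-size circular convolution via FFT, kernel centred at (kh//2, kw//2)."""
    kh, kw = kernel.shape
    emb = np.zeros_like(img)
    emb[:kh, :kw] = kernel
    emb = np.roll(emb, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(emb)))

rng = np.random.default_rng(1)
img = rng.random((32, 32))
blur = np.ones((5, 5)) / 25.0        # stand-in for a Gaussian blur kernel
gabor = rng.standard_normal((7, 7))  # stand-in for a Gabor kernel

two_pass = conv2d_fft(conv2d_fft(img, blur), gabor)  # two passes over the image
one_pass = conv2d_fft(img, conv_full(blur, gabor))   # one pass, 11x11 combined kernel
assert np.allclose(two_pass, one_pass)
```

Note the combined kernel grows to 5+7-1 = 11 per side, so this pays off when the filters are applied to many images.
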
  • Fast Fourier transform. If your filters are really big (e.g. 30x30 or larger) you can apply an FFT to the image and the kernel, then use the convolution theorem: convolution in the spatial domain becomes pointwise multiplication in the frequency domain. Check a signal-processing textbook for the details.

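A small NumPy sketch of the idea (circular boundary handling for simplicity): direct convolution costs O(N²K²) multiplications, the FFT route costs O(N² log N) regardless of kernel size, and both give the same result.

```python
import numpy as np

def circ_conv_direct(img, kernel):
    """Direct circular convolution: O(N^2 * K^2) multiplications."""
    n, m = img.shape
    kh, kw = kernel.shape
    ch, cw = kh // 2, kw // 2
    out = np.zeros_like(img)
    for y in range(n):
        for x in range(m):
            for i in range(kh):
                for j in range(kw):
                    out[y, x] += kernel[i, j] * img[(y - i + ch) % n, (x - j + cw) % m]
    return out

def circ_conv_fft(img, kernel):
    """Same result in O(N^2 log N): convolution becomes pointwise multiplication."""
    kh, kw = kernel.shape
    emb = np.zeros_like(img)
    emb[:kh, :kw] = kernel
    emb = np.roll(emb, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(emb)))

rng = np.random.default_rng(2)
img = rng.random((16, 16))
kernel = rng.random((9, 9))
assert np.allclose(circ_conv_direct(img, kernel), circ_conv_fft(img, kernel))
```
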
  • SIMD processing. Most OpenCV filters are SIMD-accelerated for x86 architectures. For now, there is not much NEON-enabled code (NEON is the SIMD technology for ARM); however, I had very good results (with filter2D) by translating SSE instructions into NEON. Read the bottom part of this post for more details: http://answers.opencv.org/question/755/object-detection-slow/#760

  • Reduce the number of filter2D calls. You said you already did, but never say never. You probably use the filters to extract features; you may be able to decide early whether a part of the image contains the needed features and, if not, skip filtering it. Or, if you apply a bank of filters at different angles, 256 steps is too many: 8 or 16 orientations are usually more than enough.

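One hypothetical way to skip work: gate the expensive filter bank on a cheap per-tile variance test, so flat regions never see a Gabor kernel at all (the tile size and threshold below are made-up illustration values, not tuned for any real data).

```python
import numpy as np

def active_tiles(img, tile=16, min_var=1e-3):
    """Offsets of tiles with enough variance to be worth filtering."""
    coords = []
    for y in range(0, img.shape[0] - tile + 1, tile):
        for x in range(0, img.shape[1] - tile + 1, tile):
            if img[y:y + tile, x:x + tile].var() > min_var:
                coords.append((y, x))
    return coords

img = np.zeros((64, 64))
img[16:32, 16:32] = np.random.default_rng(3).random((16, 16))  # one textured tile
tiles = active_tiles(img)
assert tiles == [(16, 16)]  # only 1 of the 16 tiles needs the filter bank
```
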
  • Find another algorithm. The Haar-based detector by Viola & Jones is a great, quite accurate tool in image processing. The same is true for HOG (Histogram of Oriented Gradients). However, both have been overtaken in popularity by LBP (Local Binary Patterns), an algorithm that is somewhat less accurate but much faster: HOG and Haar use floating-point calculations, while LBP only does bit twiddling. You may find a different way to solve your problem, perhaps with the same accuracy, but without those costly calls. The best approach here is to read a lot: learn what others have done in your field, and even in other computer vision areas, and get ideas.

  • Post a description of your algorithm here, together with images, graphs, etc. We may have specific ideas or advice about your work. You are probably not the first one to have this problem.


Comments

@sammy Thanks a lot, that was really useful. To describe my problem: I am using 16 Gabor filters (different orientations) multiple times to detect edges in an image and also to suppress noise (link). I don't think I can use either separable filters or combined filters, since I am using Gabor filters. But I found a discussion on the MathWorks forum which I thought might be useful. Also, I am on an ARMv6 processor, which doesn't support NEON optimization. I think I need to find an alternative algorithm to reduce the processing time.

rgadde ( 2013-04-02 09:19:40 -0500 )

I will look into the paper a bit later. In the meantime, I realized I had forgotten the FFT and added it.

sammy ( 2013-04-02 10:19:34 -0500 )
answered 2013-04-01 12:25:21 -0500

Seb

There's not much more you can do, then. Convolution is inherently time-consuming.


Stats

Asked: 2013-03-29 17:00:51 -0500

Seen: 4,578 times

Last updated: Apr 02 '13