
How can I speed up filter2D?

asked 2013-03-29 17:00:51 -0600

rgadde


My algorithm calls filter2D around 256 times, so the overall processing time is high. I have already tried other ways to reduce the number of convolutions (scaling the image, etc.). Is there an alternative function, or some other way to reduce the time spent in filter2D (without using the GPU, etc.)?




just adding a comment here, to get notified of the outcome ;)

(only using 8 gabor filters here, but the time used for it still .. )

hmm, side note: don't you want to upgrade your OpenCV 2.1?

berak ( 2013-03-29 17:32:06 -0600 )

@rgadde Did you apply the filters to a single source image? With multiple source images you can try to parallelize the work, with TBB for example.

Daniil Osokin ( 2013-03-30 02:15:00 -0600 )

@berak I am using 16 Gabor filters. I wish I could use the latest version of OpenCV. The problem is that I am not able to cross-compile OpenCV 2.4 for ARMv6, though I tried my best. I already have the OpenCV 2.1 libraries, so I downgraded my code too :(

rgadde ( 2013-04-01 07:40:24 -0600 )

@Daniil Osokin Thanks for the comment. I think I can't use TBB in my case: I am trying to run my code on an ARM11 processor, which has a single core. Please correct me if I am wrong.

rgadde ( 2013-04-01 08:00:59 -0600 )

Please tell me the result when you use TBB together with OpenCV functions. I think there can be problems when the two are mixed, because the OpenCV core itself uses TBB.

But the best way is to separate the kernel matrix into columns and rows.

wuling ( 2013-04-01 12:31:19 -0600 )
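The row/column split suggested in the comment above can be sketched in a few lines. This is a minimal NumPy illustration, not OpenCV code: the helper names (`filter_direct`, `split_separable`, `filter_separable`), the zero-padded border handling and the Gabor parameters are all assumptions made for the example. It tests whether a kernel has rank 1 using an SVD and, if so, applies it as two 1D passes.

```python
import numpy as np

def filter_direct(image, kernel):
    """Reference 2D filter: correlation with zero padding, same-size output.
    Assumes odd kernel dimensions."""
    h, w = image.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def split_separable(kernel, tol=1e-10):
    """Return (col, row) with kernel == outer(col, row), or None if rank > 1."""
    u, s, vt = np.linalg.svd(kernel)
    if len(s) > 1 and s[1] > tol * s[0]:
        return None  # more than one significant singular value: not separable
    scale = np.sqrt(s[0])
    return u[:, 0] * scale, vt[0, :] * scale

def filter_separable(image, col, row):
    """Apply a rank-1 kernel as a vertical pass followed by a horizontal pass."""
    tmp = filter_direct(image, col[:, None])   # kh x 1 kernel
    return filter_direct(tmp, row[None, :])    # 1 x kw kernel

# 9x9 Gaussian kernel (separable by construction).
x = np.arange(-4, 5, dtype=np.float64)
g = np.exp(-x**2 / (2 * 2.0**2))
g /= g.sum()
gaussian = np.outer(g, g)

# 9x9 Gabor-like kernel rotated by 45 degrees (illustrative parameters).
yy, xx = np.meshgrid(x, x, indexing="ij")
theta = np.pi / 4
xr = xx * np.cos(theta) + yy * np.sin(theta)
yr = -xx * np.sin(theta) + yy * np.cos(theta)
gabor = np.exp(-(xr**2 + 0.25 * yr**2) / (2 * 2.0**2)) * np.cos(2 * np.pi * xr / 4.0)

image = np.random.default_rng(0).random((32, 40))
col, row = split_separable(gaussian)
same = np.allclose(filter_separable(image, col, row), filter_direct(image, gaussian))
print("Gaussian separable, results match:", same)
print("Rotated Gabor separable:", split_separable(gabor) is not None)
```

In OpenCV itself the two-pass version would normally go through `sepFilter2D` rather than hand-written loops; the loops here only make the arithmetic visible.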

3 answers


answered 2013-04-02 01:15:57 -0600

sammy

updated 2013-04-02 10:22:53 -0600

There are a number of tricks to speed up filtering. However, none of them is an off-the-shelf solution.

  • Separable filters. Some 2D filters have the mathematical property that two one-dimensional filters, applied consecutively, have the same effect on the image as the original 2D one. For example, a Gaussian blur with a 9x9 window (81 elements, so ~81 multiplications per pixel) can be reduced to a vertical 9x1 filter followed by a horizontal 1x9 filter. That means 18 multiplications per pixel. This works for any kernel that can be written as the product of a column vector and a row vector, which includes common ones such as box, Gaussian and Sobel kernels. However, it does not apply to more complex ones, like rotated Gabor kernels.

  • Convolve filters. If you apply multiple filters to the same image consecutively, like a Gaussian blur followed by a Gabor filter, you can combine them. Make all filters the same size and convolve them with each other, then apply the resulting kernel to the image. Because convolution is associative, the effect is identical to applying the filters one after another, apart from pixels near the image border.

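  • Fast Fourier transform. If your filters are really big (e.g. 30x30 or bigger), you can apply the FFT to the image and the kernel, then use the convolution theorem: convolution in the spatial domain becomes element-wise multiplication in the frequency domain. Multiply the two spectra and apply the inverse FFT to get the filtered image. Check some math textbooks for more details.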

  • SIMD processing. Most OpenCV filters are SIMD-accelerated on x86 architectures. For now, there is not much NEON-enabled code (NEON is the SIMD technology for ARM); however, I had very good results (with filter2D) by translating SSE instructions into NEON. Read the bottom part of this post for more details:

  • Reduce the number of filter2D calls. You said you already did it, but never say never. You probably use them to extract some features - then you can decide earlier whether a part of the image contains the needed features and, if not, skip filtering. Or you apply a bank of filters at different angles - then 256 steps are too many; 8 or 16 steps are usually more than enough.

  • Find another algorithm. The Haar-based detector by Viola & Jones is a great tool in image processing, and quite accurate. The same is true for HOG (Histogram of Oriented Gradients). However, both of them have been overtaken in popularity by LBP (Local Binary Patterns), an algorithm that is less accurate but much faster. Both HOG and Haar use floating-point calculations, while LBP only does bit twiddling, and people prefer LBP. You may be able to find a different way to solve your problem - maybe with the same accuracy, but without those costly calls. The best approach here is to read a lot - learn what others have done in your field, and even in different computer vision areas - and get ideas.

  • Post a description of your algorithm here, together with images/graphics, etc. We may have specific ideas or advice on your work. You are probably not the first one to have this problem.
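To make the FFT item in the list above concrete, here is a small NumPy sketch (not OpenCV code). The function names are made up for the example, and it assumes odd kernel sizes and zero-padded borders. Since filter2D computes a correlation rather than a true convolution, the sketch flips the kernel before transforming it so that both paths give the same result.

```python
import numpy as np

def filter_direct(image, kernel):
    """Reference: correlation with zero padding, same-size output (odd kernels)."""
    h, w = image.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def filter_fft(image, kernel):
    """Same filter computed in the frequency domain."""
    h, w = image.shape
    kh, kw = kernel.shape
    # Pad both inputs to the full linear-convolution size to avoid wrap-around.
    fh, fw = h + kh - 1, w + kw - 1
    flipped = kernel[::-1, ::-1]  # correlation == convolution with flipped kernel
    image_spec = np.fft.rfft2(image, s=(fh, fw))
    kernel_spec = np.fft.rfft2(flipped, s=(fh, fw))
    # Convolution theorem: multiply the spectra element-wise, then invert.
    full = np.fft.irfft2(image_spec * kernel_spec, s=(fh, fw))
    # Crop the full result back to the input size, centred on the kernel anchor.
    top, left = kh - 1 - kh // 2, kw - 1 - kw // 2
    return full[top:top + h, left:left + w]

rng = np.random.default_rng(1)
image = rng.random((48, 64))
kernel = rng.random((31, 31))
print("FFT result matches direct filtering:",
      np.allclose(filter_fft(image, kernel), filter_direct(image, kernel)))
```

With a bank of filters applied to one image, the image spectrum (`image_spec` here) only needs to be computed once and can be reused for every kernel of the same padded size, which may matter more than the per-filter saving.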



@sammy Thanks a lot. It was really useful. To address my problem: I am using 16 Gabor filters (different orientations) multiple times to detect edges in an image and also to suppress noise (link). I think I can't use either separable filters or convolved filters, since I am using Gabor filters. But I found a discussion on the MathWorks forum which I thought might be useful. Also, I am on an ARMv6 processor, which doesn't support NEON optimization. I think I need to find an alternative algorithm to reduce the processing time.

rgadde ( 2013-04-02 09:19:40 -0600 )
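For reference, the "convolve the filters" idea discussed in this comment can be checked numerically. The sketch below is plain NumPy with a hypothetical helper `conv2d_full`, and the two random kernels are stand-ins for, say, a blur and a Gabor kernel. It uses full (growing) convolution so that the identity holds exactly; with same-size output the results would differ near the borders.

```python
import numpy as np

def conv2d_full(a, b):
    """Full 2D linear convolution: output size is a.shape + b.shape - 1."""
    ah, aw = a.shape
    bh, bw = b.shape
    out = np.zeros((ah + bh - 1, aw + bw - 1), dtype=np.float64)
    for y in range(bh):
        for x in range(bw):
            # Each kernel tap adds a shifted, scaled copy of the input.
            out[y:y + ah, x:x + aw] += b[y, x] * a
    return out

rng = np.random.default_rng(2)
image = rng.random((40, 50))
k1 = rng.random((5, 5))   # stand-in for the first filter
k2 = rng.random((7, 7))   # stand-in for the second filter

# Two passes over the image ...
two_pass = conv2d_full(conv2d_full(image, k1), k2)

# ... versus one pass with a precomputed combined kernel.
combined = conv2d_full(k1, k2)
one_pass = conv2d_full(image, combined)

print("combined kernel shape:", combined.shape)
print("results match:", np.allclose(two_pass, one_pass))
```

Note the trade-off: the combined kernel here is 11x11 (121 taps) while the two separate passes cost 25 + 49 = 74 taps per pixel, so combining saves a pass over the image but not necessarily multiplications. Whether it is faster depends on the kernel sizes and on memory bandwidth.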

I will look into the paper a bit later. Meanwhile, I realized I had forgotten FFT.

sammy ( 2013-04-02 10:19:34 -0600 )

answered 2020-08-12 10:15:30 -0600

Zana Zakaryaie Nejad

I have implemented a fast Gaussian blur in C++ and compared its performance to OpenCV on a Raspberry Pi 3B+ running 32-bit Raspbian OS. The function uses all 4 cores of the Raspberry Pi and runs 2-3 times faster than OpenCV. The speed-up is even larger on a 64-bit OS. Here is the link to the code with documentation:


answered 2013-04-01 12:25:21 -0600

Seb

There's not much you can do then. Convolution is time-consuming.

