There are a number of tricks to speed up filtering. However, none of them is an off-the-shelf solution.

Separable filters. Some 2D filters have the mathematical property that two one-dimensional filters, applied consecutively, have the same effect on the image as the original 2D one. For example, a Gaussian blur with a 9x9 window (81 elements, so ~81 multiplications per pixel) can be reduced to a vertical 9x1 filter followed by a horizontal 1x9 filter. That means 18 multiplications per pixel. This applies to any kernel that can be written as the product of a column vector and a row vector, which covers many common symmetrical filters. However, it does not apply to more complex ones, like rotated Gabor kernels.
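
To make the 81 vs. 18 multiplications argument concrete, here is a minimal NumPy sketch. The helper names (`gaussian_kernel_1d`, `filter2d_valid`) and the random test image are my own, for illustration only; this is not OpenCV code. It filters with the full 9x9 kernel, then with the 9x1 and 1x9 pair, and compares the results. In OpenCV itself, `cv2.sepFilter2D` takes the row and column kernels directly.

```python
import numpy as np

def gaussian_kernel_1d(size: int, sigma: float) -> np.ndarray:
    """Return a normalized 1D Gaussian kernel of length `size`."""
    x = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def filter2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Direct 2D filtering (correlation) over the region where the kernel fits."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * image[dy:dy + out_h, dx:dx + out_w]
    return out

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # stand-in for a grayscale image

g = gaussian_kernel_1d(9, 2.0)
kernel_2d = np.outer(g, g)            # the 9x9 kernel: 81 coefficients

full = filter2d_valid(image, kernel_2d)                 # one 2D pass
vertical = filter2d_valid(image, g.reshape(9, 1))       # 9x1 pass
separable = filter2d_valid(vertical, g.reshape(1, 9))   # then 1x9 pass

max_diff = np.abs(full - separable).max()   # floating-point noise only
rank = np.linalg.matrix_rank(kernel_2d)     # a separable kernel has rank 1
```

The rank check is a quick way to test your own kernels: if the kernel matrix has rank 1, it can be split into a column and a row filter.
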
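Convolve filters. If you apply multiple filters to the same image consecutively, like a Gaussian blur followed by a Gabor filter, you can combine them into one. Convolve the kernels with each other (padding them to a common size first if your routine requires it), then apply the result to the image. Mathematically, the effect is identical to applying them one after the other, apart from border handling. Keep in mind that the combined kernel is larger than either original, so measure whether the single pass is actually faster.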
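
A small sketch of the kernel-combining idea, again in plain NumPy with hypothetical helper names. A 5x5 box blur and a 3x3 Sobel kernel stand in for the Gaussian and Gabor filters, only to keep the example short. The two kernels are convolved into one, and the one-pass result is compared with the two-pass result.

```python
import numpy as np

def convolve2d_full(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Full 2D convolution of two small kernels (output grows to ah+bh-1 by aw+bw-1)."""
    ah, aw = a.shape
    bh, bw = b.shape
    out = np.zeros((ah + bh - 1, aw + bw - 1))
    for i in range(bh):
        for j in range(bw):
            out[i:i + ah, j:j + aw] += b[i, j] * a
    return out

def filter2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Direct 2D filtering (correlation) over the region where the kernel fits."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * image[dy:dy + out_h, dx:dx + out_w]
    return out

rng = np.random.default_rng(0)
image = rng.random((64, 64))                      # stand-in image

blur = np.full((5, 5), 1.0 / 25.0)                # stand-in for the Gaussian blur
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])            # stand-in for the second filter

combined = convolve2d_full(blur, sobel_x)         # one 7x7 kernel

two_pass = filter2d_valid(filter2d_valid(image, blur), sobel_x)
one_pass = filter2d_valid(image, combined)
same_result = np.allclose(two_pass, one_pass)
```

Note the cost in this particular case: the combined 7x7 kernel needs 49 multiplications per pixel, against 25 + 9 = 34 for the two separate passes. Any gain comes from reading and writing the image only once, so it is worth timing both variants on your data.
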
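Fast Fourier transform. If your filters are really big (e.g. 30x30 or bigger), you can apply the FFT to the image and the kernel, then use the convolution theorem: convolution in the spatial domain becomes element-wise multiplication in the frequency domain. Multiply the two spectra and apply the inverse FFT to get the filtered image. Check a signal processing textbook for more details.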
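
The FFT route can be sketched with `numpy.fft`. The function names and the random 31x31 kernel below are assumptions for illustration. Both inputs are zero-padded to the full output size so that the multiplication of spectra gives a linear (not circular) convolution, and the result is checked against a direct convolution.

```python
import numpy as np

def fft_convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Full linear 2D convolution computed in the frequency domain."""
    out_h = image.shape[0] + kernel.shape[0] - 1
    out_w = image.shape[1] + kernel.shape[1] - 1
    # rfft2 zero-pads both inputs to the common output size given by `s`.
    image_spectrum = np.fft.rfft2(image, s=(out_h, out_w))
    kernel_spectrum = np.fft.rfft2(kernel, s=(out_h, out_w))
    # Convolution theorem: multiply the spectra element-wise, then invert.
    return np.fft.irfft2(image_spectrum * kernel_spectrum, s=(out_h, out_w))

def direct_convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Reference full 2D convolution, one shifted copy of the image per kernel tap."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out

rng = np.random.default_rng(0)
image = rng.random((64, 64))      # stand-in image
kernel = rng.random((31, 31))     # stand-in for a large filter

via_fft = fft_convolve2d(image, kernel)
direct = direct_convolve2d(image, kernel)
same_result = np.allclose(via_fft, direct)
```

This computes a true convolution. `filter2D` computes a correlation, so flip the kernel in both directions (`kernel[::-1, ::-1]`) before the FFT if you need to match its output.
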
SIMD processing. Most OpenCV filters are SIMD-accelerated on x86 architectures. For now, there is not much NEON-enabled code (NEON is the SIMD technology for ARM); however, I had very good results with filter2D by translating the SSE instructions into NEON. Read the bottom part of this post for more details: http://answers.opencv.org/question/755/object-detection-slow/#760
Reduce the number of filter2D calls. You said you already did it, but never say never. You probably use them to extract some features; in that case you can decide earlier whether a part of the image contains the needed features and, if not, skip filtering. Or you apply a bank of filters at different angles; in that case 256 steps is too many, and 8 or 16 steps is usually more than enough.
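
One possible way to "decide earlier" is a cheap contrast test per tile before any expensive filtering. The sketch below is only an illustration under my own assumptions: the tile size, the `min_std` threshold and the function name are hypothetical, and a standard-deviation test is just one of many possible early-rejection criteria. It also shows a coarse bank of 8 orientations instead of 256.

```python
import numpy as np

def tiles_worth_filtering(image: np.ndarray, tile: int = 16, min_std: float = 0.02):
    """Return the (row, col) offsets of tiles whose contrast exceeds `min_std`."""
    keep = []
    for y in range(0, image.shape[0] - tile + 1, tile):
        for x in range(0, image.shape[1] - tile + 1, tile):
            if image[y:y + tile, x:x + tile].std() > min_std:
                keep.append((y, x))
    return keep

# Synthetic test image: flat gray everywhere except one textured tile.
rng = np.random.default_rng(1)
image = np.full((64, 64), 0.5)
image[16:32, 32:48] += rng.normal(0.0, 0.2, (16, 16))

selected = tiles_worth_filtering(image)      # only these tiles get filtered
total_tiles = (64 // 16) ** 2

# A coarse filter bank: 8 orientations covering half a turn.
angles = np.arange(8) * np.pi / 8
```

With this synthetic image, only one tile out of 16 passes the test, so the filter bank would run on a small fraction of the pixels.
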
Find another algorithm. The Haar-based detector by Viola & Jones is a great tool in image processing, and quite accurate. The same is true for HOG (Histogram of Oriented Gradients). However, both have been overtaken in popularity by LBP (Local Binary Patterns), an algorithm that is less accurate but much faster. Both HOG and Haar use floating-point calculations, while LBP only does bit twiddling. And people prefer LBP. You can find a different way to solve your problem, maybe with the same accuracy but without those costly calls. The best approach here is to read a lot: learn what others have done in your field, and even in other computer vision areas, and get ideas.
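
To show what "bit twiddling" means here, this is a minimal sketch of the basic 3x3 LBP code, assuming a `uint8` grayscale image. The function name and the bit order (clockwise from the top-left neighbour) are my own choices; real implementations differ in ordering and in the variants they use. Each pixel gets one byte: bit `i` is set when neighbour `i` is at least as bright as the centre.

```python
import numpy as np

def lbp_3x3(gray: np.ndarray) -> np.ndarray:
    """Basic 8-neighbour LBP code for every interior pixel of a uint8 image."""
    center = gray[1:-1, 1:-1]
    h, w = center.shape
    # Neighbour offsets into the 3x3 window, clockwise from the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[dy:dy + h, dx:dx + w]
        # One comparison and one shift per neighbour: integer operations only.
        codes |= (neighbour >= center).astype(np.uint8) << np.uint8(bit)
    return codes

sample = np.array([[5, 9, 1],
                   [4, 6, 7],
                   [2, 6, 3]], dtype=np.uint8)
code = lbp_3x3(sample)
```

For the sample window, the neighbours 9, 7 and 6 are at least as bright as the centre value 6, which sets bits 1, 3 and 5, so the code is 2 + 8 + 32 = 42.
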
Post a description of your algorithm here, together with images, graphics, etc. We may have specific ideas or advice on your work. You are probably not the first one to have this problem.