If your convolution is separable, you can use external parallelization, as in my answer here. When using sepFilter2D, you can create the kernels once in the constructor.
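To make the idea concrete, here is a minimal, library-free sketch of what external parallelization of a separable filter can look like. It is not OpenCV code: `Image`, `StripedSepFilter`, `filterStripe` and `filterParallel` are hypothetical names, and the Gaussian kernel, the replicated border and the use of `std::thread` are assumptions made only for illustration. With OpenCV, the stripe body would instead call sepFilter2D on a row range of the image.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical row-major, single-channel float image.
struct Image {
    int rows, cols;
    std::vector<float> data;
    Image(int r, int c) : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, 0.f) {}
    float& at(int y, int x) { return data[static_cast<std::size_t>(y) * cols + x]; }
    float at(int y, int x) const { return data[static_cast<std::size_t>(y) * cols + x]; }
};

// Separable filter whose 1-D kernel is built once, in the constructor.
class StripedSepFilter {
public:
    // Assumption: a normalized Gaussian kernel with odd size ksize.
    StripedSepFilter(int ksize, double sigma) : kernel_(ksize) {
        const int r = ksize / 2;
        double sum = 0.0;
        for (int i = 0; i < ksize; ++i) {
            kernel_[i] = static_cast<float>(std::exp(-0.5 * (i - r) * (i - r) / (sigma * sigma)));
            sum += kernel_[i];
        }
        for (float& k : kernel_) k = static_cast<float>(k / sum);
    }

    // Filter rows [y0, y1) of src into dst. Neighbouring rows are read from
    // the full source image, so stripe boundaries do not change the result.
    void filterStripe(const Image& src, Image& dst, int y0, int y1) const {
        const int r = static_cast<int>(kernel_.size()) / 2;
        std::vector<float> tmp(src.cols);
        for (int y = y0; y < y1; ++y) {
            // Vertical pass for this output row (border rows are replicated).
            for (int x = 0; x < src.cols; ++x) {
                float acc = 0.f;
                for (int k = -r; k <= r; ++k) {
                    const int yy = std::min(std::max(y + k, 0), src.rows - 1);
                    acc += kernel_[k + r] * src.at(yy, x);
                }
                tmp[x] = acc;
            }
            // Horizontal pass over the intermediate row.
            for (int x = 0; x < src.cols; ++x) {
                float acc = 0.f;
                for (int k = -r; k <= r; ++k) {
                    const int xx = std::min(std::max(x + k, 0), src.cols - 1);
                    acc += kernel_[k + r] * tmp[xx];
                }
                dst.at(y, x) = acc;
            }
        }
    }

    // Split the rows into stripes and filter each stripe on its own thread.
    void filterParallel(const Image& src, Image& dst, int nThreads) const {
        nThreads = std::max(1, nThreads);
        const int step = (src.rows + nThreads - 1) / nThreads;
        std::vector<std::thread> pool;
        for (int y0 = 0; y0 < src.rows; y0 += step) {
            const int y1 = std::min(y0 + step, src.rows);
            pool.emplace_back([this, &src, &dst, y0, y1] { filterStripe(src, dst, y0, y1); });
        }
        for (std::thread& t : pool) t.join();
    }

private:
    std::vector<float> kernel_;  // shared read-only by all stripes
};
```

Each thread writes a disjoint set of rows in `dst` and only reads from `src` and the kernel, so no locking is needed. This only works because every output row depends on a small, fixed neighbourhood of input rows.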
The problem with filter2D is that it uses FFT-based convolution for kernel sizes > 11 (see here), and in that case parallelization like the above will not work.
As far as I know, IPP isn't multithreaded at all, and OpenCV's implementation isn't either.