Efficiency of filter2D and sepFilter2D
Theoretically, sepFilter2D should be faster than filter2D. However, my tests show that filter2D is about 4 times faster than sepFilter2D. I tested both functions on the CPU with the same kernel radius.
I think this is an interesting issue. Could you provide some basic code to reproduce the benchmark results?
I suspect it depends on an internal optimization in one of the two functions. You would need to dig into the source code to confirm it, but that is my best guess.
How large is your kernel? (filter2D switches to a DFT-based algorithm for sufficiently large kernels, roughly 11x11 or larger.)
The kernel radius is 3 in both cases, and I've tested on both the CPU and the GPU.
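To put a number on the theoretical expectation, here is a small sketch that counts multiplications per output pixel for a direct (non-DFT) implementation. It assumes the kernel size is 2 * radius + 1 and ignores everything a real implementation adds on top (vectorization, intermediate buffers, memory traffic), so it says nothing about measured run time. `multiplies_per_pixel` is a made-up helper name.

```python
def multiplies_per_pixel(radius):
    """Multiplications per output pixel: full 2D kernel vs. two 1D passes."""
    k = 2 * radius + 1      # assumed kernel side length
    full_2d = k * k         # one multiply per kernel tap
    separable = 2 * k       # one row pass plus one column pass
    return full_2d, separable


full_2d, separable = multiplies_per_pixel(3)
print(full_2d, separable, full_2d / separable)  # prints: 49 14 3.5
```

So for radius 3 the separable version does about 3.5 times less arithmetic on paper, which is why a result of filter2D being 4 times faster is surprising.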