I've implemented software that searches for a pattern inside an image. With cvMatchTemplate the execution time is around 10 ms, because I'm matching a 40x40 pattern within a 120x160 pixel search window rather than the whole 640x480 image.
I then implemented the same algorithm using gpu::matchTemplate, expecting the execution time to improve. Instead it takes 220 ms to compute the score (the method is CV_TM_CCOEFF_NORMED). I suspect it is performing an FFT of the images to compute the cross-correlation. Is there a way to force this function to use the time domain (direct) approach?
Or, if that is not the cause, what else could it be?
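To make clear what I mean by the time domain approach, here is a minimal sketch of the direct computation of the normalized correlation coefficient, written in plain C++ without OpenCV. The `Image` struct and the `matchCcoeffNormedDirect` function are hypothetical names made up for this illustration, not OpenCV APIs, and returning 0 for a flat (zero-variance) patch is my own choice. For a 40x40 pattern in a 120x160 window this visits 81 x 121 = 9801 positions with 1600 multiply-adds each, about 15.7 million in total.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal grayscale image (row-major floats); not an OpenCV type.
struct Image {
    int rows, cols;
    std::vector<float> data;
    Image(int r, int c)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, 0.0f) {}
    float at(int y, int x) const { return data[static_cast<std::size_t>(y) * cols + x]; }
    float& at(int y, int x) { return data[static_cast<std::size_t>(y) * cols + x]; }
};

// Direct (no FFT) normalized correlation coefficient.
// The result has (img.rows - tpl.rows + 1) x (img.cols - tpl.cols + 1) scores.
Image matchCcoeffNormedDirect(const Image& img, const Image& tpl) {
    assert(tpl.rows <= img.rows && tpl.cols <= img.cols);
    const int n = tpl.rows * tpl.cols;

    // Template mean and energy about the mean are computed only once.
    double tMean = 0.0;
    for (float v : tpl.data) tMean += v;
    tMean /= n;
    double tEnergy = 0.0;
    for (float v : tpl.data) tEnergy += (v - tMean) * (v - tMean);

    Image result(img.rows - tpl.rows + 1, img.cols - tpl.cols + 1);
    for (int y = 0; y < result.rows; ++y) {
        for (int x = 0; x < result.cols; ++x) {
            double sum = 0.0, sumSq = 0.0, cross = 0.0;
            for (int j = 0; j < tpl.rows; ++j) {
                for (int i = 0; i < tpl.cols; ++i) {
                    const double p = img.at(y + j, x + i);
                    sum += p;
                    sumSq += p * p;
                    // Equals sum((p - pMean) * (t - tMean)) because
                    // the zero-mean template sums to zero.
                    cross += p * (tpl.at(j, i) - tMean);
                }
            }
            const double pEnergy = sumSq - sum * sum / n;  // patch energy about its mean
            const double denom = std::sqrt(pEnergy * tEnergy);
            // Assumption: a flat patch or flat template gets score 0.
            result.at(y, x) = denom > 1e-12 ? static_cast<float>(cross / denom) : 0.0f;
        }
    }
    return result;
}
```

The best match is the position with the highest score; a patch identical to the pattern scores 1.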
Thanks.