First of all, feature descriptors like SIFT, SURF etc. are matched individually, not whole images. This means a high matching score can be a sign that two images are equal, but this is not guaranteed.
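To illustrate how individual descriptors are matched, here is a minimal sketch of Lowe's ratio test on toy descriptor vectors (the 4-D vectors, distance function, and the 0.75 threshold are illustrative; real SIFT descriptors are 128-D and usually matched with a library matcher):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """For each descriptor in desc_a, find its two nearest neighbours
    in desc_b and keep the match only if the best distance is clearly
    smaller than the second best (Lowe's ratio test)."""
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((euclidean(da, db), j) for j, db in enumerate(desc_b))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:
            matches.append((i, best[1], best[0]))
    return matches

# Toy descriptors: the first has one clear counterpart in desc_b,
# the second is ambiguous and gets rejected by the ratio test.
a = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
b = [[1.0, 0.1, 0.0, 0.0], [0.5, 0.55, 0.0, 0.0], [0.5, 0.45, 0.0, 0.0]]
print(ratio_test_matches(a, b))
```

The count of surviving matches (and, for geometric verification, something like a RANSAC homography fit on top of them) is what a similarity score is usually built from, not a direct image-to-image comparison.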
Second, SIFT, SURF etc. are not fully affine invariant, which means "special" transformations such as a strong perspective tilt can be a problem. In general, though, the extracted features are scale and rotation invariant.
In case you deal with large perspective distortions, the ASIFT principle could be a solution for you. ASIFT simulates each image under many different viewing angles (combinations of tilt and camera-axis rotation) and extracts features from every simulated view, so a high degree of affine invariance can be achieved. The principle can be adapted to other feature descriptors like SURF, KAZE etc.