I can explain CV_TM_CCORR, but I am still looking for a good explanation of CV_TM_CCOEFF myself.

CV_TM_CCORR is a cross-correlation, an image/signal processing technique that relies on multiplication. The inputs to a cross-correlation are a smaller sample (in image processing, that's your template image) and a larger target dataset (the image). Typically, a cross-correlation is used to find the position (overlay) at which the template/sample most closely matches the data in the target image.

Put simply, cross-correlation of a template and an image involves three steps:

  1. Overlay the sample/template onto the target image.
  2. For each pixel position in the overlay, multiply the template image pixel value by the target image pixel value. Sum all the products together to get a "score" for the overlay.
  3. Repeat Step #2 for every possible overlay.
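The three steps above can be sketched in a few lines of NumPy. This is a minimal, deliberately slow illustration, not how OpenCV implements it; the function name `cross_correlate` and the toy arrays are made up for the example.

```python
import numpy as np

def cross_correlate(image, template):
    """Score every overlay of `template` on `image` by summing pixel products."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    ih, iw = image.shape
    th, tw = template.shape
    # One score per overlay position where the template fits entirely inside the image.
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = image[y:y + th, x:x + tw]        # Step 1: overlay
            scores[y, x] = np.sum(window * template)  # Step 2: multiply and sum
    return scores                                     # Step 3: done for every overlay

# Toy data (hypothetical): the template appears in the image at row 1, column 1.
image = np.array([[0, 0, 0, 0],
                  [0, 1, 2, 0],
                  [0, 3, 4, 0],
                  [0, 0, 0, 0]])
template = np.array([[1, 2],
                     [3, 4]])

scores = cross_correlate(image, template)
best_y, best_x = np.unravel_index(np.argmax(scores), scores.shape)
print(scores)
print("best overlay (row, col):", (best_y, best_x))
```

In this toy case the highest score lands on the overlay where the template was pasted into the image.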

Typically, the overlay position with the highest score is the "winner", especially when using the normalized version of cross-correlation (CV_TM_CCORR_NORMED). When positive values line up with positive values and negative values line up with negative values (which also multiply to a positive), all the products are positive and their sum peaks, signifying a good alignment. Wikipedia probably does a better job explaining it than I do: see its article on cross-correlation.
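To see why the normalized version matters, here is a small sketch comparing raw and normalized scores. It assumes the normalization is the sum of products divided by the square root of (sum of squared template pixels times sum of squared window pixels), which is how I read the CV_TM_CCORR_NORMED formula. The helper name `ccorr` and the toy data are hypothetical.

```python
import numpy as np

def ccorr(image, template, normed=False):
    """Cross-correlation scores for every overlay, optionally normalized."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    th, tw = template.shape
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.zeros((out_h, out_w))
    t_energy = np.sum(template ** 2)
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + th, x:x + tw]
            score = np.sum(window * template)
            if normed:
                # Divide by the "energy" of the template and of the window.
                denom = np.sqrt(t_energy * np.sum(window ** 2))
                score = score / denom if denom > 0 else 0.0
            scores[y, x] = score
    return scores

# Toy data (hypothetical): an exact copy of the template at column 0,
# followed by a uniformly bright region.
image = np.array([[1, 2, 1, 9, 9, 9]])
template = np.array([[1, 2, 1]])

raw = ccorr(image, template)
normed = ccorr(image, template, normed=True)
print("raw best column:   ", np.argmax(raw))
print("normed best column:", np.argmax(normed))
```

With these numbers the raw score is pulled toward the bright region simply because the pixel values there are large, while the normalized score peaks at the exact copy of the template.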

Looking closer at the OpenCV CV_TM_CCORR equation, R(x,y) = sum over all (x',y') of T(x',y') * I(x+x',y+y'):

R(x,y) is the cross-correlation score for a single overlay position (x, y).

T(x',y') is the image pixel value for a pixel (x',y') in the template/sample image.

I(x+x',y+y') is the image pixel value for the corresponding (based on the overlay) pixel position in the target image.

We sum up the product of T(x',y') and I(x+x',y+y') for each overlay pixel position - every possible (x', y') in the overlay - to get our score. Then we move to a new overlay (x,y) and repeat to get the other overlay scores.

(Also, please note the typos in the formula contained in the first edition of the O'Reilly book - there are some extraneous powers of two floating around, among other issues. I believe the formulas on the website are correct.)

Now, for CV_TM_CCOEFF.

It's the same basic framework, but with a different underlying calculation for each overlay, and I don't understand that calculation. The O'Reilly book explains that "These methods match a template relative to its mean against the image relative to its mean, so a perfect match will be 1 and a perfect mismatch will be -1; a value of 0 simply means that there is no correlation". However, as I read it, the equation given for CV_TM_CCOEFF doesn't subtract the mean from each pixel value; instead it subtracts the reciprocal of the pixel value sum TIMES the number of pixels (shouldn't it be a division?). Plus, the simple examples I work out on paper (with small, one-dimensional signals) usually don't give me 1, 0, or -1. I also Googled "correlation coefficient" and found variations of the Pearson correlation coefficient, which has all kinds of covariance and squared terms that I can't reconcile with the OpenCV equation.
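One possible reconciliation, offered as a sketch rather than a definitive answer: if the 1/(w*h) factor in the formula multiplies the sum of the pixels (that is, the sum is divided by the pixel count), then the subtracted term is just the mean. On that reading, which is an assumption on my part, the normalized variant (CV_TM_CCOEFF_NORMED) should behave like the Pearson correlation coefficient between the template and the overlaid window, and only the normalized variant would be confined to the range -1 to 1. The function name `ccoeff_normed` and the one-dimensional signals below are made up for the check.

```python
import numpy as np

def ccoeff_normed(window, template):
    """Mean-subtracted, normalized correlation for a single overlay.

    Assumes the subtracted term is (1 / (w * h)) * sum(pixels), i.e. the mean.
    """
    window = np.asarray(window, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    n = template.size                      # w * h
    t_prime = template - template.sum() / n
    i_prime = window - window.sum() / n
    denom = np.sqrt(np.sum(t_prime ** 2) * np.sum(i_prime ** 2))
    return np.sum(t_prime * i_prime) / denom

# Small one-dimensional signals (hypothetical values).
template = np.array([1.0, 2.0, 4.0])
unrelated = np.array([3.0, 1.0, 2.0])

same = ccoeff_normed(2 * template + 5, template)  # brighter, higher-contrast copy
inverted = ccoeff_normed(-template, template)     # sign-flipped copy
other = ccoeff_normed(unrelated, template)
pearson = np.corrcoef(template, unrelated)[0, 1]  # NumPy's Pearson coefficient

print(same, inverted, other, pearson)
```

Under these assumptions the scaled copy scores (numerically) 1, the inverted copy scores -1, and the unrelated signal matches what `np.corrcoef` reports. Without the normalizing denominator the scores are not limited to that range, which may be why paper examples worked with the unnormalized formula don't come out as 1, 0, or -1.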