Tilted integral image + other general considerations on integral image
Note: while investigating, I realized that this is not really a question, but I think it might be useful to share my findings :)
I wanted to compute the integral image using OpenCV, in order to implement fast Gaussian filters (and also other filters) according to the work published here: http://www.csse.uwa.edu.au/~pk/research/matlabfns/#integral
I have seen, though, that there is no way to avoid computing the squared sum if I want the integral of the tilted (45 degree rotated) image: http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html#integral
The rationale for this choice is that the integral image is basically only used in Haar detection, where the sum of squared pixels is used to perform "Fast lighting correction". However, this means that if you don't need the squared integral image, you have to come up with your own implementation.
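To make the "own implementation" idea concrete, here is a minimal pure-Python sketch, not the OpenCV code. It computes the plain integral image with the usual running-sum recurrence, and a brute-force tilted integral written straight from the definition as I read it in the documentation linked above (tilted(X, Y) = sum of image(x, y) for y < Y and abs(x - X + 1) <= Y - y - 1). The function names are mine, and the tilted version is only a slow reference to check a faster implementation against. No squared sum is computed anywhere.

```python
def integral(img):
    """Plain integral image of a 2D list; output is (H+1) x (W+1) with a zero first row/column."""
    h, w = len(img), len(img[0])
    s = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            s[y + 1][x + 1] = s[y][x + 1] + row_sum
    return s


def box_sum(s, x0, y0, x1, y1):
    """Sum of img[y0:y1][x0:x1] using four lookups in the integral image s."""
    return s[y1][x1] - s[y0][x1] - s[y1][x0] + s[y0][x0]


def tilted_integral_reference(img):
    """Brute-force tilted integral (assumed definition, see text); slow, for checking only."""
    h, w = len(img), len(img[0])
    t = [[0] * (w + 1) for _ in range(h + 1)]
    for Y in range(h + 1):
        for X in range(w + 1):
            total = 0
            for y in range(Y):
                half = Y - y - 1  # half-width of the triangle at row y
                # abs(x - X + 1) <= half  <=>  X - 1 - half <= x <= X - 1 + half
                for x in range(max(0, X - 1 - half), min(w, X + half)):
                    total += img[y][x]
            t[Y][X] = total
    return t


if __name__ == "__main__":
    img = [[1, 2], [3, 4]]
    print(integral(img))
    print(tilted_integral_reference(img))
```

Python integers never overflow, so this sketch sidesteps the accumulator-size issue; a real implementation would have to pick the output depth explicitly.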
BTW, I went to look at the implementation on the trunk (integral_ in sumpixels.cpp), and it seems there is no TBB implementation of the integral image computation, even though the function could be parallelized to some extent.
Another interesting finding is that the integral image might overflow for images larger than about 8 megapixels. Taking a completely white 8-bit image and a 32-bit signed sum, the overflow happens when 2^8 * W * H > 2^31 (not 2^32, as one bit is used for the sign, since the integral image can also be computed for signed images), that is, when W * H > 2^23 = 8 * 2^10 * 2^10.
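The arithmetic can be checked with a few lines of Python. This is only a sketch of the worst case (every pixel at its maximum value); the function names and default parameters are my own and not part of any library. Using the exact maximum of 255 instead of the rounded 2^8 gives a limit slightly above 2^23, so the estimate above is a bit conservative.

```python
def max_safe_pixels(max_value=255, acc_bits=32, signed=True):
    """Largest pixel count whose worst-case sum still fits in the accumulator."""
    limit = 2 ** (acc_bits - 1) - 1 if signed else 2 ** acc_bits - 1
    return limit // max_value


def would_overflow(width, height, max_value=255, acc_bits=32, signed=True):
    """True if an all-max_value image of this size can overflow the integral image."""
    return width * height > max_safe_pixels(max_value, acc_bits, signed)


if __name__ == "__main__":
    print(max_safe_pixels())           # exact limit for 8-bit pixels, 32-bit signed sums
    print(2 ** 23)                     # the rounded estimate, 8 * 2^10 * 2^10
    print(would_overflow(4096, 2048))  # exactly 2^23 pixels
    print(would_overflow(4096, 4096))  # 2^24 pixels
```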