cv::Vec<...,...> vs direct access performance for multi-channel matrices
I'm currently trying to reduce the overhead of accessing cv::Mat elements in semi-critical code. When manipulating CV_8UC1 (grayscale) images, I can access the element I want directly with either of the following lines:
uchar val = img.at<uchar>(row,col);
or
uchar val = img.data[img.step.p[0]*row + col];
So far, so good; performance is fine. These two lines are in fact identical, since the .at<...> call is just an inlined data access. The problem comes up when accessing elements of multi-channel matrices: the following line, contrary to what I assumed, crashes at run-time, because the matrix is still considered two-dimensional.
uchar val = img.at<uchar>(row,col,cn); // DOES NOT WORK
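For reference, here is roughly what I believe the working single-channel call boils down to once .at<uchar>(row,col) is inlined (debug assertions left out, the helper name is mine):

#include <opencv2/core.hpp>

// Hypothetical helper mirroring what the 2-argument at<uchar>() reduces to once inlined:
// a byte offset of step.p[0] per row, plus the column index (1 byte per element for CV_8UC1).
inline uchar& gray_at(cv::Mat& img, int row, int col)
{
    return img.data[img.step.p[0] * row + col];
}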
Looking around for an 'official' solution, I found that the following is the most common approach:
const Vec3b vec = img.at<Vec3b>(row,col);
uchar val = vec[cn];
or
uchar val = img.at<Vec3b>(row,col)[cn];
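In my case this access sits inside a full-image loop; a minimal sketch of that pattern (the per-channel sum is just a placeholder workload, the function name is mine):

#include <opencv2/core.hpp>

// Placeholder workload: sum one channel (cn) of a CV_8UC3 image via the Vec3b access pattern.
long long sum_channel_vec(const cv::Mat& img, int cn)
{
    long long sum = 0;
    for (int row = 0; row < img.rows; ++row)
        for (int col = 0; col < img.cols; ++col)
            sum += img.at<cv::Vec3b>(row, col)[cn];
    return sum;
}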
The thing is, going through the Vec<...,...> structure to access a single channel value is extremely time-consuming: some quick profiling showed it to be at least 10 times slower than direct 'data' access.
Am I wrong in assuming this is the most common solution? The performance hit is significant, and falling back to manual data access (by guessing the underlying memory layout) is a major improvement:
uchar val = img.data[img.step.p[0]*row + img.step.p[1]*col + cn];
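For comparison, the same placeholder workload with the manual indexing (again just a sketch; it assumes an 8-bit, 3-channel matrix, where step.p[0] is the row stride in bytes and step.p[1] the element size):

#include <opencv2/core.hpp>

// Same placeholder workload as above, but indexing Mat::data directly.
long long sum_channel_raw(const cv::Mat& img, int cn)
{
    long long sum = 0;
    for (int row = 0; row < img.rows; ++row)
        for (int col = 0; col < img.cols; ++col)
            sum += img.data[img.step.p[0] * row + img.step.p[1] * col + cn];
    return sum;
}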
Is there any reason why the Vec<...,...> approach doesn't perform better, or why the .at<...> function doesn't support multi-channel access?