Custom pixelwise alpha compositing: is this correct?
As far as I know, OpenCV offers no per-pixel weighted add(), only addWeighted(), which applies one scalar weight to all pixels. Even with C-style pointer access, which should be the fastest way to access pixels, my custom alpha compositing function is still painfully slow: it takes nearly 2 seconds for a 1400x900 image. I don't think building in release mode would help much... Is there a way to speed it up?
This may be more of a C++ question, but anyway...
I'm writing alphaCompositeLayers(), an alpha compositing function that multiplies each pixel of the background cv::Mat by the alpha value of the corresponding pixel of the foreground cv::Mat. Both cv::Mats are CV_8UC4 (unsigned char, 4 channels):
    // mat1 in foreground, mat0 in background (same size, both CV_8UC4)
    cv::Mat alphaCompositeLayers(cv::Mat mat0, cv::Mat mat1) {
        cv::Mat res = mat0.clone();
        int nRows = res.rows;
        int nCols = res.cols * res.channels();
        // Treat the image as one long row only if both buffers are contiguous
        if (res.isContinuous() && mat1.isContinuous()) {
            nCols *= nRows;
            nRows = 1;
        }
        for (int u = 0; u < nRows; u++) {
            unsigned char *resrgb = res.ptr<unsigned char>(u);
            const unsigned char *matrgb = mat1.ptr<unsigned char>(u);
            for (int v = 0; v < nCols; v += 4) {
                // Resulting alpha in 0..255: a0 * (1 - a1) + a1
                unsigned char newalpha = cv::saturate_cast<unsigned char>(resrgb[v + 3] * (255.0f - matrgb[v + 3]) / 255.0f + matrgb[v + 3]);
                // Colors are left multiplied by alpha (the division by newalpha is commented out)
                resrgb[v] = cv::saturate_cast<unsigned char>((resrgb[v] * resrgb[v + 3] / 255.0f * (255 - matrgb[v + 3]) / 255.0f + matrgb[v] * matrgb[v + 3] / 255.0f)); // / newalpha);
                resrgb[v + 1] = cv::saturate_cast<unsigned char>((resrgb[v + 1] * resrgb[v + 3] / 255.0f * (255 - matrgb[v + 3]) / 255.0f + matrgb[v + 1] * matrgb[v + 3] / 255.0f)); // / newalpha);
                resrgb[v + 2] = cv::saturate_cast<unsigned char>((resrgb[v + 2] * resrgb[v + 3] / 255.0f * (255 - matrgb[v + 3]) / 255.0f + matrgb[v + 2] * matrgb[v + 3] / 255.0f)); // / newalpha);
                resrgb[v + 3] = newalpha;
            }
        }
        return res;
    }
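For reference, the per-pixel operation I am aiming for is the standard "over" operator. Below is a minimal sketch of it on a single pixel, without OpenCV and using only integer arithmetic. The names Rgba, div255 and overPixel are hypothetical helpers made up for this illustration, and unlike the function above the sketch assumes straight (non-premultiplied) alpha on both input and output:

```cpp
#include <array>
#include <cstdint>

// Hypothetical pixel type: three color channels followed by straight alpha
using Rgba = std::array<std::uint8_t, 4>;

// Rounded integer division by 255, valid for x in 0..65025
inline std::uint32_t div255(std::uint32_t x) { return (x + 127) / 255; }

// "Over" operator: fg drawn on top of bg, both with straight alpha
inline Rgba overPixel(const Rgba &bg, const Rgba &fg) {
    const std::uint32_t a1 = fg[3];
    // Share of the background that stays visible under the foreground
    const std::uint32_t a0 = div255(bg[3] * (255 - a1));
    const std::uint32_t a = a1 + a0; // resulting alpha, always within 0..255
    Rgba out{0, 0, 0, static_cast<std::uint8_t>(a)};
    if (a == 0) return out; // fully transparent: color is arbitrary, use 0
    for (int c = 0; c < 3; ++c) {
        // Average of both colors weighted by a1 and a0, rounded to nearest
        out[c] = static_cast<std::uint8_t>((fg[c] * a1 + bg[c] * a0 + a / 2) / a);
    }
    return out;
}
```

With this, an opaque foreground replaces the background, a fully transparent foreground leaves it unchanged, and a half-transparent white pixel over opaque black gives mid gray.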
Here's another function, multiplyLayerByAlpha(), that multiplies each pixel by its alpha value (0% opacity = black, 100% opacity = pixel color):
    cv::Mat multiplyLayerByAlpha(cv::Mat mat) {
        cv::Mat res = mat.clone();
        int nRows = res.rows;
        int nCols = res.cols * res.channels();
        if (res.isContinuous()) {
            nCols *= nRows;
            nRows = 1;
        }
        for (int u = 0; u < nRows; u++) {
            unsigned char *resrgb = res.ptr<unsigned char>(u);
            for (int v = 0; v < nCols; v += 4) {
                resrgb[v] = cv::saturate_cast<unsigned char>(resrgb[v] * resrgb[v + 3] / 255.0f);
                resrgb[v + 1] = cv::saturate_cast<unsigned char>(resrgb[v + 1] * resrgb[v + 3] / 255.0f);
                resrgb[v + 2] = cv::saturate_cast<unsigned char>(resrgb[v + 2] * resrgb[v + 3] / 255.0f);
            }
        }
        return res;
    }
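The same multiplication can be written without floating point, which might matter for speed. This is only a sketch on a plain interleaved 8-bit buffer rather than a cv::Mat. The name premultiplyInPlace is hypothetical, and the buffer layout (three color channels followed by alpha) is an assumption matching CV_8UC4:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Multiplies the three color channels of every 4-channel pixel by its alpha,
// in place, using rounded integer division by 255 instead of float math.
inline void premultiplyInPlace(std::vector<std::uint8_t> &px) {
    for (std::size_t i = 0; i + 3 < px.size(); i += 4) {
        const std::uint32_t a = px[i + 3];
        for (std::size_t c = 0; c < 3; ++c) {
            // (value * alpha) / 255, rounded to nearest; the result fits in 0..255
            px[i + c] = static_cast<std::uint8_t>((px[i + c] * a + 127) / 255);
        }
    }
}
```

Because of the rounding, individual values may differ by one from the float version above, which truncates in some cases.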
Given an array of cv::Mats, for example {mat0, mat1, mat2} with mat2 foremost (on top of all three), I basically run this:
    cv::Mat resultingCvMat = multiplyLayerByAlpha(
        alphaCompositeLayers(
            mat0,
            alphaCompositeLayers(mat1, mat2)
        )
    );
How can I make the program compute resultingCvMat faster? With C++ techniques like multi-threading (and if so, how)? Or with OpenCV functions (again, how)?
Thanks!