Custom pixelwise alpha compositing: is this correct?

asked 2014-10-18 04:41:20 -0500

As far as I know, OpenCV does not offer a pixelwise weighted add(), only addWeighted(), which applies one scalar weight to all pixels. Even with C-style pointer access, which is supposed to be the fastest way to access pixels, my custom alpha compositing function is still very slow: it takes nearly 2 seconds for a 1400x900 image. I don't think building in release mode would help much with optimization... Is there a way to increase the speed?

This may be more of a C++ question, but anyway...

I'm writing alphaCompositeLayers(), an alpha compositing function that blends each pixel of the foreground cv::Mat over the corresponding pixel of the background cv::Mat, weighted by the foreground pixel's alpha value. Both cv::Mats are CV_8UC4 (unsigned char, 4 channels):

// mat1 in foreground, mat0 in background
cv::Mat alphaCompositeLayers(cv::Mat mat0, cv::Mat mat1) {
    cv::Mat res = mat0.clone();

    int nRows = res.rows;
    int nCols = res.cols * res.channels();
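    // Treat the data as one long row only if BOTH matrices are stored contiguously
    if (res.isContinuous() && mat1.isContinuous()) {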
        nCols *= nRows;
        nRows = 1;
    }

    for (int u = 0; u < nRows; u++) {
        unsigned char *resrgb = res.ptr<unsigned char>(u);
        unsigned char *matrgb = mat1.ptr<unsigned char>(u);
        for (int v = 0; v < nCols; v += 4) {
            unsigned char newalpha = cv::saturate_cast<unsigned char>(resrgb[v + 3] * (255.0f - matrgb[v + 3]) / 255.0f + matrgb[v + 3]);
            resrgb[v] = cv::saturate_cast<unsigned char>((resrgb[v] * resrgb[v + 3] / 255.0f * (255 - matrgb[v + 3]) / 255.0f + matrgb[v] * matrgb[v + 3] / 255.0f)); // / newalpha);
            resrgb[v + 1] = cv::saturate_cast<unsigned char>((resrgb[v + 1] * resrgb[v + 3] / 255.0f * (255 - matrgb[v + 3]) / 255.0f + matrgb[v + 1] * matrgb[v + 3] / 255.0f)); // / newalpha);
            resrgb[v + 2] = cv::saturate_cast<unsigned char>((resrgb[v + 2] * resrgb[v + 3] / 255.0f * (255 - matrgb[v + 3]) / 255.0f + matrgb[v + 2] * matrgb[v + 3] / 255.0f)); // / newalpha);
            resrgb[v + 3] = newalpha;
        }
    }

    return res;
}
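
For reference, the per-pixel arithmetic above can also be written with integers only. The following is a minimal sketch, not a drop-in replacement: it works on raw interleaved 4-channel buffers instead of cv::Mat, the names div255 and compositeOver are made up for illustration, and whether it is actually faster than the float version would need measuring. Like the function above with the division by newalpha commented out, it leaves the colour channels premultiplied by the resulting alpha.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: x / 255 rounded to nearest, valid for 0 <= x <= 255 * 255.
inline std::uint32_t div255(std::uint32_t x) {
    return (x + 127) / 255;
}

// Hypothetical sketch: composites `fg` over `bg` in place.
// Both buffers hold interleaved 8-bit pixels, 4 bytes each, alpha last.
void compositeOver(std::vector<std::uint8_t>& bg,
                   const std::vector<std::uint8_t>& fg) {
    assert(bg.size() == fg.size() && bg.size() % 4 == 0);
    for (std::size_t i = 0; i < bg.size(); i += 4) {
        const std::uint32_t aBg = bg[i + 3];
        const std::uint32_t aFg = fg[i + 3];
        const std::uint32_t inv = 255 - aFg;  // share of the background that stays visible
        for (std::size_t c = 0; c < 3; ++c) {
            // bg * (aBg / 255) * (inv / 255) + fg * (aFg / 255), with one rounded division
            const std::uint32_t num = bg[i + c] * aBg * inv + fg[i + c] * aFg * 255u;
            bg[i + c] = static_cast<std::uint8_t>((num + 32512) / 65025);
        }
        // Resulting alpha: aFg + aBg * (1 - aFg / 255); this never exceeds 255
        bg[i + 3] = static_cast<std::uint8_t>(aFg + div255(aBg * inv));
    }
}
```

With a fully transparent foreground the background comes back unchanged, and with a fully opaque foreground the result is the foreground pixel, which is a quick sanity check for the formula.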

Here's another function multiplyLayerByAlpha() that multiplies each pixel by its alpha value (0% opacity = black, 100% opacity = pixel color):

cv::Mat multiplyLayerByAlpha(cv::Mat mat) {
    cv::Mat res = mat.clone();

    int nRows = res.rows;
    int nCols = res.cols * res.channels();
    if (res.isContinuous()) {
        nCols *= nRows;
        nRows = 1;
    }

    for (int u = 0; u < nRows; u++) {
        unsigned char *resrgb = res.ptr<unsigned char>(u);
        for (int v = 0; v < nCols; v += 4) {
            resrgb[v] = cv::saturate_cast<unsigned char>(resrgb[v] * resrgb[v + 3] / 255.0f);
            resrgb[v + 1] = cv::saturate_cast<unsigned char>(resrgb[v + 1] * resrgb[v + 3] / 255.0f);
            resrgb[v + 2] = cv::saturate_cast<unsigned char>(resrgb[v + 2] * resrgb[v + 3] / 255.0f);
        }
    }

    return res;
}
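
To make the "0% opacity = black, 100% opacity = pixel color" behaviour concrete, here is a small self-contained sketch of the same multiplication on a raw 4-channel buffer. It is only an illustration under my own assumptions: premultiply is a made-up name, the buffer is a plain std::vector rather than a cv::Mat, and it rounds to nearest with integer arithmetic instead of going through float and saturate_cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: multiplies the three colour channels of every pixel
// by that pixel's alpha (alpha is the 4th byte and is left untouched).
void premultiply(std::vector<std::uint8_t>& px) {
    assert(px.size() % 4 == 0);
    for (std::size_t i = 0; i < px.size(); i += 4) {
        const unsigned a = px[i + 3];
        for (std::size_t c = 0; c < 3; ++c) {
            // c * a / 255, rounded to nearest
            px[i + c] = static_cast<std::uint8_t>((px[i + c] * a + 127) / 255);
        }
    }
}
```

A pixel with alpha 0 becomes black, a pixel with alpha 255 keeps its colour, and a pixel with alpha 128 ends up at roughly half its original intensity.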

For an array of cv::Mats, for example {mat0, mat1, mat2} with mat2 foremost (on top of the other two), I basically run this:

cv::Mat resultingCvMat = multiplyLayerByAlpha(
    alphaCompositeLayers(
        mat0,
        alphaCompositeLayers(mat1, mat2)
    )
);

How can I make the program compute resultingCvMat faster? With C++ techniques like multi-threading (and if so, how)? Or with OpenCV functions (again, how)?
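
As a rough sketch of what the multi-threading option could look like: since every pixel is computed independently, the rows can be split into bands and each band handed to its own std::thread. Everything here is an assumption for illustration (the names parallelRows and premultiplyParallel, the raw std::vector buffer standing in for a continuous CV_8UC4 cv::Mat), and I have not measured whether it pays off for an image of this size. OpenCV's own cv::parallel_for_ appears to serve a similar purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: calls fn(rowBegin, rowEnd) on up to numThreads threads,
// each one handling a disjoint band of rows, then waits for all of them.
void parallelRows(int rows, int numThreads,
                  const std::function<void(int, int)>& fn) {
    assert(rows >= 0);
    numThreads = std::max(1, std::min(numThreads, rows));
    const int band = (rows + numThreads - 1) / numThreads;  // rows per thread, rounded up
    std::vector<std::thread> workers;
    for (int begin = 0; begin < rows; begin += band) {
        const int end = std::min(rows, begin + band);
        workers.emplace_back([&fn, begin, end] { fn(begin, end); });
    }
    for (std::thread& w : workers) w.join();
}

// Hypothetical example kernel: the alpha multiplication from
// multiplyLayerByAlpha(), applied band by band to a rows x cols buffer
// of interleaved 4-byte pixels (alpha last).
void premultiplyParallel(std::vector<std::uint8_t>& px, int rows, int cols,
                         int numThreads) {
    assert(rows >= 0 && cols >= 0);
    const std::size_t stride = static_cast<std::size_t>(cols) * 4;  // bytes per row
    assert(px.size() == static_cast<std::size_t>(rows) * stride);
    parallelRows(rows, numThreads, [&px, stride](int rowBegin, int rowEnd) {
        // Bands never overlap, so no locking is needed
        for (std::size_t i = rowBegin * stride; i < rowEnd * stride; i += 4) {
            const unsigned a = px[i + 3];
            for (std::size_t c = 0; c < 3; ++c) {
                px[i + c] = static_cast<std::uint8_t>((px[i + c] * a + 127) / 255);
            }
        }
    });
}
```

The thread count could come from std::thread::hardware_concurrency(), keeping in mind that it may return 0 when the value cannot be determined. With GCC or Clang on Linux this may need to be built with -pthread.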

Thanks!
