Speeding up the computation of SSD of 3x3 patches

Hi,

As part of a bigger application, I need to run the following code:

    ax2 += (int)(25 + 0.5);
    ay2 += (int)(25 + 0.5);

    bx2 += (int)(25 + 0.5);
    by2 += (int)(25 + 0.5);

    cx2 += (int)(25 + 0.5);
    cy2 += (int)(25 + 0.5);

    for (int ix = -1; ix <= 1; ix++){
        for (int iy = -1; iy <= 1; iy++){
            suma += (grayImage.at<uchar>(ay2 + iy, ax2 + ix) - grayImage.at<uchar>(by2 + iy, bx2 + ix)) * (grayImage.at<uchar>(ay2 + iy, ax2 + ix) - grayImage.at<uchar>(by2 + iy, bx2 + ix));
        }
    }

It basically computes the sum of squared differences (SSD) of two 3x3 patches.
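
Written out, this is the quantity accumulated into `suma`. The notation is my own shorthand: `I` stands for `grayImage` indexed as (row, column), and (a_x, a_y), (b_x, b_y) stand for the patch centers (ax2, ay2) and (bx2, by2).

```latex
\mathrm{SSD}(a, b) = \sum_{i=-1}^{1} \sum_{j=-1}^{1}
  \bigl( I(a_y + j,\, a_x + i) - I(b_y + j,\, b_x + i) \bigr)^2
```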

It runs extremely slowly. Is there any way to speed it up?

EDIT:

I changed to the following version:

    for (int ix = -1; ix <= 1; ix++){
        for (int iy = -1; iy <= 1; iy++){
            double difa = grayImage.at<uchar>(ay2 + iy, ax2 + ix) - grayImage.at<uchar>(by2 + iy, bx2 + ix);
            suma += (difa)*(difa);
        }
    }

And it runs faster, but is there any way to improve it further?

EDIT: Thanks for the answer. I'm now using the following code:

    //int iy = -1;
    Mi_a = grayImage.ptr<uchar>(ay2 - 1);
    Mi_b = grayImage.ptr<uchar>(by2 - 1);
    Mi_c = grayImage.ptr<uchar>(cy2 - 1);

    difa = Mi_a[ax2 - 1] - Mi_b[bx2 - 1];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 - 1] - Mi_b[bx2 - 1];
    sumc += (difc)*(difc);
    difa = Mi_a[ax2 + 0] - Mi_b[bx2 + 0];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 + 0] - Mi_b[bx2 + 0];
    sumc += (difc)*(difc);
    difa = Mi_a[ax2 + 1] - Mi_b[bx2 + 1];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 + 1] - Mi_b[bx2 + 1];
    sumc += (difc)*(difc);



    //int iy=0;
    Mi_a = grayImage.ptr<uchar>(ay2 + 0);
    Mi_b = grayImage.ptr<uchar>(by2 + 0);
    Mi_c = grayImage.ptr<uchar>(cy2 + 0);

    difa = Mi_a[ax2 - 1] - Mi_b[bx2 - 1];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 - 1] - Mi_b[bx2 - 1];
    sumc += (difc)*(difc);
    difa = Mi_a[ax2 + 0] - Mi_b[bx2 + 0];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 + 0] - Mi_b[bx2 + 0];
    sumc += (difc)*(difc);
    difa = Mi_a[ax2 + 1] - Mi_b[bx2 + 1];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 + 1] - Mi_b[bx2 + 1];
    sumc += (difc)*(difc);


    //int iy=1
    Mi_a = grayImage.ptr<uchar>(ay2 + 1);
    Mi_b = grayImage.ptr<uchar>(by2 + 1);
    Mi_c = grayImage.ptr<uchar>(cy2 + 1);

    difa = Mi_a[ax2 - 1] - Mi_b[bx2 - 1];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 - 1] - Mi_b[bx2 - 1];
    sumc += (difc)*(difc);
    difa = Mi_a[ax2 + 0] - Mi_b[bx2 + 0];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 + 0] - Mi_b[bx2 + 0];
    sumc += (difc)*(difc);
    difa = Mi_a[ax2 + 1] - Mi_b[bx2 + 1];
    suma += (difa)*(difa);
    difc = Mi_c[cx2 + 1] - Mi_b[bx2 + 1];
    sumc += (difc)*(difc);

Is there any way to speed it up even further?

Thanks,

Gil.

P.S. It's part of an algorithm that I intend to contribute to OpenCV once it is published.