Normalization of really small numbers

I came across this problem today while calculating Hu invariants for some digits. When the input image was NOT treated as binary, the moments were very small, often much smaller than DBL_EPSILON (which will be important later!). The calculated invariants for each digit filled a row in a Mat, so I obtained a matrix of Hu invariants (column-wise) for my digits (row-wise). Then I wanted to normalize the invariants column by column to the 0-100 range with:

normalize(A.col(c), B.col(c), 0, 100, NORM_MINMAX);
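
For reference, the Hu-invariant matrix itself was built roughly like this (just a minimal sketch of the setup, not my exact code; buildHuMatrix and digitImages are placeholder names for however the grayscale digit images are loaded):

#include <vector>
#include <opencv2/opencv.hpp>
using namespace cv;

// Sketch: one row of 7 Hu invariants per (grayscale) digit image.
Mat buildHuMatrix(const std::vector<Mat>& digitImages)
{
    Mat A((int)digitImages.size(), 7, CV_64F);
    for (int i = 0; i < (int)digitImages.size(); i++)
    {
        Moments m = moments(digitImages[i], false); // NOT treated as binary
        double hu[7];
        HuMoments(m, hu);
        for (int j = 0; j < 7; j++)
            A.at<double>(i, j) = hu[j];
    }
    return A;
}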

What I noticed was that most of my 7 columns were normalized properly, but 2 of them were filled with zeros after normalization. That was not right, so I normalized my matrix manually:

for (int c = 0; c < A.cols; c++)
{
    double minV, maxV;
    minMaxIdx(A.col(c), &minV, &maxV);
    for (int r = 0; r < A.rows; r++)
        C.at<double>(r, c) = 100 * (A.at<double>(r, c) - minV) / (maxV - minV);
}

and the result was as expected.

I had a look inside the normalize() function and noticed this line:

 scale = (dmax - dmin)*(smax - smin > DBL_EPSILON ? 1./(smax - smin) : 0);

which means that if the elements to be normalized are spread over a very small range (smaller than DBL_EPSILON), they will not be normalized but set to zero (or rather to the low end of the requested range). I understand that this is meant to prevent numerical errors, but I am not sure it should be done this way when the numbers involved are ALL very small.
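
One possible way around it (just a sketch of the idea, not a patch proposal; the function name normalizeTinyRange is mine): min-max normalization does not change when the input is multiplied by a positive constant, so a column whose range falls below DBL_EPSILON can simply be scaled up before calling normalize():

#include <cfloat>
#include <opencv2/opencv.hpp>
using namespace cv;

// Scale tiny-range data up before normalize() so the DBL_EPSILON guard
// does not zero it out; the min-max result is the same either way.
// dst must already be allocated with the right size and type
// (e.g. a column of a preallocated Mat), just like in the calls above.
void normalizeTinyRange(const Mat& src, Mat dst, double a, double b)
{
    double minV, maxV;
    minMaxIdx(src, &minV, &maxV);
    Mat tmp = src;
    if (maxV > minV && maxV - minV <= DBL_EPSILON)
        tmp = src * (2.0 * DBL_EPSILON / (maxV - minV)); // lift range above DBL_EPSILON
    normalize(tmp, dst, a, b, NORM_MINMAX);
}

// usage: normalizeTinyRange(A.col(c), B.col(c), 0, 100);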

So I performed a small test. Here is my input matrix:

A
 10 5e-016  5e-026
  1 5e-020  5e-027
-10 1e-030 -5e-027

The result of OpenCV normalization (the range of the third column is about 5.5e-26, far below DBL_EPSILON ≈ 2.2e-16, so that column collapses to zeros):

B
100   100  0
55    0.01 0
0     0    0

And my manual result:

C
100  100   100
55  0.01   18.1818
0     0      0
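
Just to double-check the third column by hand: its minimum is -5e-27 and its maximum is 5e-26, so the middle element is 100 * (5e-27 - (-5e-27)) / (5e-26 - (-5e-27)) = 100 * 1e-26 / 5.5e-26 ≈ 18.18, which matches C.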

Here is the code I used for the above:

#include <iostream>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;

int main()
{
    // First column is "large"; the other two span ranges far below DBL_EPSILON
    Mat A = (Mat_<double>(3, 3) << 10, 5e-16,  5e-26,
                                    1, 5e-20,  5e-27,
                                  -10, 1e-30, -5e-27);

    // Column-wise min-max normalization with cv::normalize
    Mat B(3, 3, CV_64F);
    for (int c = 0; c < 3; c++)
        normalize(A.col(c), B.col(c), 0, 100, NORM_MINMAX);

    // Manual column-wise min-max normalization
    Mat C(3, 3, CV_64F);
    for (int c = 0; c < 3; c++)
    {
        double minV, maxV;
        minMaxIdx(A.col(c), &minV, &maxV);
        for (int r = 0; r < 3; r++)
            C.at<double>(r, c) = 100 * (A.at<double>(r, c) - minV) / (maxV - minV);
    }

    cout << "A" << endl;
    for (int r = 0; r < 3; r++)
    {
        for (int c = 0; c < 3; c++)
            cout << A.at<double>(r, c) << " ";
        cout << endl;
    }

    cout << endl << "B" << endl;
    for (int r = 0; r < 3; r++)
    {
        for (int c = 0; c < 3; c++)
            cout << B.at<double>(r, c) << " ";
        cout << endl;
    }

    cout << endl << "C" << endl;
    for (int r = 0; r < 3; r++)
    {
        for (int c = 0; c < 3; c++)
            cout << C.at<double>(r, c) << " ";
        cout << endl;
    }

    return 0;
}

I think this has no important consequences for classification with Hu invariants, as it is quite unlikely to get invariants that differ by less than DBL_EPSILON for different objects (mine were for the digit '0' only), so the problem would normally not occur. And even if it did, it would just mean that this particular feature is useless. By the way, when the input image for calculating moments was treated as a binary one, the Hu moments were bigger than DBL_EPSILON and the problem disappeared. However, in other applications this could matter: when someone is dealing with very small numbers and tries to normalize them, there might be a problem. What are your thoughts on this?
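
For completeness, this is roughly the change on the moments side that made the problem go away for me (a sketch; digitImage is a placeholder for one of my digit images):

// Treating the image as binary (every non-zero pixel counts as 1)
// gives Hu moments well above DBL_EPSILON, so normalize() behaves as expected.
Moments m = moments(digitImage, true);
double hu[7];
HuMoments(m, hu);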
