Ask Your Question

# Select only gray pixels in color image

Hi

To aid my OCR component in recognizinig text I'd like to binary threshold my image first.

I'd like to try a new method of thresholding where I do not only define a threshold value, but als define that the R-G-B component values must be very close to each other.

This way I hope to separate dark-grey text from dark-blue background, where the pixels fall into the same intensity range, but a human could easily distinguish them because of their color.

Example:

• RGB(9, 9, 9) => becomes black
• RGB(1, 1, 10) => becomes white

Now I can figure out how to iterate over every pixel and do just that. The question is, do you know if this type of thresholding algorithm is already implemented or if there already exists a name for it?

Thank you very much!

edit retag close merge delete

## 3 answers

Sort by » oldest newest most voted

It can be done and here are my results. This routine is a combination of thresholding and my custom color-component dependant threshold (Java-Code):

## Code

// initialization
Mat colorsub = ...
byte[] data = new byte[4];
byte[] black = new byte[] { 0, 0, 0, -1 };
byte[] white = new byte[] { -1, -1, -1, -1 };

// iterate over the image
for (int y = 0; y < colorsub.height(); y++) {
for (int x = 0; x < colorsub.width(); x++) {

// extract color component values as unsigned integers (byte is a signed type in java)
colorsub.get(y, x, data);
int r = data[0] & 0xff;
int g = data[1] & 0xff;
int b = data[2] & 0xff;

// do simple threshold first
int thresh = 100;
if (r > thresh || g > thresh || b > thresh)
{
colorsub.put(y, x, white);
continue;
}

// adjust the blue component
b = (int)(b * 1.3);

// quantification of color component's values distribution
int mean = (r + g + b) / 3;
int diffr = Math.abs(mean - r);
int diffg = Math.abs(mean - g);
int diffb = Math.abs(mean - b);

int maxdev = 80;

if ((diffr + diffg + diffb) > maxdev)
colorsub.put(y, x, white);
else
colorsub.put(y, x, black);
}
}

1. On my camera I have noticed that in darkgray text the blue channel is too low so I amateurishly increase it a little. This could be improved by a real histogram correction.

2. Also, the first threshold operation is not really based on pixel intensity, but I think the effect is negligible for this demonstration.

## Results

• First image is the regular image caught by the classifier, in a green border.
• Second image is the grayscale image by OpenCV, for reference
• Third image is OpenCVs binary threshold
• Fourth image is the output from my custom threshold function above.

On a regular desk. This doesn't look too bad:

On a sofa in low light condition. Here the threshold performs better. I had to correct the values for my custom threshold after this one:

On a stack of paper, reduced artifacts:

On the kitchen bar. Since the metal is gray, wen cannot filter it out, obviously:

I think this is the most interesting image. Much less and smaller segments:

## Conclusion

As with every thresholding algorithm, fine tuning is paramount. Given a thresholded image with finely tuned parameters, a color-coded threshold can still further improve the picture.

It looks to be useful to remove inlined emblems and pictures from the text, e.g. smileys or colored bullet points.

Maybe the information in this post can be of use for someone else.

more

I'm not an expert in image segmentation techniques, but due to noboby has answered it, here are my two cents:

One simple approach is to perform an euclidean distance segmentation between the points. For example: RGB(10, 10, 10) is closer from RGB(9, 9, 9) than RGB(1, 1, 10) But one key point to take into account is the correlation in your variables. gray pixels are correlated because a gray value has all the RGB components very similar. That is why the mahalanobis distance is often used in image segmentation. mahalanobis distance differs from Euclidean distance in that it takes into account the correlations of the data set.

In order to use the Mahalanobis distance to classify a test point as belonging to one of N classes, one first estimates the covariance matrix of each class, usually based on samples known to belong to each class. Then, given a test sample, one computes the Mahalanobis distance to each class, and classifies the test point as belonging to that class for which the Mahalanobis distance is minimal.

This is an example that performs image segmentation using the mahalanobis distance:

mahalanobis distance segmentation

First, you use the mouse to select pixels which act as the "training set" --> build the covariance matrix. Then, the mahalanobis distance is used to segment your images.

The mahalanobis distance is also used in background substraction (discriminate between foreground and background pixels by building and maintaining a model of the background). You could also try to build a model of the background (dark-blue background) and try to segment the foreground (dark-grey text).

more

Why not try with threshold of every channel and then compute the result ?

more

## Comments

I have looked at per-channel thresholding before but unfortunately it's not the same thing I'm trying to achieve.

( 2013-11-26 09:17:57 -0500 )edit

Official site

GitHub

Wiki

Documentation

## Stats

Asked: 2013-11-26 06:08:27 -0500

Seen: 7,376 times

Last updated: Nov 28 '13