How to remove borders from images taken from document (like MNIST handwritten chars)?


I want to extract handwritten characters that are written into boxes like this: [form field image]

I am extracting 29-pixel-wide squares, which gives me images like these:

[extracted cell images 1-3]
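
(For reference, slicing fixed-size cells out of the form can look roughly like the sketch below; the box origin, pitch and count are placeholder guesses, not measurements from my form.)

import cv2

CELL = 29                       # cell size in pixels, as mentioned above
ORIGIN_X, ORIGIN_Y = 10, 50     # hypothetical top-left corner of the first box
N_BOXES = 12                    # hypothetical number of boxes in the row

form = cv2.imread('form.png')
cells = [form[ORIGIN_Y:ORIGIN_Y + CELL,
              ORIGIN_X + i * CELL:ORIGIN_X + (i + 1) * CELL]
         for i in range(N_BOXES)]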

To recognize the characters correctly, the individual character images need to be very clean, like these:

[clean character images 1 and 2]

What I am doing is:

  1. Compute the horizontal and vertical projection of every image.
  2. Iterate over both projection arrays from each end. If the projection value is greater than a certain threshold, the border has not been passed yet, so that row/column is trimmed; this removes the border and the whitespace around it (a compact sketch of this trimming step follows the list).

  3. Then find contours in the trimmed image.

  4. If the area of a contour is greater than some threshold, take its bounding rectangle and crop to it.
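
(For illustration, a minimal, self-contained version of steps 1-2 might look like this; the 0.5 threshold, the 1/4 trim limit and the Otsu binarization are placeholder choices, not the values my code below uses. It assumes a single-channel grayscale cell image as input.)

import cv2

def trim_border(cell_gray, proj_thresh=0.5, max_trim_frac=0.25):
    """Trim rows/columns whose average ink suggests they belong to the box border."""
    # Binarize and invert so ink and border pixels become white (255).
    _, bw = cv2.threshold(cell_gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    rows, cols = bw.shape
    proj_h = bw.mean(axis=1) / 255.0   # ink density per row
    proj_v = bw.mean(axis=0) / 255.0   # ink density per column
    max_r, max_c = int(rows * max_trim_frac), int(cols * max_trim_frac)

    top, bottom, left, right = 0, rows - 1, 0, cols - 1
    # Move each edge inwards while it still looks like border,
    # but never trim more than max_trim_frac from any side.
    while top < max_r and proj_h[top] > proj_thresh:
        top += 1
    while bottom > rows - 1 - max_r and proj_h[bottom] > proj_thresh:
        bottom -= 1
    while left < max_c and proj_v[left] > proj_thresh:
        left += 1
    while right > cols - 1 - max_c and proj_v[right] > proj_thresh:
        right -= 1
    return cell_gray[top:bottom + 1, left:right + 1]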

But the problem is that this method is not very accurate. In some cases it works fine, but in most cases it fails miserably and produces images like these:

[badly cropped examples]

Also, the projection thresholds are very specific to this image (or images close to it); they do not generalize well.

Is there any other method that can work well for this situation?

The code:

import cv2


def extract_char(image, idx):
    char = cv2.imread(image)
    char_gray = cv2.cvtColor(char, cv2.COLOR_BGR2GRAY)
    char_bw = cv2.adaptiveThreshold(char_gray, 255,
                                    cv2.ADAPTIVE_THRESH_MEAN_C,
                                    cv2.THRESH_BINARY, 11, 9)

    (rows, cols) = char_gray.shape

    # Invert so ink/border pixels are white, then take row/column averages.
    bit_not = cv2.bitwise_not(char_bw)
    proj_h = cv2.reduce(bit_not, 1, cv2.REDUCE_AVG)  # one value per row
    proj_v = cv2.reduce(bit_not, 0, cv2.REDUCE_AVG)  # one value per column
    proj_v = proj_v[0]

    start_x, start_y, end_x, end_y = 0, 0, cols - 1, rows - 1

    # Never trim more than 1/8 of the image from any side.
    num_iter_h = cols // 8
    num_iter_v = rows // 8

    # Move each edge inwards while the projection still looks like border
    # (the thresholds are hand-tuned for this particular form).
    for _ in range(num_iter_h):
        if proj_h[start_y][0] > 35:
            start_y += 1

    for _ in range(num_iter_h):
        if proj_h[end_y][0] > 160:
            end_y -= 1

    for _ in range(num_iter_v):
        if proj_v[start_x] > 15:
            start_x += 1

    for _ in range(num_iter_v):
        if proj_v[end_x] > 125:
            end_x -= 1

    print('processing.. %s.png' % idx)
    output_char = char[start_y:end_y, start_x:end_x]
    output_char = get_cropped_char(output_char)
    return output_char


def get_cropped_char(img):
    """
    Returns the grayscale image cropped to the character's bounding box,
    or None if no sufficiently large contour is found.
    """
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(img, (3, 3), 0)
    thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 75, 10)

    # OpenCV 3.x findContours returns (image, contours, hierarchy).
    im2, cnts, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE,
                                            cv2.CHAIN_APPROX_SIMPLE)

    # Keep the last contour whose area exceeds the threshold.
    contour = None
    for c in cnts:
        if cv2.contourArea(c) > 100:
            contour = c
    if contour is None:
        return None

    (x, y, w, h) = cv2.boundingRect(contour)
    return img[y:y + h, x:x + w]
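
(For completeness, a hypothetical driver loop over the extracted cell images, assuming the first snippet is wrapped in extract_char as shown above and that the cells are saved as individual PNG files:)

import glob

for idx, path in enumerate(sorted(glob.glob('cells/*.png'))):
    out = extract_char(path, idx)
    if out is not None:
        cv2.imwrite('chars/%d.png' % idx, out)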