Filling gaps in letters using cv2

I have an image containing text that I want to extract using OCR. However, a diagonal line of text overlaps it (top right), as in https://i.imgur.com/kgl3vXV.png. I remove this line with:

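  import cv2

  image = cv2.imread(image_path)  # image_path: path to the input image
  # Upscale 2x with bicubic interpolation
  image = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
  image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
  # Smooth with a 5x5 Gaussian kernel before thresholding
  image = cv2.GaussianBlur(image, (5, 5), 0)
  # Threshold at 100 because the diagonal line is grey
  image = cv2.threshold(image, 100, 255, cv2.THRESH_BINARY)[1]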

This results in an image like https://i.imgur.com/ok3T7W8.png.

Notice the thick characters in "shear stress"; this is one of the regions where the diagonal line overlapped the text. I then apply OCR. However, the previous steps also remove some pixels. For instance, the "e" in "edge dislocation" is incomplete.

This gives poor OCR output such as "edve dislocation". I tried erosion and dilation, but with no significant improvement.

Is there any way to fill the holes in the characters?

Is there any way to reduce the thickness of the characters that overlap with the diagonal line?
