How to eliminate Text underline?

asked 2020-11-19 06:32:19 -0500 by kst

updated 2020-11-19 06:55:04 -0500 by supra56

I have a scanned document from which I would like to remove the text underlines before running OCR on it. I need to remove them because I noticed that the underlines hurt the OCR's recognition accuracy.

For example, in the attached image, if I remove the underlines, both dates are recognized accurately; otherwise, one of the dates is not recognized.

Any Python sample code would be much appreciated.




What did you try? Did you bother to search for a solution?

Hint: the HoughLinesP function is what you are looking for. Here's a tutorial, just change the line color to white: python tutorial

kbarni (2020-11-19 11:32:05 -0500)

I searched for possible solutions. I also tried the HoughLinesP function you suggested, as well as using contours to look for lines in the image, but the results weren't satisfactory. For example, extra lines were created.

kst (2020-11-19 17:49:01 -0500)

@kst, I deleted my answers because you didn't provide the second image (invalid date stamp). The first image will work without the underline. How would I know whether one of the dates is not recognizable, regardless of whether both are underlined or just one?

supra56 (2020-11-20 07:02:05 -0500)

1 answer


answered 2020-11-24 02:58:40 -0500 by berak

Have a look at the morphology tutorial.

A long horizontal kernel should do the trick.
