How to eliminate text underlines?
I have a scanned document from which I would like to remove the underlines before I run OCR on it. I need to remove them because I noticed that the underlines hurt the OCR's recognition accuracy.
For example, in the attached image, if I remove the underlines, both dates are recognized accurately; otherwise, one of the dates is not recognized.
Any Python sample code would be much appreciated.
Thanks
What did you try? Did you bother to search for a solution?
Hint: the HoughLinesP function is what you are looking for. Here's a tutorial; just change the line color to white: python tutorial
I did some searching for possible solutions. I also tried the HoughLinesP function you suggested, as well as using contours to find the lines in the image. But the results weren't satisfactory. For example, extra lines were created.
@kst. I deleted my answers because you didn't provide a second image (the invalid date stamp). The first image will work without the underline. How would I know whether one of the dates is not recognizable, whether both are underlined or just one?