Hello, I am trying to write a program that can OCR well. What I'll mostly be OCRing is subtitles on images, like the captions on many memes or the subtitles in a DVD screen capture. The problem with OCRing these is that the background behind the text adds far too much noise. So my goal is to remove most of that noise before running OCR, so that the OCR step produces better results.
Here are my thoughts as to how this could be done:
1. Convert the image to grayscale.
2. Somehow detect where the text is.
3. Discover the shade of gray the text uses.
4. Remove all other shades of gray, replacing them with a solid shade (black if the text is white, white if the text is black).
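To make the four steps above concrete, here is a rough pure-Python sketch of how they might fit together. It is only an illustration under some loud assumptions: the image is already loaded as rows of (r, g, b) tuples, the subtitles sit in the bottom quarter of the frame (a stand-in for real text detection), and the text shade is the most common very light (or very dark) gray in that band. The function names, `fraction`, `cutoff`, and `tolerance` are all hypothetical and would need tuning on real captures.

```python
from collections import Counter


def to_gray(rgb_rows):
    """Step 1: convert rows of (r, g, b) tuples to 0-255 luminance values."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in rgb_rows]


def subtitle_band(gray, fraction=0.25):
    """Step 2 (assumption): treat the bottom `fraction` of rows as the text area."""
    start = int(len(gray) * (1 - fraction))
    return gray[start:]


def text_shade(band, text_is_light=True, cutoff=200):
    """Step 3 (assumption): the text shade is the most common extreme gray.

    For light text only shades >= cutoff are considered; for dark text
    only shades <= 255 - cutoff. Returns None if no pixel qualifies.
    """
    counts = Counter(value for row in band for value in row)
    if text_is_light:
        candidates = {v: n for v, n in counts.items() if v >= cutoff}
    else:
        candidates = {v: n for v, n in counts.items() if v <= 255 - cutoff}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def isolate_text(gray, shade, tolerance=20, text_is_light=True):
    """Step 4: keep pixels near `shade`, replace the rest with a solid background.

    White text ends up on black, black text on white.
    """
    text_value, background = (255, 0) if text_is_light else (0, 255)
    return [[text_value if abs(v - shade) <= tolerance else background
             for v in row]
            for row in gray]


def clean_for_ocr(rgb_rows, text_is_light=True):
    """Run steps 1-4; returns the gray image unchanged if no text shade is found."""
    gray = to_gray(rgb_rows)
    shade = text_shade(subtitle_band(gray), text_is_light)
    if shade is None:
        return gray
    return isolate_text(gray, shade, text_is_light=text_is_light)
```

The cleaned rows would then be written back out as an image and passed to tesseract-ocr. The `tolerance` band is there because subtitle pixels are rarely one exact shade once anti-aliasing and video compression are involved, so matching a single gray value exactly would likely erase parts of the letters.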
Can someone help me figure this out? I am using tesseract-ocr as my OCR engine.