Preparing for OCR

asked 2013-05-28 12:19:11 -0600

Mr. Gecko

Hello, I am trying to write a program that performs OCR well. I'll mostly be OCRing subtitles on images, like those on many memes or in DVD screen captures. The problem with OCRing these is that there is far too much noise, so my goal is to remove most of the noise so that the OCR process produces better results.

Here are my thoughts on how this could be done:

  1. Convert the image to grayscale.
  2. Somehow detect where the text is.
  3. Discover the shade of gray the text uses.
  4. Replace all other shades of gray with a solid shade (black if the text is white, white if the text is black).

Can someone help me figure this out? I am using tesseract-ocr as my OCR API.

  1. Convert to grayscale: cv::Mat gray; cv::cvtColor(im, gray, CV_BGR2GRAY);
  2. Unknown.
  3. Unknown.
  4. Basically, we find every pixel that is not the same color as the text, within a threshold so that we don't black out artifacts of the text, and black it out. When we find a pixel that is the color of the text, we make it white, which produces very vivid text.
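Step 4 can be sketched without any OpenCV calls by working on a raw 8-bit grayscale buffer. This is only a minimal sketch: the function name isolateTextShade, the textShade value, and the tolerance are hypothetical, and it assumes step 3 has already supplied the gray level of the text.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not part of OpenCV or tesseract): keeps pixels whose
// gray level is within `tolerance` of `textShade` and blacks out the rest.
// `gray` is assumed to be an 8-bit, single-channel image stored row by row.
std::vector<std::uint8_t> isolateTextShade(const std::vector<std::uint8_t>& gray,
                                           std::uint8_t textShade,
                                           int tolerance)
{
    std::vector<std::uint8_t> out(gray.size(), 0);  // start with everything black
    for (std::size_t i = 0; i < gray.size(); ++i) {
        int diff = std::abs(static_cast<int>(gray[i]) - static_cast<int>(textShade));
        if (diff <= tolerance) {
            out[i] = 255;  // close enough to the text shade: make it white
        }
    }
    return out;
}
```

With white subtitle text, a call such as isolateTextShade(pixels, 255, 30) would keep near-white pixels. Bright background areas of the same shade would survive too, so this alone does not locate the text.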

Mr. Gecko ( 2013-05-31 09:32:37 -0600 )

cv::threshold makes OCR work well on many of my images, but not all. For example, I have an image with black-bordered white text on top of someone's hands. The skin tones below the text turn white and confuse tesseract. The result I get from that image is below. "4‘ L ‘
filly; //,9"

Code so far

I still think my idea would work well if I could find out how to get the color of the text.
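One way to guess the text shade, sketched here under strong assumptions: if a region containing mostly subtitle text were available (step 2, still unsolved), and it were known whether the text is light or dark, the most frequent gray level within that half of the range could serve as an estimate. The function name dominantShade and its lo/hi bounds are hypothetical, and the code works on a raw 8-bit grayscale buffer rather than a cv::Mat.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the most frequent gray level within [lo, hi].
// Ties resolve to the lowest level; if no pixel falls in the range, lo is returned.
std::uint8_t dominantShade(const std::vector<std::uint8_t>& gray,
                           std::uint8_t lo, std::uint8_t hi)
{
    std::array<std::size_t, 256> hist{};  // zero-initialized 256-bin histogram
    for (std::uint8_t p : gray) {
        ++hist[p];
    }
    int best = lo;
    for (int level = lo; level <= hi; ++level) {
        if (hist[level] > hist[best]) {
            best = level;  // strictly more frequent than the current best
        }
    }
    return static_cast<std::uint8_t>(best);
}
```

For light text, dominantShade(region, 128, 255) would give a candidate shade. Whether that shade really belongs to the text and not to a bright background depends entirely on how well the region was chosen.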

Mr. Gecko ( 2013-05-31 12:02:37 -0600 )