How to pre-process images for OCR?

asked 2020-09-21 10:31:24 -0500

I am trying to run OCR on multiple images using Tesseract, but I can't find a single preprocessing setup that works for all of them, because the images differ in contrast, brightness, and shading.

As of now, this is my preprocessing pipeline for an image:

 - FastNlMeansDenoisingColored
 - Dilation
 - MedianBlur
 - Grayscale
 - AdaptiveThresholding

But for each image, I must manually tune the parameters of each step.

Is there anything that can adjust these parameters automatically, or is there any way of applying machine learning?



"anything that can adjust these params automatically"

There's no silver bullet here, so no: there is no universal preprocessing for varying input.

berak ( 2020-09-21 11:51:45 -0500 )

See the paper "A Generalization of Otsu's Method and Minimum Error Thresholding". Python code is given at the end of the paper.
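
For context, here is a minimal sketch of classic Otsu thresholding, which picks a global threshold automatically by maximizing the between-class variance of the grey-level histogram. This is the plain Otsu method, not the generalized algorithm from the paper, and the function name `otsu_threshold` is an assumption made for illustration.

```python
def otsu_threshold(hist):
    """Return the grey level t that maximizes between-class variance.

    hist: sequence of pixel counts, one per grey level (e.g. 256 bins).
    Pixels with value <= t form the lower class, the rest the upper class.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of grey levels in the lower class
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

In OpenCV the same selection is available by passing the `cv2.THRESH_OTSU` flag to `cv2.threshold`, so a hand-written version like this is mainly useful for seeing what is being optimized.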

Der Luftmensch ( 2020-09-21 15:00:04 -0500 )

I think it is worth trying machine learning to detect the type of noise in the image. It's a good idea and could be an area of research. The problem is that you need data to train the model to detect the noise types. You can prepare the training data manually by creating labeled images, where the labels are the noise types, but manual data preparation is very time-consuming. Alternatively, you can create synthesized training data by applying noise filters to the images, and you can also use GAN neural networks to generate more training data for the model.
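
As a rough sketch of the synthesized-data idea, the snippet below applies two noise models to clean grayscale images and labels each result with the noise type. The function names, label strings, and noise levels are all assumptions chosen for illustration; a real training set would need noise that resembles the actual scans.

```python
import numpy as np


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the 8-bit range."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def add_salt_and_pepper(img, amount, rng):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0          # pepper
    noisy[mask > 1 - amount / 2] = 255    # salt
    return noisy


def make_labeled_samples(clean_images, seed=0):
    """Return (image, label) pairs where the label names the noise type."""
    rng = np.random.default_rng(seed)
    samples = []
    for img in clean_images:
        samples.append((img, "clean"))
        samples.append((add_gaussian_noise(img, 20.0, rng), "gaussian"))
        samples.append(
            (add_salt_and_pepper(img, 0.05, rng), "salt_and_pepper"))
    return samples
```

The resulting pairs could feed a classifier that predicts the noise type, which in turn could select the preprocessing parameters.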

essamzaky ( 2020-09-21 16:36:21 -0500 )