Problem with recognizing the characters with tesseract

asked 2019-02-08 01:02:05 -0500

Saikrishna Maddina

Hi friends,

I'm working on decoding randomly generated captchas, using the OpenCV image processing libraries together with Tesseract for character recognition. These are the steps I follow:

1. Apply a Gaussian blur with a (3,3) kernel to remove noise.
2. Apply plain thresholding with a value chosen so that only the captcha's characters remain. At this stage all the external components, such as noise and the extra lines drawn over the characters, are removed. However, at this particular threshold value the characters themselves get broken into separate pieces.
3. Apply morphological dilation to rejoin the separated characters.
4. Pass the dilated image to the Tesseract API.

image description(/upfiles/15496092788299433.png)

The results are good for some images, but I get many false recognitions from Tesseract. Can you help me overcome this problem?

Note: I'm using a good resolution of (250,100). Please find attached the failing images.
