Extract Text From ID CARD

asked 2015-10-15 14:08:25 -0500

asad muneer

updated 2015-10-16 15:14:07 -0500

Hi, I am very new to OpenCV, and to this forum as well. My question is: how can I determine the location of the text (ASAD MUNEER 12K-2192) in the image (see image) with OpenCV and C++? I am trying to locate it dynamically. My main goal is to read these characters in the image and later convert them to text (using Tesseract). I would like to do this in C++ with OpenCV.

Please find the attached image. The text I would like to read is "ASAD MUNEER 12K-2192"; I just want to extract this information from the ID card.

Many thanks in advance, Asad

Image Link : https://www.dropbox.com/s/evh8zg7ikqn...


If your solution only needs to handle the attached card, I would start by applying some color segmentation to extract the area that contains the information into a ROI, and then apply an OCR approach, most likely based on the Tesseract library (OpenCV exposes it through the text module in opencv_contrib, provided Tesseract is installed). Bear in mind that you will most likely need some pre-processing and post-processing steps (e.g. smoothing, thresholding, etc.) in between to obtain the best result.
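A minimal sketch of the color-segmentation step, written in plain C++ so it runs without any library. The Image, Rgb and Box types, the isGreen rule and the margin of 40 are all assumptions made for illustration, not values tuned on the card in question. With OpenCV the same idea is usually expressed with cv::cvtColor to HSV, cv::inRange for the mask, and cv::boundingRect for the ROI.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal RGB image stored row-major; a stand-in for cv::Mat in this sketch.
struct Rgb { std::uint8_t r, g, b; };

struct Image {
    int width;
    int height;
    std::vector<Rgb> pixels;  // size must equal width * height

    const Rgb& at(int x, int y) const { return pixels[y * width + x]; }
};

// Axis-aligned box with inclusive corners; empty() is true when nothing matched.
struct Box {
    int x0, y0, x1, y1;
    bool empty() const { return x1 < x0 || y1 < y0; }
};

// Hypothetical "green" rule: the green channel must exceed both red and blue
// by at least `margin`. A real card would need a tuned threshold.
bool isGreen(const Rgb& p, int margin) {
    return p.g - p.r >= margin && p.g - p.b >= margin;
}

// Returns the bounding box of all pixels classified as green.
Box greenBoundingBox(const Image& img, int margin = 40) {
    assert(static_cast<int>(img.pixels.size()) == img.width * img.height);
    Box box{img.width, img.height, -1, -1};  // starts out empty
    for (int y = 0; y < img.height; ++y) {
        for (int x = 0; x < img.width; ++x) {
            if (!isGreen(img.at(x, y), margin)) continue;
            box.x0 = std::min(box.x0, x);
            box.y0 = std::min(box.y0, y);
            box.x1 = std::max(box.x1, x);
            box.y1 = std::max(box.y1, y);
        }
    }
    return box;
}
```

The returned box would then be cropped out and passed on to thresholding and OCR. A plain RGB rule like this is sensitive to lighting, which is why a hue-based mask is the more common choice in practice.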

theodore ( 2015-10-15 16:46:31 -0500 )

I just want to know how to extract the part in green (Asad Muneer, 12K-2192). I have configured Tesseract for the OCR step, but since I am new to OpenCV I need a path to follow for processing this image :/

I followed this tutorial, but the results weren't exactly what I wanted: http://opencv-code.com/tutorials/how-...

asad muneer ( 2015-10-16 00:41:49 -0500 )

@asad muneer, I suggest you start by trying, and when an error pops up, report the error back here. We will not supply fully functional code; that is not the goal of this forum.

StevenPuttemans ( 2015-10-16 03:26:59 -0500 )

@StevenPuttemans I've managed to do this so far (see link for image). Could you tell me how I can remove these white dots? When I pass this image to Tesseract it recognizes nothing. Thanks. Processed image link: https://www.dropbox.com/s/8994zjrij6k...

asad muneer ( 2015-10-16 15:12:06 -0500 )

@asad muneer this is called salt-and-pepper noise, and you can remove it by applying median filtering. Have a look here to see how to apply this filter and others as well.
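To make the median-filtering idea concrete, here is a small self-contained sketch of a 3x3 median filter on a grayscale image stored as a row-major byte vector. The function name medianFilter3x3 and the choice to replicate edge pixels at the border are assumptions of this sketch. In OpenCV itself the equivalent call would be cv::medianBlur(src, dst, 3).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// 3x3 median filter on a row-major grayscale image.
// Border pixels are handled by clamping coordinates (edge replication).
std::vector<std::uint8_t> medianFilter3x3(const std::vector<std::uint8_t>& src,
                                          int width, int height) {
    assert(width > 0 && height > 0);
    assert(static_cast<int>(src.size()) == width * height);

    std::vector<std::uint8_t> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::array<std::uint8_t, 9> window{};
            int n = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    // Clamp the neighbour coordinates to stay inside the image.
                    const int yy = std::max(0, std::min(height - 1, y + dy));
                    const int xx = std::max(0, std::min(width - 1, x + dx));
                    window[n++] = src[yy * width + xx];
                }
            }
            // The median of 9 values is the 5th smallest (index 4).
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            dst[y * width + x] = window[4];
        }
    }
    return dst;
}
```

An isolated white dot never makes up the majority of its 3x3 neighbourhood, so it is replaced by the surrounding background, while strokes that are several pixels wide mostly survive. Larger dots would need a larger kernel, at the cost of eroding thin letter strokes.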

theodore ( 2015-10-16 17:31:24 -0500 )

I have no personal experience with Tesseract, but I can imagine that horizontal alignment is a requirement. So first apply median filtering as @theodore said, then try aligning the remaining letters.
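One possible way to approach the alignment step is to estimate how far the text line is tilted before rotating it back. The sketch below is only one option among several, and nobody in this thread proposed it: it takes a binary mask (non-zero means text pixel) and returns the direction of the principal axis of the foreground pixels. The function name estimateSkewRadians is hypothetical, and the method assumes a single text line that is clearly wider than it is tall.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Estimates the skew of the foreground in a binary mask as the orientation of
// its principal axis. The angle is in radians, measured in image coordinates
// (x to the right, y pointing down). Returns 0 if there are too few pixels.
double estimateSkewRadians(const std::vector<std::uint8_t>& mask,
                           int width, int height) {
    assert(width > 0 && height > 0);
    assert(static_cast<int>(mask.size()) == width * height);

    // First pass: centroid of the foreground pixels.
    double sumX = 0.0, sumY = 0.0;
    long count = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (mask[y * width + x] == 0) continue;
            sumX += x;
            sumY += y;
            ++count;
        }
    }
    if (count < 2) return 0.0;
    const double meanX = sumX / count;
    const double meanY = sumY / count;

    // Second pass: second-order central moments.
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (mask[y * width + x] == 0) continue;
            const double dx = x - meanX;
            const double dy = y - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
    }
    // Orientation of the axis with the largest variance.
    return 0.5 * std::atan2(2.0 * sxy, sxx - syy);
}
```

The image can then be rotated to cancel the estimated angle, for example with cv::getRotationMatrix2D followed by cv::warpAffine. The sign convention is worth checking on a test image first, since the angle here uses a downward-pointing y axis.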

StevenPuttemans ( 2015-10-20 02:47:53 -0500 )