Is there a way in OpenCV to complete broken shapes?

asked 2016-03-20 06:39:08 -0600

nadia

updated 2016-03-20 07:25:12 -0600

I need to clean up a picture with OpenCV so that I can then run OCR on it with Tesseract. I have tried many methods, but the picture still has incomplete letters, so OCR will not be able to identify them. I tried cvtColor to gray --> fastNlMeansDenoising --> adaptiveThreshold --> Canny --> and finally finding and drawing the contours.

[Image attached to the question]

Comments

The sequence adaptiveThreshold --> Canny makes no sense. Both already produce a binary image, so adaptiveThreshold --> findContours should do the same job.

Have you tried the other suggestions I posted here?

matman ( 2016-03-20 08:42:08 -0600 )

I don't understand how I should use cv::niBlackThreshold() in my case, where the picture contains broken characters. I read the article, but it proposes several different methods: a) Niblack, b) Sauvola, c) Wolf, d) Feng, e) NICK. I am really confused.

nadia ( 2016-03-20 09:51:05 -0600 )

The article should give you an idea of what value range you can use for k and what the results could look like. On GitHub I have posted a code snippet in which I implemented the other methods (Sauvola, Wolf and NICK). I am not sure the implementation is correct, but the results looked good. You have to replace the switch cases (CV_THRESH_SAUVOLA,...) with numbers or define them yourself.
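For orientation, the role of k can be seen in the two best-known formulas of that family. The following is a minimal pure-Python sketch, not the GitHub snippet mentioned above and not OpenCV's implementation. It assumes the usual textbook definitions: Niblack uses `m + k*s` and Sauvola uses `m * (1 + k*(s/R - 1))`, where `m` and `s` are the mean and standard deviation of the window around a pixel. The function names, the window size `half` and the default `R = 128` are assumptions made for illustration.

```python
import math


def local_stats(img, y, x, half):
    """Mean and population std-dev of the window around (y, x), clipped at the image borders."""
    rows = range(max(0, y - half), min(len(img), y + half + 1))
    cols = range(max(0, x - half), min(len(img[0]), x + half + 1))
    window = [img[r][c] for r in rows for c in cols]
    mean = sum(window) / len(window)
    var = sum((v - mean) ** 2 for v in window) / len(window)
    return mean, math.sqrt(var)


def niblack_threshold(m, s, k):
    # Niblack: shift the local mean by k standard deviations.
    return m + k * s


def sauvola_threshold(m, s, k, R=128.0):
    # Sauvola: scale the local mean; R is the assumed dynamic range of the std-dev.
    return m * (1.0 + k * (s / R - 1.0))


def binarize(img, half=1, k=-0.2, method="niblack"):
    """Return 1 where a pixel is brighter than its local threshold, else 0.

    img is a list of rows of grey values (hypothetical input format).
    """
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            m, s = local_stats(img, y, x, half)
            if method == "niblack":
                t = niblack_threshold(m, s, k)
            else:
                t = sauvola_threshold(m, s, k)
            row.append(1 if img[y][x] > t else 0)
        out.append(row)
    return out
```

A negative k for Niblack and a positive k for Sauvola both pull the threshold below the local mean, which suits dark text on a light background. The values used here are illustrative, not tuned.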

I think the issue with the broken edges is not that trivial and you can't solve this by just doing filtering and binarisation.

My approach would be to focus only on the complete characters in the first step. Then you can extract the character size, row position, etc. With this information you can extrapolate the regions which contain the broken characters and then deal with them individually.
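The first step of that idea could be sketched roughly as follows. The sketch assumes the bounding boxes of the connected components are already available as `(x, y, w, h)` tuples, for example from contour bounding rectangles. The names `split_complete_and_fragments` and `merge_boxes` and the `min_ratio` cutoff of 0.6 are hypothetical, chosen only for illustration.

```python
from statistics import median


def split_complete_and_fragments(boxes, min_ratio=0.6):
    """Split boxes into likely complete characters and likely fragments.

    A box counts as complete when its height is at least
    min_ratio times the median height of all boxes.
    Returns (complete, fragments, typical_height).
    """
    if not boxes:
        return [], [], 0
    typical_h = median(h for (_, _, _, h) in boxes)
    complete = [b for b in boxes if b[3] >= min_ratio * typical_h]
    fragments = [b for b in boxes if b[3] < min_ratio * typical_h]
    return complete, fragments, typical_h


def merge_boxes(a, b):
    """Smallest (x, y, w, h) box that covers both a and b."""
    x0 = min(a[0], b[0])
    y0 = min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)
```

For instance, two short boxes stacked in the same column, such as `(24, 0, 10, 8)` and `(24, 12, 10, 8)`, would be flagged as fragments next to characters about 20 pixels tall, and merging them gives one candidate region of character size. The median only works as a reference while most components are intact characters.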

matman ( 2016-03-20 12:04:22 -0600 )

1 answer

answered 2016-03-20 08:18:00 -0600

Tetragramm

Hmm. Two suggestions to start with. First, draw the contours with a thickness of -1 so that the whole area is filled. That should help with the spots inside the letters.
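To see the effect without depending on OpenCV, here is a small pure-Python sketch of what filling the outer contours achieves: every background pixel that is not connected to the image border becomes foreground. Note that this swaps in a border flood fill for the contour drawing itself, so it illustrates the result and not the OpenCV call. `fill_holes` is a hypothetical helper name, and the input is assumed to be a list of rows of 0 (background) and 1 (foreground).

```python
from collections import deque


def fill_holes(img):
    """Set every background region that is not 4-connected to the border to 1."""
    h, w = len(img), len(img[0])
    reached = [[False] * w for _ in range(h)]
    queue = deque()

    # Seed the flood fill with all background pixels on the image border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and img[y][x] == 0:
                reached[y][x] = True
                queue.append((y, x))

    # Spread through 4-connected background pixels.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 0 and not reached[ny][nx]:
                reached[ny][nx] = True
                queue.append((ny, nx))

    # Anything the fill did not reach is either foreground or an enclosed hole.
    return [[0 if reached[y][x] else 1 for x in range(w)] for y in range(h)]
```

One caveat: this also fills the genuine holes of letters such as "o" or "e", just as filling only the outer contours would, so it may need to be limited to small holes before OCR.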

Second, most of your disconnects happen in the vertical direction. Try making a tall, thin structuring element and running a morphological close over the image with it. You can help keep it from messing things up by segmenting the rows and processing each one individually. THIS post can help you with that.
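A minimal sketch of such a close, written in plain Python so the mechanics are visible: a dilation followed by an erosion, both with a structuring element that is one pixel wide and `height` pixels tall. In OpenCV the same thing would normally be done with `getStructuringElement` and `morphologyEx` with `MORPH_CLOSE`. The function names and the list-of-rows 0/1 image format are assumptions made for this illustration.

```python
def dilate_vertical(img, height):
    """A pixel becomes 1 if any pixel in its vertical neighbourhood is 1."""
    half = height // 2
    h, w = len(img), len(img[0])
    return [
        [
            1 if any(img[r][x] for r in range(max(0, y - half), min(h, y + half + 1))) else 0
            for x in range(w)
        ]
        for y in range(h)
    ]


def erode_vertical(img, height):
    """A pixel stays 1 only if every pixel in its vertical neighbourhood is 1.

    Pixels outside the image are ignored rather than treated as background.
    """
    half = height // 2
    h, w = len(img), len(img[0])
    return [
        [
            1 if all(img[r][x] for r in range(max(0, y - half), min(h, y + half + 1))) else 0
            for x in range(w)
        ]
        for y in range(h)
    ]


def close_vertical(img, height=3):
    """Morphological close (dilate, then erode) with a 1 x height element."""
    return erode_vertical(dilate_vertical(img, height), height)
```

With `height=3`, a one-pixel vertical gap in a stroke is bridged, while horizontal gaps and vertical gaps of three or more pixels are left alone. The element height therefore has to be chosen relative to the largest break you want to repair, which is one reason to process each text row separately.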
