
I lose some text while making scanner-like effect

asked 2018-09-26 03:23:17 -0600

iliachigo

updated 2018-09-26 03:56:34 -0600

Hey, my goal is to create a scanner-like effect. I have an image taken with a phone, and the result should look like it was scanned. My problem is that my code doesn't work on pictures or big bold text. Look at the text/logo in the upper left corner of the original picture: only its borders are left in the final result. I want it to be filled. How can I achieve that?

This is my original photo

dilated_img = cv2.dilate(img, np.ones((7, 7), np.uint8))

After this line I get this result: Dilated

bg_img = cv2.medianBlur(dilated_img, 21)

I apply a blur and get a result like this: Blurred

diff_img = 255 - cv2.absdiff(img, bg_img)

Then I take the difference and get this result: Differences

norm_img = diff_img.copy() # Needed for 3.x compatibility
cv2.normalize(diff_img, norm_img, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8UC1)
_, thr_img = cv2.threshold(norm_img, 230, 0, cv2.THRESH_TRUNC)
cv2.normalize(thr_img, thr_img, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8UC1)

After normalization I get this final result: Final Result

Comparing the original image with the result, you can clearly see that only the edges of the text/logo in the upper left corner are left. How can I tweak my code so that the logo remains filled?


Comments

Please put your images here, not on an external bin where they'll expire. Thank you!

(also, having to look at "raw" imgur links, with all that 4chan shit is very annoying, don't do that ever !)

berak (2018-09-26 03:34:21 -0600)

I tried... But I couldn't upload pics... I don't know why. It shows no error message :(

iliachigo (2018-09-26 03:37:25 -0600)

Done. It looks like the resolution was too big. I had to resize them.

iliachigo (2018-09-26 03:55:15 -0600)

1 answer


answered 2018-09-27 14:24:22 -0600

rwong
dilated_img = cv2.dilate(img, np.ones((7, 7), np.uint8))  # kernel size: (7, 7)
bg_img = cv2.medianBlur(dilated_img, 21)                  # aperture size: 21

The two size parameters, the (7, 7) dilation kernel and the median blur aperture of 21, determine the upper limit of feature size (text stroke width) that your thresholding algorithm can handle. When the actual stroke width exceeds that limit (as with the large bold text in the company logo), your algorithm doesn't see it as text, but rather as a large area filled with a dark color.

The solution depends on whether you truly want thresholding to work regardless of feature size, i.e. whether you are happy to choose an upper limit on text stroke width or want to avoid choosing one.

If you are happy to choose an upper limit, just increase the size numbers in your code until they exceed the widest stroke you need to keep.

If you don't want to choose an upper limit, and are instead willing to switch to a more complicated algorithm to avoid this problem, then you will need to read research papers. Focus on document thresholding, binarization, object boundaries, blob detection, and interpolation.

In general, the more complicated approach consists of solving for a "thresholding surface". In fact, your current approach already uses the same idea:

diff_img = 255 - cv2.absdiff(img, bg_img)

Here, bg_img is your thresholding surface, which you compare with (subtract from) the input image.

To solve for the thresholding surface regardless of feature size, you need a mathematical definition of object boundaries, and then you need to interpolate the surface from the background pixels so that it covers the whole image, including the areas under the objects.
