dilated_img = cv2.dilate(img, np.ones((7, 7), np.uint8))
bg_img = cv2.medianBlur(dilated_img, 21)

The two size parameters here, the (7, 7) dilation kernel and the median blur aperture of 21, determine the upper limit of feature size (text stroke width) that your thresholding algorithm can handle. When the actual stroke width exceeds that limit (as with the large bold text in the company logo), the algorithm no longer sees it as text, but as a large area filled with a dark color.

The solution to this problem depends on whether you truly want to handle thresholding of objects regardless of feature size (i.e. whether you want to choose an upper limit on text stroke width, or want to avoid choosing it).

If you still want to choose an upper limit, just increase both sizes in your code: the dilation kernel size and the median blur aperture (which must stay odd).

If you don't want to choose an upper limit, and are willing to switch to a more complicated algorithm to avoid this problem, then you will need to read research papers. Focus on document thresholding, binarization, boundary detection, blob detection, and interpolation.

In general, the more complicated approaches consist of solving for a "thresholding surface". In fact, your current approach already uses the same idea:

diff_img = 255 - cv2.absdiff(img, bg_img)

Here, bg_img is your thresholding surface, which you compare against the input image (by taking the absolute difference).
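To make the "thresholding surface" idea concrete, here is a minimal sketch in which each pixel is compared with the surface value at the same location. This is a different formulation from the `absdiff` line above, which produces a normalized grayscale image rather than a binary one. The function name `binarize_with_surface` and the `offset` margin are assumptions made for illustration.

```python
import numpy as np


def binarize_with_surface(img, surface, offset=10):
    """Mark a pixel as text (0) when it is darker than the surface
    by more than `offset`; everything else becomes background (255).

    Hypothetical helper: `surface` can be any per-pixel threshold
    estimate with the same shape as `img`, such as bg_img.
    """
    # Use a signed type so the subtraction cannot wrap around in uint8.
    diff = surface.astype(np.int16) - img.astype(np.int16)
    return np.where(diff > offset, 0, 255).astype(np.uint8)
```

Because the surface varies across the image, a pixel value that counts as text in a brightly lit region can still count as background in a shaded one.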

To solve for the thresholding surface regardless of feature size, it is necessary to focus on mathematical definitions of object boundaries, and interpolation of that thresholding surface so that it covers the whole image.