Build blocks and isolate characters

asked Feb 28 '17

macpirkimas

updated Feb 28 '17

I have been searching for an answer to this question for a while but cannot find anything useful.

I am trying to read the machine readable zone (MRZ) of a document with a camera. I need to extract the characters one by one from the MRZ and feed them to OCR. I tried thresholding the image, finding contours and extracting the characters one by one, but on a live camera feed findContours misses some characters and the results are not what I expected.

Since the machine readable zone has a known size and layout, is there a proper method to build a block for each character and extract them?
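To illustrate what "building blocks" could mean here, below is a minimal sketch that divides an aligned MRZ crop into an evenly spaced grid of cells. It assumes the crop tightly bounds the MRZ text, that the text is already deskewed, and that the document uses the passport (TD3) layout of 2 lines of 44 monospaced characters. The function name `mrz_cells` and its parameters are hypothetical, not part of OpenCV.

```python
def mrz_cells(width, height, n_rows=2, n_cols=44):
    """Return (x, y, w, h) boxes for an evenly spaced n_rows x n_cols grid.

    width, height: size in pixels of the aligned MRZ crop (assumed tight).
    n_rows, n_cols: assumed MRZ layout; 2 x 44 is the TD3 passport format.
    """
    cells = []
    for r in range(n_rows):
        # Row boundaries, rounded so that neighbouring cells share edges
        y0 = round(r * height / n_rows)
        y1 = round((r + 1) * height / n_rows)
        for c in range(n_cols):
            x0 = round(c * width / n_cols)
            x1 = round((c + 1) * width / n_cols)
            cells.append((x0, y0, x1 - x0, y1 - y0))
    return cells


# Example: an 880 x 80 pixel crop gives 88 cells of 20 x 40 pixels
cells = mrz_cells(880, 80)
```

Each cell could then be cut out of the crop with ordinary array slicing, e.g. `roi_gray[y:y+h, x:x+w]`, and passed to OCR. The grid is only as good as the crop: if the MRZ region is not tightly bounded or is slightly rotated, the cells will drift away from the characters.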

ADDED CODE

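import cv2

rect = []
# roi_gray is the MRZ region, grayscale and already aligned horizontally
blur = cv2.medianBlur(roi_gray, 3)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
# [-2:] keeps (contours, hierarchy) whether findContours returns 2 or 3 values (depends on the OpenCV version)
contours, hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# keep only the 90 largest contours
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:90]
minH = 20
minW = 20
for ctr in contours:
    # ignore contours with an area of 1000 or more
    if cv2.contourArea(ctr) < 1000:
        _, (w, h), _ = cv2.minAreaRect(ctr)
        # keep contours that are tall enough or wide enough to be a character
        if h >= minH or w >= minW:
            rect.append(cv2.boundingRect(cv2.approxPolyDP(ctr, 3, True)))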

rect collects the bounding boxes of the contours that were found. The problem is that after thresholding, a character such as N sometimes splits into two contours, or findContours does not find it at all, so the letter is missing from the final output.
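One way to make both failure cases visible, sketched here under the assumption that the crop tightly bounds a 2 x 44 MRZ, is to assign every bounding box to a grid cell by its center. Fragments that land in the same cell are merged into one box, and cells that receive nothing are reported as missing characters. The function `group_boxes_by_cell` is a hypothetical helper, not an OpenCV call; the boxes use the same (x, y, w, h) format that cv2.boundingRect returns.

```python
def group_boxes_by_cell(boxes, width, height, n_rows=2, n_cols=44):
    """Merge (x, y, w, h) boxes that fall into the same grid cell.

    Returns (merged, missing): merged maps (row, col) to the union box of
    all fragments whose center lies in that cell; missing lists the
    (row, col) cells that received no box at all.
    """
    cell_w = width / n_cols
    cell_h = height / n_rows
    merged = {}
    for x, y, w, h in boxes:
        # Locate the cell containing the box center, clamped to the grid
        col = min(int((x + w / 2) / cell_w), n_cols - 1)
        row = min(int((y + h / 2) / cell_h), n_rows - 1)
        key = (row, col)
        if key in merged:
            mx, my, mw, mh = merged[key]
            x0, y0 = min(mx, x), min(my, y)
            x1 = max(mx + mw, x + w)
            y1 = max(my + mh, y + h)
            merged[key] = (x0, y0, x1 - x0, y1 - y0)
        else:
            merged[key] = (x, y, w, h)
    missing = [(r, c) for r in range(n_rows) for c in range(n_cols)
               if (r, c) not in merged]
    return merged, missing


# Example: two fragments of one character plus one intact character
boxes = [(2, 5, 6, 30), (10, 5, 7, 30), (22, 6, 15, 28)]
merged, missing = group_boxes_by_cell(boxes, 880, 80)
```

In this example the first two boxes both fall into cell (0, 0) and are merged, while the third stays on its own in cell (0, 1). The cells listed in `missing` could then be cropped directly from the grid instead of relying on a contour.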

VIDEO

I have found a video in which the author seems to build blocks for each character, but unfortunately the author does not provide any additional information about the method or the code. Video link
