Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

How to sort contours left to right, while going top to bottom

I'm finding the contours for an image with digits and characters, for OCR. So, I need the contours to be sorted left to right, while going line to line, i.e. top to bottom. Right now, the contours aren't sorted that way.

image description

For example, the contours for the above image is sorted as 1,4,0,9,5,8,3,2,6,4,1,9,8,7,5,3,2,6

What I need is the sorting as 1,2,3,4,5,6,7,8,9,2,4,3,6,1,8,5,9,0

How to sort contours left to right, while going top to bottom

I'm finding the contours for an image with digits and characters, for OCR. So, I need the contours to be sorted left to right, while going line to line, i.e. top to bottom. Right now, the contours aren't sorted that way.

image descriptionimage description

For example, the contours for the above image is sorted as 1,4,0,9,5,8,3,2,6,4,1,9,8,7,5,3,2,6randomly.

What I need is the sorting as 1,2,3,4,5,6,7,8,9,2,4,3,6,1,8,5,9,0D,o,y,o,u,k,n,o,w,s,o,m,e,o,n,e,r,.(dot),i(without dot),c,h...and so on. I've tried couple of methods where we first observe the y-coordinate and then use some keys and the x-coordinate. Like right now, I have the following sorting code. It works for the first 2 lines. Then in the 3rd line, the sorting somehow doesn't happen. The main problem seem to be in the letters such as i, j, ?, (dot), (comma), etc where the y axis of the (dot) varies, despite belonging to the same line. So what might be a good solution for this?

How to sort contours left to right, while going top to bottom

I'm finding the contours for an image with digits and characters, for OCR. So, I need the contours to be sorted left to right, while going line to line, i.e. top to bottom. Right now, the contours aren't sorted that way.

image description

For example, the contours for the above image is sorted randomly.

What I need is the sorting as D,o,y,o,u,k,n,o,w,s,o,m,e,o,n,e,r,.(dot),i(without dot),c,h...and so on. I've tried couple of methods where we first observe the y-coordinate and then use some keys and the x-coordinate. Like right now, I have the following sorting code. It works for the first 2 lines. Then in the 3rd line, the sorting somehow doesn't happen. The main problem seem to be in the letters such as i, j, ?, (dot), (comma), etc where the y axis of the (dot) varies, despite belonging to the same line. So what might be a good solution for this?

for ctr in contours:    
    if cv2.contourArea(ctr) > maxArea * areaRatio: 
        rect.append(cv2.boundingRect(cv2.approxPolyDP(ctr,1,True)))

for i in rect:
    x = i[0]
    y = i[1]
    w = i[2]
    h = i[3]

    if(h>max_line_height):
        max_line_height = h

mlh = max_line_height*2
max_line_width = raw_image.shape[1] #width of the input image
mlw = max_line_width
rect = np.asarray(rect)
s = rect.astype( np.uint32 ) #prevent overflows
order= mlw*(s[:,1]/mlh)+s[:,0]
sort_order= np.argsort( order )
rect = rect[ sort_order ]

How to sort contours left to right, while going top to bottom

I'm finding the contours for an image with digits and characters, for OCR. So, I need the contours to be sorted left to right, while going line to line, i.e. top to bottom. Right now, the contours aren't sorted that way.

image description

For example, the contours for the above image is sorted randomly.

What I need is the sorting as D,o,y,o,u,k,n,o,w,s,o,m,e,o,n,e,r,.(dot),i(without dot),c,h...and so on. I've tried couple of methods where we first observe the y-coordinate and then use some keys and the x-coordinate. Like right now, I have the following sorting code. It works for the first 2 lines. Then in the 3rd line, the sorting somehow doesn't happen. The main problem seem to be in the letters such as i, j, ?, (dot), (comma), etc where the y axis of the (dot) varies, despite belonging to the same line. So what might be a good solution for this?

for ctr in contours:    
    if cv2.contourArea(ctr) > maxArea * areaRatio: 
        rect.append(cv2.boundingRect(cv2.approxPolyDP(ctr,1,True)))

for i in rect:
    x = i[0]
    y = i[1]
    w = i[2]
    h = i[3]

    if(h>max_line_height):
        max_line_height = h

mlh = max_line_height*2
max_line_width = raw_image.shape[1] #width of the input image
mlw = max_line_width
rect = np.asarray(rect)
s = rect.astype( np.uint32 ) #prevent overflows
order= mlw*(s[:,1]/mlh)+s[:,0]
sort_order= np.argsort( order )
rect = rect[ sort_order ]