# How to sort contours left to right, while going top to bottom

I'm finding the contours for an image with digits and characters, for OCR. I need the contours to be sorted left to right within each line, and line by line from top to bottom. Right now, the contours aren't sorted that way. For example, the contours for the above image are returned in what looks like random order.

What I need is the order D, o, y, o, u, k, n, o, w, s, o, m, e, o, n, e, r, . (dot), i (without dot), c, h, and so on. I've tried a couple of methods that first look at the y-coordinate and then build a key from it and the x-coordinate. Right now, I have the following sorting code. It works for the first 2 lines, but in the 3rd line the sorting breaks down. The main problem seems to be with characters such as i, j, ?, . (dot), , (comma), etc., where the y-coordinate of the dot differs from that of the other characters even though they belong to the same line. What might be a good solution for this?

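    import cv2
    import numpy as np

    # contours, maxArea, areaRatio and raw_image come from earlier steps (not shown)
    rect = []
    for ctr in contours:
        if cv2.contourArea(ctr) > maxArea * areaRatio:
            rect.append(cv2.boundingRect(cv2.approxPolyDP(ctr, 1, True)))

    # each rect is (x, y, w, h); find the tallest bounding box
    max_line_height = 0
    for x, y, w, h in rect:
        if h > max_line_height:
            max_line_height = h

    mlh = max_line_height * 2
    mlw = raw_image.shape[1]  # width of the input image
    rect = np.asarray(rect)
    s = rect.astype(np.uint32)  # prevent overflows
    # the line index (y // mlh) dominates the key; x orders boxes within a line
    order = mlw * (s[:, 1] // mlh) + s[:, 0]
    sort_order = np.argsort(order)
    rect = rect[sort_order]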


First, you need to use boundingRect() to get a Rect for each contour, which gives you its coordinates (it looks like you already do this).

Then you can use sorted() with a key function that builds a key value from (x, y), something like key = y * 3000 + x, where the multiplier (3000 here) is larger than the image width.

This should do the job.
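A minimal sketch of that key function, assuming the boxes are plain (x, y, w, h) tuples as returned by boundingRect(). The sample values and the names `ROW_WEIGHT` and `reading_order_key` are made up for illustration.

```python
# Bounding boxes as (x, y, w, h); the values are illustrative only.
rects = [(120, 50, 10, 20), (10, 50, 12, 20), (60, 10, 11, 20), (5, 10, 9, 20)]

ROW_WEIGHT = 3000  # assumed to be larger than the image width in pixels


def reading_order_key(rect):
    """Weight y heavily so rows come first; x orders boxes within a row."""
    x, y, w, h = rect
    return y * ROW_WEIGHT + x


sorted_rects = sorted(rects, key=reading_order_key)
print(sorted_rects)
```

Note that this key uses the raw y value, so it only gives left-to-right order for boxes whose tops are at exactly the same height.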


Could you please provide a code snippet for what you've just described? Also, will that work for sorting letter contours as well? Taller and shorter characters have different y values.

The problem you are describing is called "page layout detection and character segmentation". The generic steps are as follows:

1. Detect page zones such as text headers, text paragraphs, graphics and pictures, tables, and so on.
2. For each text zone (header, table cell, paragraph), do the following:
3. Split the zone into lines.
4. Split each line into words.
5. Split each word into characters.

In your case you only have one paragraph. You can split the paragraph into lines by using a horizontal histogram and cutting at its local minima, or you can use the contours, putting regions that overlap vertically by more than some height threshold into the same line.
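The horizontal-histogram option can be sketched roughly as follows. This is a simplified version that assumes a binarized image with non-zero text pixels and cuts only at rows with no ink at all, rather than at general local minima. The function name `split_lines_by_projection` and the toy image are made up for illustration.

```python
import numpy as np


def split_lines_by_projection(binary):
    """Return (top, bottom) row ranges of text lines; bottom is exclusive.

    `binary` is a 2-D array in which text pixels are non-zero.
    """
    profile = (binary > 0).sum(axis=1)  # number of ink pixels in each row
    has_ink = profile > 0
    lines = []
    start = None
    for row, ink in enumerate(has_ink):
        if ink and start is None:
            start = row                  # a text line begins
        elif not ink and start is not None:
            lines.append((start, row))   # an empty row closes the line
            start = None
    if start is not None:                # text touches the bottom edge
        lines.append((start, len(has_ink)))
    return lines


# Toy page with two blocks of "ink" separated by empty rows.
page = np.zeros((12, 20), dtype=np.uint8)
page[1:4, 2:18] = 255
page[6:10, 3:15] = 255
line_ranges = split_lines_by_projection(page)
print(line_ranges)
```

On real scans the gaps between lines are rarely perfectly empty, so the zero test would likely need to become a small threshold or a search for local minima.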

• Sort the lines from top to bottom.
• For every line, sort the regions from left to right.
• Within a line, if two regions overlap horizontally, merge them into one bigger region (this solves the i, j problem).
• Then you can split the line into characters by taking every region as one character or ligature (rr, ff, vv).

Finally, if you need a ready-made solution, Tesseract can do all of the previous tasks plus the recognition.
