roqvist's profile - activity

2016-02-12 23:50:06 -0600 asked a question Using HoughLinesP to straighten skewed receipt scans

I am trying to use HoughLinesP from Python to detect the lines of text in a batch of images so that I can eventually straighten them. The images are scanned receipts at a fairly high DPI, so I scale them down and pre-process them with noise removal and thresholding so they look like this:

image description

I was hoping the majority of the lines found by HoughLinesP would align with the text flow direction (the text's horizontal axis). I have played around with the different parameters of HoughLinesP but cannot find a way to do this reliably. For some reason most of the lines found run along the text's vertical axis instead, which seems odd to me since the lines are definitely longer and more precise along the text's horizontal axis. Here is an example (detected lines drawn as thin grey lines) with the following input values:

minLineLength = 45 (one sixth of image width)
maxLineGap = 5
pixelRes = 1
rotationRes = pi/180
threshold = 200

image description

Typically the receipts are off by roughly ±10 to 45 degrees, so the text flow is almost always closer to horizontal than vertical. I'm not sure what I'm missing here. Is there any way to tweak HoughLinesP to better identify the general flow of the text in this type of image?
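One way the "closer to horizontal than vertical" assumption could be used is to discard any detected segment steeper than 45 degrees and take the median angle of the rest. The sketch below only illustrates that idea and does not address why the vertical segments dominate in the first place; `estimate_skew_degrees` is a hypothetical helper, and it expects segments as (x1, y1, x2, y2) tuples.

```python
import math
import statistics


def estimate_skew_degrees(segments, max_abs_angle=45.0):
    """Median angle of the near-horizontal segments, in degrees.

    Angles are measured in image coordinates (y grows downward), so a
    positive value means the text slopes down toward the right.
    Returns None when no segment lies within +/- max_abs_angle.
    """
    angles = []
    for x1, y1, x2, y2 in segments:
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        # Fold into (-90, 90] so a segment and its reverse agree.
        if angle > 90.0:
            angle -= 180.0
        elif angle <= -90.0:
            angle += 180.0
        if abs(angle) <= max_abs_angle:
            angles.append(angle)
    return statistics.median(angles) if angles else None
```

The median is used rather than the mean so that a few stray segments do not pull the estimate far from the dominant direction.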