Near-horizontal line segment detection

Hi,

I started my search on Stack Overflow. As you can read in the linked question, I have several PDFs, which I have now converted to JPGs of decent quality with the following Ghostscript command:

        "C:\Program Files\gs\gs9.15\bin\gswin64c.exe" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=jpeg -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r150 -sOutputFile=' ."$without_extension.jpg $_"

The command is run from a Perl script, for anyone wondering.

So I now have a set of JPGs that look like this:

C:\fakepath\Opwekking0001.jpg

C:\fakepath\Opwekking0002.jpg

C:\fakepath\Opwekking0003.jpg

The most important parts of these pictures are the thick black vertical and horizontal lines and the thick number that gives the number of the song.

The goal is to get the coordinates of the sections separated by the horizontal lines. All pages are laid out in two columns.
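To make the goal concrete, here is a minimal sketch of the last step, assuming the row positions of the horizontal lines in one column are already known. The names `Section` and `splitColumn` and the column bounds are hypothetical and chosen only for illustration; line thickness is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Axis-aligned section of a page, in pixel coordinates (hypothetical type).
struct Section {
    int x, y, width, height;
};

// Split one column, spanning [xLeft, xRight) horizontally and
// [yTop, yBottom) vertically, at the given separator rows.
std::vector<Section> splitColumn(int xLeft, int xRight, int yTop, int yBottom,
                                 std::vector<int> lineYs) {
    assert(xLeft < xRight && yTop < yBottom);
    std::sort(lineYs.begin(), lineYs.end());

    std::vector<Section> sections;
    int top = yTop;
    for (int y : lineYs) {
        // Skip duplicates and rows that fall outside the column.
        if (y <= top || y >= yBottom) continue;
        sections.push_back({xLeft, top, xRight - xLeft, y - top});
        top = y;
    }
    // Whatever is left below the last separator is the final section.
    sections.push_back({xLeft, top, xRight - xLeft, yBottom - top});
    return sections;
}
```

For a two-column page this would be called once per column, with the vertical line giving the boundary between the two.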

From my research, HoughP (OpenCV's probabilistic Hough transform) looks like the right tool. Please correct me if I'm wrong. As a test I wrote C++ code with OpenCV to recognize the horizontal and vertical lines, since those are what I have to split on. I have drawn an example on Stack Overflow.
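As an illustration of the "near horizontal" part, here is a minimal sketch of sorting detected segments by angle. It is not the pastebin code: `Segment`, `classify`, `keepOrientation` and the tolerance value are hypothetical, and it only assumes that each segment is available as two endpoints (x1, y1, x2, y2).

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

// One detected segment, endpoints in pixel coordinates.
struct Segment {
    int x1, y1, x2, y2;
};

enum class Orientation { Horizontal, Vertical, Other };

// Classify a segment by its angle to the x-axis.
// toleranceDeg is an assumed tuning parameter (degrees).
Orientation classify(const Segment& s, double toleranceDeg) {
    assert(toleranceDeg >= 0.0 && toleranceDeg < 45.0);
    const double kPi = std::acos(-1.0);
    double dx = std::abs(s.x2 - s.x1);
    double dy = std::abs(s.y2 - s.y1);
    // atan2 on absolute deltas yields an angle in [0, 90] degrees.
    // A zero-length segment gives 0 and is treated as horizontal.
    double angleDeg = std::atan2(dy, dx) * 180.0 / kPi;
    if (angleDeg <= toleranceDeg) return Orientation::Horizontal;
    if (angleDeg >= 90.0 - toleranceDeg) return Orientation::Vertical;
    return Orientation::Other;
}

// Keep only the segments with the wanted orientation.
std::vector<Segment> keepOrientation(const std::vector<Segment>& segs,
                                     Orientation wanted, double toleranceDeg) {
    std::vector<Segment> kept;
    for (const Segment& s : segs) {
        if (classify(s, toleranceDeg) == wanted) kept.push_back(s);
    }
    return kept;
}
```

A small tolerance (a degree or two) should be enough for scanned pages that are only slightly skewed, but that value would need tuning on the real images.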

As an extra, I would like to detect the thick black number that gives the number of the song. But first things first: the detection of the lines is not going as expected.

Result:

C:\fakepath\LineDetection1.jpg

C:\fakepath\LineDetection2.jpg

C:\fakepath\LineDetection3.jpg

The results are in the same order as the originals. As you can see, not all horizontal lines are detected, and some do not extend all the way to the end or are broken up into pieces. Example 2, on the other hand, is exactly what I want.
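One possible way to deal with the broken-up pieces would be to merge segments that lie on roughly the same row and are separated by only a small gap. The sketch below is an assumption about how that could look, not something taken from the pastebin code; `HSegment`, `mergeBroken`, `maxRowDiff` and `maxGap` are hypothetical names and parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Near-horizontal segment reduced to a row and an x-range (xStart <= xEnd).
struct HSegment {
    int xStart, xEnd, y;
};

// Merge pieces whose rows differ by at most maxRowDiff pixels and whose
// horizontal gap is at most maxGap pixels. Both thresholds are assumptions.
std::vector<HSegment> mergeBroken(std::vector<HSegment> segs,
                                  int maxRowDiff, int maxGap) {
    assert(maxRowDiff >= 0 && maxGap >= 0);
    // Process pieces from left to right so each one can only extend
    // a merged line to the right.
    std::stable_sort(segs.begin(), segs.end(),
                     [](const HSegment& a, const HSegment& b) {
                         return a.xStart < b.xStart;
                     });

    std::vector<HSegment> merged;
    for (const HSegment& s : segs) {
        bool attached = false;
        for (HSegment& m : merged) {
            bool sameRow = std::abs(m.y - s.y) <= maxRowDiff;
            bool closeEnough = s.xStart <= m.xEnd + maxGap;
            if (sameRow && closeEnough) {
                // Extend the existing line; it keeps the row of its first piece.
                m.xEnd = std::max(m.xEnd, s.xEnd);
                attached = true;
                break;
            }
        }
        if (!attached) merged.push_back(s);
    }
    return merged;
}
```

This only repairs lines that were detected in pieces; lines that are missed entirely would still need different detection parameters.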

I have pasted the code on Pastebin, since this post is getting quite long:

http://pastebin.com/vxyLW6i3

I hope this is clear enough; I'll provide more information if you need it. Maybe I'm even approaching the problem of finding the sections from the wrong angle, so if you have a better idea, please do share.