
Program to recognize music notation

asked 2015-02-17 05:05:22 -0500

Peter55555

I wrote a program to recognize music notation (tablature, to be exact). The images consist mainly of digits sitting on lines. After preprocessing (global binarization) I remove the lines. My question is about the preprocessing. The program works fine when the lines and digits have the same color, but the lines are often lighter than the digits. With a low threshold the lines disappear; with a higher threshold there is too much noise and the digits become thick.

What kind of binarization (in OpenCV) can you recommend? What to do? Is there any solution for this problem?

Here is an example.

My preprocessing looks like this:

1) Read the image as grayscale:

[image]

2) Global binarization:

cv::threshold (for example, with a threshold of 127)

[image]

The characters do not look good... :( But the main problem is that the lines have disappeared.

3) cv::threshold with a higher threshold (230)

Now I can see the lines, but the characters are thick and ugly. For example, the 'a' character sometimes loses the empty space in the middle, and so on. There is also a lot of noise. :(

[image]

And there is one more problem: I have to set the threshold value separately for every file.

Do you have any suggestions for preprocessing?

I'd like to have "nice" lines and characters...

(I am not asking for code, just for some suggestions and advice.)




Can you take a look at this? It is a new sample created by @theodore, based on this Q&A topic. Given the large number of music sheet questions lately, I am guessing that somewhere a computer vision professor gave this as homework :D

StevenPuttemans ( 2015-02-17 06:30:43 -0500 )

Actually, if I understood correctly, what you are looking for is a way to refine your input image. If so, I would suggest trying some filtering and sharpening techniques as preprocessing steps (a search will turn up plenty, including code, both on this forum and on the web in general). @StevenPuttemans most likely I ruined the plans of some professor somewhere .:evil:. :-p

theodore ( 2015-02-17 16:17:47 -0500 )

Is it possible to recognize music from a YouTube song video?

fibra ( 2019-05-20 07:46:11 -0500 )

@fibra, please do not post an answer here if you have a question or a comment, thank you.

berak ( 2019-05-20 09:37:27 -0500 )

1 answer


answered 2015-02-17 06:25:21 -0500

Guanta

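A globally optimal threshold can be computed with Otsu's method. The thresh argument is ignored when cv::THRESH_OTSU is set, and cv::threshold returns the computed threshold as a double, i.e.

double best_threshold = cv::threshold(img, out, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);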

Your image thresholded using Otsu:

[image]
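For readers who want to see what Otsu's method computes, here is a minimal standard-library sketch. It is not OpenCV's implementation; the names otsuThreshold and binarize are made up for illustration, and it assumes an 8-bit grayscale image stored as a flat std::vector. It picks the threshold that maximizes the between-class variance of the "dark" and "bright" pixel classes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV function): Otsu's method on 8-bit gray
// values. Returns the threshold t that maximizes the between-class variance
// of the classes {v <= t} and {v > t}.
int otsuThreshold(const std::vector<std::uint8_t>& gray) {
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t v : gray) ++hist[v];

    const double total = static_cast<double>(gray.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * static_cast<double>(hist[v]);

    double weightLow = 0.0, sumLow = 0.0, bestScore = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += t * static_cast<double>(hist[t]);
        if (weightLow == 0.0) continue;        // no pixels in the low class yet
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;          // every pixel is in the low class
        const double diff = sumLow / weightLow - (sumAll - sumLow) / weightHigh;
        const double score = weightLow * weightHigh * diff * diff;
        if (score > bestScore) { bestScore = score; best = t; }
    }
    return best;
}

// Pixels brighter than t become white (255), all others black (0).
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray, int t) {
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) out[i] = gray[i] > t ? 255 : 0;
    return out;
}
```

Because a single threshold is chosen for the whole image, faint staff lines and dark digits still compete for the same cut, which is the limitation described in the question.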

However, what you probably need is a local thresholding method, e.g. the binarization method of Sauvola et al. or of Bradley et al. (just search for "binarization" together with the author name and you will find enough material). Note that even then you may need to tune the parameters of these methods.
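As a rough illustration of local thresholding, the sketch below follows the idea usually attributed to Bradley et al.: compare each pixel with the mean of a window around it, computed via an integral image, and mark it black if it is more than a given percentage darker than that mean. This is a plain standard-library sketch, not an OpenCV call; the function name bradleyThreshold and its parameters are assumptions for illustration, and the window size and percentage (a window of about 1/8 of the image width and 15 percent are commonly quoted starting points) will likely need tuning per image set.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: local mean thresholding in the style of Bradley et al.
// gray is a row-major 8-bit image of size width x height. A pixel becomes
// black (0) if it is more than `percent` percent darker than the mean of the
// window x window neighbourhood around it, otherwise white (255).
std::vector<std::uint8_t> bradleyThreshold(const std::vector<std::uint8_t>& gray,
                                           int width, int height,
                                           int window, double percent) {
    // Integral image with one extra row and column of zeros.
    const std::size_t stride = static_cast<std::size_t>(width) + 1;
    std::vector<std::uint64_t> integral(stride * (static_cast<std::size_t>(height) + 1), 0);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            integral[(y + 1) * stride + (x + 1)] =
                gray[static_cast<std::size_t>(y) * width + x]
                + integral[y * stride + (x + 1)]
                + integral[(y + 1) * stride + x]
                - integral[y * stride + x];

    const int half = window / 2;
    const double scale = 1.0 - percent / 100.0;
    std::vector<std::uint8_t> out(gray.size(), 255);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Clamp the window to the image borders.
            const int x0 = std::max(x - half, 0), x1 = std::min(x + half, width - 1);
            const int y0 = std::max(y - half, 0), y1 = std::min(y + half, height - 1);
            const double count = static_cast<double>(x1 - x0 + 1) * (y1 - y0 + 1);
            // Window sum from four integral-image lookups.
            const std::uint64_t sum =
                (integral[(y1 + 1) * stride + (x1 + 1)] + integral[y0 * stride + x0])
                - (integral[y0 * stride + (x1 + 1)] + integral[(y1 + 1) * stride + x0]);
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            if (gray[i] * count < static_cast<double>(sum) * scale) out[i] = 0;
        }
    }
    return out;
}
```

Since each pixel is judged against its own neighbourhood, a light staff line only has to be darker than the paper around it, not darker than a global cut chosen for the digits.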




Another approach would be to segment the image into smaller blocks and apply Otsu's method to each block instead of to the whole image. That would most likely give somewhat better results.

theodore ( 2015-02-17 10:00:12 -0500 )
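The block-wise idea from the comment above could be sketched roughly as follows, again with the standard library only. The names blockwiseOtsu and otsuFromHistogram, the block size, and the minContrast guard are all assumptions for illustration. The guard is an addition not mentioned in the thread: it assumes dark ink on light paper and leaves nearly flat blocks white, since Otsu would otherwise split pure background into noise.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: Otsu threshold from a 256-bin histogram
// (maximizes the between-class variance of {v <= t} and {v > t}).
int otsuFromHistogram(const std::array<std::size_t, 256>& hist) {
    double total = 0.0, sumAll = 0.0;
    for (int v = 0; v < 256; ++v) {
        total += static_cast<double>(hist[v]);
        sumAll += v * static_cast<double>(hist[v]);
    }
    double weightLow = 0.0, sumLow = 0.0, bestScore = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += t * static_cast<double>(hist[t]);
        if (weightLow == 0.0) continue;
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;
        const double diff = sumLow / weightLow - (sumAll - sumLow) / weightHigh;
        const double score = weightLow * weightHigh * diff * diff;
        if (score > bestScore) { bestScore = score; best = t; }
    }
    return best;
}

// Hypothetical helper: applies Otsu independently to each block x block tile
// of a row-major 8-bit image. Tiles whose gray range is below minContrast are
// treated as plain background and left white (an assumption of this sketch).
std::vector<std::uint8_t> blockwiseOtsu(const std::vector<std::uint8_t>& gray,
                                        int width, int height,
                                        int block, int minContrast) {
    std::vector<std::uint8_t> out(gray.size(), 255);
    for (int by = 0; by < height; by += block) {
        for (int bx = 0; bx < width; bx += block) {
            const int x1 = std::min(bx + block, width);
            const int y1 = std::min(by + block, height);
            std::array<std::size_t, 256> hist{};
            int lo = 255, hi = 0;
            for (int y = by; y < y1; ++y)
                for (int x = bx; x < x1; ++x) {
                    const int v = gray[static_cast<std::size_t>(y) * width + x];
                    ++hist[v];
                    lo = std::min(lo, v);
                    hi = std::max(hi, v);
                }
            if (hi - lo < minContrast) continue;  // flat tile: keep it white
            const int t = otsuFromHistogram(hist);
            for (int y = by; y < y1; ++y)
                for (int x = bx; x < x1; ++x) {
                    const std::size_t i = static_cast<std::size_t>(y) * width + x;
                    out[i] = gray[i] > t ? 255 : 0;
                }
        }
    }
    return out;
}
```

One known drawback of this approach is visible seams at tile borders, because neighbouring tiles can end up with quite different thresholds; overlapping tiles or interpolating the thresholds are possible refinements.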

Asked: 2015-02-17 05:05:22 -0500

Seen: 2,064 times

Last updated: Feb 17 '15