I think connected components is the better way to filter in this case. It's quite fast, so speed isn't a problem. The reason it's better is that you can more easily filter for properly sized objects: you simply remove any component whose height or width falls outside a reasonable range. That way you can remove long vertical lines, long horizontal lines, and small specks but keep the numbers.
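Here is a rough sketch of that size filter. It assumes the image is already thresholded to a binary mask and that you're using OpenCV's Python bindings. The names component_boxes and filter_by_size are made up for illustration, and the pixel limits are placeholders you'd have to tune to how big your digits actually are.

```python
def component_boxes(binary):
    """Return (x, y, w, h) for each connected component of a binary image."""
    import cv2  # imported here so the size filter below runs without OpenCV

    count, _labels, stats, _centroids = cv2.connectedComponentsWithStats(
        binary, connectivity=8
    )
    # Label 0 is the background, so start at 1.
    return [
        (
            int(stats[i, cv2.CC_STAT_LEFT]),
            int(stats[i, cv2.CC_STAT_TOP]),
            int(stats[i, cv2.CC_STAT_WIDTH]),
            int(stats[i, cv2.CC_STAT_HEIGHT]),
        )
        for i in range(1, count)
    ]


def filter_by_size(boxes, min_w=5, max_w=60, min_h=10, max_h=80):
    """Keep boxes whose width and height both fall inside the given limits.

    The default limits are placeholder values, not recommendations.
    """
    return [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if min_w <= w <= max_w and min_h <= h <= max_h
    ]


# Hand-written boxes standing in for the output of component_boxes().
boxes = [
    (10, 10, 20, 30),   # digit-sized blob
    (0, 50, 300, 2),    # long horizontal line
    (100, 0, 2, 200),   # long vertical line
    (40, 40, 2, 2),     # small speck
]
kept = filter_by_size(boxes)
```

With real data you'd call filter_by_size(component_boxes(binary)) instead of using the hand-written list.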
Then you can filter some more by region of interest. Use findContours to find the boxes, meaning the pieces of paper with the numbers on them. There's a square detection tutorial somewhere around here. Keep only the connected components that lie within those boxes.
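The region-of-interest step could look something like the sketch below. It is only one way to do it: I'm assuming the paper shows up as a four-cornered contour in a binary image, and that "within the box" means the component's bounding rectangle sits fully inside the paper's bounding rectangle. The helper names, the 0.02 approximation tolerance, and min_area are all illustrative choices, not anything from a tutorial.

```python
def find_paper_boxes(binary, min_area=1000.0):
    """Bounding rectangles of four-cornered contours (the paper candidates)."""
    import cv2  # imported here so the pure-Python filter below runs without OpenCV

    # [-2] picks the contour list in both the OpenCV 3.x and 4.x return layouts.
    contours = cv2.findContours(
        binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]
    regions = []
    for contour in contours:
        # Simplify the outline; a sheet of paper should reduce to 4 corners.
        tolerance = 0.02 * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, tolerance, True)
        if len(approx) == 4 and cv2.contourArea(approx) >= min_area:
            regions.append(cv2.boundingRect(approx))
    return regions


def is_inside(box, region):
    """True if box (x, y, w, h) lies completely inside region (x, y, w, h)."""
    x, y, w, h = box
    rx, ry, rw, rh = region
    return rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh


def filter_by_regions(boxes, regions):
    """Keep only the boxes that fall inside at least one region."""
    return [box for box in boxes if any(is_inside(box, r) for r in regions)]


# Hand-written values standing in for find_paper_boxes() and the size filter.
regions = [(100, 100, 200, 150)]
components = [
    (120, 130, 20, 30),  # inside the paper
    (10, 10, 20, 30),    # far outside
    (290, 130, 20, 30),  # straddles the right edge
]
inside = filter_by_regions(components, regions)
```

If the paper is rotated a lot, the axis-aligned bounding rectangle will be looser than the paper itself, so you may want a stricter test in that case.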
Between these two filters, you shouldn't have much noise left. I would simply compare whatever remains to your numbers and ignore anything that isn't a close match.
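How you do the comparison is up to you. As one possible sketch, assuming each remaining component and each reference digit has been resized to the same small binary grid, you could score pixel agreement and throw away anything below a cutoff. The functions agreement and best_match, the tiny 3x3 grids, and the 0.9 cutoff are all made up for illustration; on real images something like OpenCV's matchTemplate might serve the same purpose.

```python
def agreement(candidate, template):
    """Fraction of pixels that are equal in two same-sized binary grids."""
    total = 0
    same = 0
    for cand_row, tmpl_row in zip(candidate, template):
        for a, b in zip(cand_row, tmpl_row):
            total += 1
            same += int(a == b)
    return same / total if total else 0.0


def best_match(candidate, templates, min_score=0.9):
    """Return (label, score) of the closest template, or None if none is close.

    min_score is a placeholder cutoff, not a recommended value.
    """
    scored = [(agreement(candidate, grid), label) for label, grid in templates.items()]
    if not scored:
        return None
    score, label = max(scored)
    return (label, score) if score >= min_score else None


# Toy 3x3 "digits"; real templates would be larger crops of your numbers.
ONE = [[0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]]
SEVEN = [[1, 1, 1],
         [0, 0, 1],
         [0, 0, 1]]
NOISE = [[1, 0, 1],
         [0, 1, 0],
         [1, 0, 1]]
templates = {"1": ONE, "7": SEVEN}

match_for_digit = best_match(ONE, templates)    # close match is kept
match_for_noise = best_match(NOISE, templates)  # no close match, so ignored
```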
equalizeHist may be useful, but if that's a typical image you should be okay without it. It can't hurt to try it and see if it helps, though.