Detect roman numerals in image
Hi there, I'm new to OpenCV and Python/C++. I've already worked through the tutorials on this site and now want to do some exercises of my own with OpenCV 2.4 and Python 2.7. I would like to recognize a black Roman numeral (e.g. I, II, III, IV, V, ...) on a white background in an image and return the value that corresponds to it.
I've already tried some OpenCV techniques:
- Harris corner detection (counting the corners of the black "object" and returning the value in an if-else construct), e.g. the Roman numeral I has 12 corners (counting the serifs).
- Hough lines (counting the long horizontal and vertical lines in the image and returning the value in an if-else construct).
- Template matching (comparing the "scene" image with a template image of a Roman numeral).
These techniques work in simple cases, but most of my images contain noise and the matching fails. I think I should preprocess the image first to separate the black numeral and its white background from the rest of the image.
So my questions are:
- Do you think these techniques, or one of them in particular, would be good for detecting the Roman numeral?
- Are there any image processing tutorials on extracting a black object on a white background?
Here are some examples:
Thanks for your help.
For the image processing, you can try thresholding (maybe adaptive), followed by opening and closing to eliminate noise to some extent. If reading the numerals is your main goal, you can also check out the Tesseract library.
Thanks for your response. I've already worked through some opening/closing tutorials, as well as cv2.bilateralFilter() and cv2.equalizeHist() for noise reduction.
Now I'm getting a nearly noiseless image. But the main problem is that I have to separate the black-and-white part (the Roman numeral and its white background) from the scene image. Would you crop the image, or do you think this step is not necessary in order to detect the Roman numeral?
You're right, reading the numerals is the main goal, but I don't think Tesseract OCR is robust enough, considering that we are talking about Roman numerals with serifs. I did some tutorials with Tesseract and it didn't work well (only a small success rate). Don't you think cv2.matchTemplate would be more appropriate? Or, as I mentioned, counting corners or lines?
I'm working on a similar task: detecting some symbols on a white background. I'm using templates because it seemed easier to implement. However, I'm facing a number of problems, and maybe some of them apply to your application as well:
- The same symbol can appear in a number of shapes. Sometimes the lines are thicker, or the symbol itself is wider in one dimension. If your dataset contains different fonts etc., this can be an issue for you too. I set up more than one pattern for each symbol, which helped a little.
- Some symbols are similar to each other (the only difference being one dot etc.), which leads to false positives.
- Some symbols are too basic, like an empty triangle or circle. These shapes can appear in the image by chance. In your case, "I" may be a problem.
Cropping reduces the size of the image you are working on, which is nice, but I doubt it will help with the detection rate. If your region of interest has a white background that is clearly separated from the rest of the image, then you won't get any detections elsewhere anyway. If your scene contains light colours for some reason, then it will be hard to determine where to crop.
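If cropping turns out to be useful later, note that in the Python bindings it is usually done with plain NumPy slicing on the image array. Here is a rough sketch, assuming a grayscale image in which the numeral is the only dark region. The helper name `crop_to_dark_region` and the `thresh` and `margin` defaults are made up for this example.

```python
import numpy as np


def crop_to_dark_region(gray, thresh=128, margin=5):
    """Crop gray to the bounding box of all pixels darker than thresh.

    Returns None if no pixel is darker than thresh. Any other dark object
    in the scene enlarges the box, which is the weakness mentioned above.
    """
    ys, xs = np.where(gray < thresh)
    if len(xs) == 0:
        return None
    h, w = gray.shape[:2]
    # Expand the box by margin pixels, clamped to the image borders.
    x0 = max(int(xs.min()) - margin, 0)
    y0 = max(int(ys.min()) - margin, 0)
    x1 = min(int(xs.max()) + margin + 1, w)
    y1 = min(int(ys.max()) + margin + 1, h)
    # Cropping is just array slicing: rows first, then columns.
    return gray[y0:y1, x0:x1]
```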
I must say I'm a beginner myself, so these are only ideas. I could be wrong.
I guess working with corners and lines may solve some of the problems I mentioned above. But it may have its own issues.
Wow, thanks for your help! So you're saying it's not necessary to crop the image? Then I'll consider doing some image processing first and then template matching.
Which template matching method do you use in your project? cv2.matchTemplate()? And do you then train your algorithm? Did you find any helpful tutorials for your project?
Add some sample images to get a better answer.
I would start without cropping. I guess it can be added later if needed.
I'm using matchTemplate with TM_CCOEFF_NORMED. I experimented with the matching methods, and this one gave better results in my case. But I guess it is more computationally intensive.
I couldn't find any relevant tutorials for my case yet. But I didn't look thoroughly either.