Here are a few ideas:

First convert from color to grayscale. It looks like you have fairly good contrast already. There are various methods to perform this conversion; choose the simplest at first: gray = (red + green + blue)/3. Quite often you don't need anything better than that. For some applications, using just the green color plane is sufficient.
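A minimal sketch of both conversions, assuming OpenCV's Python bindings and a made-up file name:

    import cv2
    import numpy as np

    img = cv2.imread("plate.jpg")  # hypothetical file name; OpenCV loads it as BGR

    # Plain average of the three channels.
    gray_avg = (img.astype(np.uint16).sum(axis=2) // 3).astype(np.uint8)

    # Or simply take the green plane (index 1 in BGR order).
    gray_green = img[:, :, 1]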

If you want the algorithm to operate on black text but some images have white or light text on a dark background, reverse the polarity of the grayscale image by swapping grayscale 255 for 0, 254 for 1, and so on, until all images have black text on a light-colored background. You'll need an algorithm to detect when this is necessary, OR you can run an algorithm on both the original grayscale image and the negative (reversed brightness) image and choose the better result. Keep in mind that the "close" morphological operation has a different meaning for light-on-dark text versus dark-on-light text.
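A rough sketch of the inversion, again assuming OpenCV in Python; the mean-brightness test is only a placeholder heuristic, not a recommendation from the text above:

    import cv2
    import numpy as np

    gray = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name

    # Inversion: 255 becomes 0, 254 becomes 1, and so on.
    negative = 255 - gray  # equivalent to cv2.bitwise_not(gray)

    # Placeholder heuristic for deciding when to invert: if the image is
    # mostly dark, the text is probably light, so flip the polarity.
    if np.mean(gray) < 128:
        gray = negative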

Investigate some adaptive thresholding methods. Using a global Otsu threshold can be problematic--if possible, you should first have some sort of method to detect regions that are likely to be license plates, and then threshold and binarize those regions only. Thresholding an entire image will likely cause trouble.
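A sketch of the two options, assuming OpenCV in Python; the region coordinates and the adaptive-threshold parameters are placeholders to tune for your images:

    import cv2

    gray = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name

    # Global Otsu over the whole frame -- often too coarse.
    _, otsu_full = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Better: Otsu on a candidate plate region only (x, y, w, h would come
    # from whatever plate detector you use; these values are placeholders).
    x, y, w, h = 100, 80, 200, 60
    roi = gray[y:y + h, x:x + w]
    _, otsu_roi = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Or an adaptive threshold whose level varies across the image.
    adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, 31, 10)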

Look into the Stroke Width Transform and improvements upon it. The SWT or equivalent can help either with segmentation, or with identifying likely license plate regions, or possibly even with providing edges useful for identification.
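The full SWT is too long to sketch here, but a crude stand-in uses the distance transform: the peak distance inside a stroke is about half its width, so components with implausible stroke widths can be discarded. Assuming OpenCV in Python, with the width limits as placeholders:

    import cv2
    import numpy as np

    gray = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Distance from each foreground pixel to the nearest background pixel.
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 3)

    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    keep = np.zeros_like(binary)
    for i in range(1, n):
        half_width = dist[labels == i].max()  # roughly half the stroke width
        if 1.0 < half_width < 10.0:           # plausible character strokes; tune per image
            keep[labels == i] = 255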

Since in your region there are only a few color combinations, you might be able to create a template for every possible color combination and then try to find a match at each of several image scales. The license plate might be 400 x 250 pixels, or 100 x 65 pixels, or whatever. For multiscale operations, consider holding an image pyramid in memory. In some cases you can perform an initial match in the smallest image of the pyramid and work your way down. With some modification you could create an image pyramid that has a scaling factor of less than 2 from level to level.
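A sketch of multiscale template matching with a pyramid factor of 1.25 instead of the usual 2, assuming OpenCV in Python and made-up file names:

    import cv2

    scene = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)             # hypothetical file names
    template = cv2.imread("plate_template.png", cv2.IMREAD_GRAYSCALE)

    best = None
    img, scale = scene, 1.0
    while img.shape[0] >= template.shape[0] and img.shape[1] >= template.shape[1]:
        result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        if best is None or max_val > best[0]:
            # max_loc is in the downscaled image; multiply by scale to map back.
            best = (max_val, max_loc, scale)
        scale *= 1.25
        img = cv2.resize(scene, None, fx=1.0 / scale, fy=1.0 / scale,
                         interpolation=cv2.INTER_AREA)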

Pull the sample images into Photoshop or GIMP and play around with the filters. You may be surprised to find a filter that is helpful.

Whenever possible, work on raw, uncompressed images rather than on JPEGs or other compressed images.

You may not have much of a choice in hardware, but given the fuzzy look of your "LEA 1956" sample, it'd be nice to have more resolution (more pixels per image) and/or sharper optics and/or a raw image instead of a JPEG.

Try to generate the best possible images that you can with pre-processing before your SVM goes to work. Thresholding is often the weak link here. Thresholding and binarization are a common approach, but when you see how noisy a binary image can be, you should think very hard about whether that is the kind of image you actually want to process (no matter how many people may claim that this is the standard method and hence the best).
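One way to make that judgment concrete, as a placeholder sanity check rather than anything prescribed above, is to count tiny connected components in the binary image; if most of them are specks, the grayscale patches may be a better input for the SVM:

    import cv2
    import numpy as np

    gray = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    areas = stats[1:, cv2.CC_STAT_AREA]
    tiny = np.sum(areas < 10)                  # "speck" size is a placeholder to tune
    if len(areas) > 0 and tiny / len(areas) > 0.5:
        print("binary image looks noisy; consider grayscale features instead")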

Even if you're already starting with a framework and/or algorithms, I recommend doing a little background reading to gain an understanding of both human reading and the many approaches to OCR:

Character Recognition Systems by Cheriet and others is a good reference (though a few years old now) for the various techniques used in Optical Character Recognition.

Reading in the Brain by Stanislas Dehaene gives a good high-level overview of human reading, and covers the range of scripts. It's a relatively fast read, and has references to academic works that are useful.
