Can OpenCV be used to extract character strings from images?

asked 2014-02-11 17:49:47 -0600

sherrellbc

updated 2014-02-12 02:15:16 -0600

berak

For a design project, I have an idea for sorting IC chips by type. The information about each device is printed on top of the package as a string of characters:

[image]

The idea is that we could take a picture of this device, process the image, and extract the string "74AC139PC". The basis of the project would be taking a bin of random DIP chips and sorting through them to find the ones whose marking matches a string the user enters into the program.
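To make the matching step concrete, here is a minimal sketch of comparing recognized text against the user's string. It assumes OCR will sometimes misread a character (for example `1` as `I`), so it tolerates a small edit distance instead of requiring an exact match. The helper names `normalize`, `edit_distance`, and `matches_part`, and the default tolerance of one error, are hypothetical choices for illustration, not part of OpenCV or any OCR library.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Keep only letters and digits, upper-cased, so that "74ac139 pc\n"
// and "74AC139PC" compare equal. (Hypothetical helper.)
std::string normalize(const std::string& s) {
    std::string out;
    for (unsigned char c : s)
        if (std::isalnum(c)) out += static_cast<char>(std::toupper(c));
    return out;
}

// Levenshtein edit distance, computed with a single rolling row.
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> row(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) row[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diag = row[0];            // previous row, column j-1
        row[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t up = row[j];    // previous row, column j
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = std::min({up + 1, row[j - 1] + 1, diag + cost});
            diag = up;
        }
    }
    return row[b.size()];
}

// True if the OCR text is within max_errors edits of the wanted marking.
// The tolerance is an assumption; tune it for your parts.
bool matches_part(const std::string& ocr_text, const std::string& wanted,
                  std::size_t max_errors = 1) {
    return edit_distance(normalize(ocr_text), normalize(wanted)) <= max_errors;
}
```

With these definitions, `matches_part("74ACI39PC", "74AC139PC")` accepts the single misread character, while an unrelated marking such as `"SN7400N"` is rejected. A tolerance that is too loose would confuse similar part numbers, so it probably needs to stay small.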

How difficult would it be to extract such information from an image? The task is simplified because most chips have clean white or gold text printed on a black background. Further, the text is usually formatted just like the image above, with no fancy fonts.
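One way that contrast could be exploited is global thresholding. The sketch below picks a threshold with Otsu's method on an 8-bit grayscale buffer and inverts the result, so the light markings become black text on white. It assumes the OCR engine prefers dark text on a light background, which is worth verifying for whichever engine is used. The function names `otsu_threshold` and `binarize_inverted` are hypothetical; on a `cv::Mat`, OpenCV's `cv::threshold` with `THRESH_BINARY_INV | THRESH_OTSU` should give an equivalent result.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Otsu's method: choose the gray level that maximizes the between-class
// variance of the "dark" (<= t) and "light" (> t) pixel groups.
int otsu_threshold(const std::vector<std::uint8_t>& gray) {
    if (gray.empty()) return 0;
    std::vector<double> hist(256, 0.0);
    for (std::uint8_t v : gray) hist[v] += 1.0;

    const double total = static_cast<double>(gray.size());
    double sum_all = 0.0;
    for (int t = 0; t < 256; ++t) sum_all += t * hist[t];

    double w_dark = 0.0, sum_dark = 0.0, best_var = -1.0;
    int best_t = 0;
    for (int t = 0; t < 256; ++t) {
        w_dark += hist[t];                  // number of pixels <= t
        sum_dark += t * hist[t];
        if (w_dark == 0.0) continue;        // no dark pixels yet
        const double w_light = total - w_dark;
        if (w_light == 0.0) break;          // every pixel is already "dark"
        const double diff = sum_dark / w_dark - (sum_all - sum_dark) / w_light;
        const double var = w_dark * w_light * diff * diff;
        if (var > best_var) { best_var = var; best_t = t; }
    }
    return best_t;
}

// Light markings on a dark package become black (0) on white (255).
std::vector<std::uint8_t> binarize_inverted(const std::vector<std::uint8_t>& gray) {
    const int t = otsu_threshold(gray);
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
        out[i] = (gray[i] > t) ? 0 : 255;
    return out;
}
```

A single global threshold only works if the lighting is reasonably even across the chip; glare on the package would likely call for something adaptive.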

Any suggestions on where to start?


2 answers

2

answered 2014-02-12 03:16:54 -0600

dervish

updated 2014-02-12 03:17:36 -0600

The easiest path is to integrate the Tesseract library into your OpenCV project and use its API to recognize the characters. It is also possible to retrain it on your own character set if your lettering is unusual.

To install Tesseract, see this tutorial.

Here is an example from one of my projects:

#include <tesseract/baseapi.h>
#include <cstdlib>
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

using namespace cv;
using namespace std;

// Run Tesseract on an 8-bit image and print the recognized text.
void read_image(const Mat& plate_image)
{
    // You may need to crop to the area of interest, where the text is found.

    // Initialize the Tesseract API.
    // "eng" names the trained language data to use; if you trained your own
    // language and called it "XYZ", pass "XYZ" here instead.
    tesseract::TessBaseAPI tess_api;
    if (tess_api.Init(NULL, "eng"))
    {
        cout << "Could not initialize tesseract.\n";
        exit(1);
    }
    // Restrict recognition to the characters expected in the text.
    tess_api.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.-");
    // SetImage takes the pixel buffer, width, height, bytes per pixel and bytes per line.
    tess_api.SetImage(plate_image.data, plate_image.cols, plate_image.rows,
                      plate_image.channels(), (int)plate_image.step);
    tess_api.Recognize(0);
    char* out = tess_api.GetUTF8Text();
    int confidence = tess_api.MeanTextConf();

    cout << "OCR output:" << out << "  with confidence " << confidence << endl;

    delete[] out;   // the caller owns the buffer returned by GetUTF8Text()
    tess_api.End();
}

int main(int argc, char** argv)
{
    cout << "OCR: starts" << endl;
    Mat scene_plate = imread("plate.jpg", CV_LOAD_IMAGE_COLOR);
    if (scene_plate.empty())
    {
        cout << "Could not read plate.jpg\n";
        return 1;
    }
    read_image(scene_plate);
    return 0;
}
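The text returned by `GetUTF8Text()` can span several lines and usually carries trailing newlines, so some cleanup is likely needed before comparing it with a part number. A minimal sketch of that step follows; the helper name `ocr_lines` is hypothetical, and it assumes the wanted string sits on a line of its own.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split OCR output into lines, trimming surrounding whitespace and
// skipping lines that are blank.
std::vector<std::string> ocr_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    const char* ws = " \t\r\n";
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(ws);
        if (first == std::string::npos) continue;   // nothing but whitespace
        const std::size_t last = line.find_last_not_of(ws);
        lines.push_back(line.substr(first, last - first + 1));
    }
    return lines;
}
```

Calling `ocr_lines(out)` before `out` is freed would give one candidate string per printed line, each of which can then be compared against the text you are looking for.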

Comments

Bogus code.

LogicStuff ( 2017-02-28 02:06:18 -0600 )
1

answered 2014-02-12 00:28:31 -0600

Irene

Hello, I suggest you look at this topic on character recognition. It seems to contain a few links to tutorials and code samples.

