Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

Tesseract accuracy

I try use Tesseract from Opencv, but the recognition accuracy is terrible compared to the direct use Tesseract.

Test code:

cv::Ptr<text::OCRTesseract> ocr = text::OCRTesseract::create(ssTessdata.c_str(), ssLang.c_str());
ocr->run(CurImg, ssRecText);
tesseract::TessBaseAPI *ocr1 = new tesseract::TessBaseAPI();

if (ocr1->Init(ssTessdata.c_str(), ssLang.c_str()) == -1) {
    ssRecText = "Err init tesseract!";
    return ssRecText;//catch err here
}

ocr1->SetImage(CurImg.data, CurImg.cols, CurImg.rows, CurImg.channels(), static_cast<int>(CurImg.step));
ssRecText.append("\n\nTesseract direct:\n");
ssRecText.append(string(ocr1->GetUTF8Text()));

Sample output:

ContentsPropyleneGlycolVegetabie
GlycerolWaterItalian FlavoursNotsuitable
forpregnantorbreastfeedingwoman
NottobesoldtominorsKeepinadark
coolplaceKeepawayfrom children

Tesseract direct:
Contents: Propylene Glycol, Vegetabie
Glycerol, Water, Italian Flavours. Not suitable
for pregnant or breastfeeding woman.
Not to be sold to minors. Keep in a dark
cool place. Keep away from children.

Sample image: image description

Tesseract, Leptonica and Opencv was builded from source. Has anyone encountered such problems? Any ideas hove to fix it?

Tesseract accuracy

I try use Tesseract from Opencv, but the recognition accuracy is terrible compared to the direct use Tesseract.

Test code:

cv::Ptr<text::OCRTesseract> ocr = text::OCRTesseract::create(ssTessdata.c_str(), ssLang.c_str());
ocr->run(CurImg, ssRecText);
 tesseract::TessBaseAPI *ocr1 = new tesseract::TessBaseAPI();
 if (ocr1->Init(ssTessdata.c_str(), ssLang.c_str()) == -1) {
    ssRecText = "Err init tesseract!";
    return ssRecText;//catch err here
}

{//catch err}
ocr1->SetImage(CurImg.data, CurImg.cols, CurImg.rows, CurImg.channels(), static_cast<int>(CurImg.step));
ssRecText.append("\n\nTesseract direct:\n");
ssRecText.append(string(ocr1->GetUTF8Text()));

Sample output:

ContentsPropyleneGlycolVegetabie
GlycerolWaterItalian FlavoursNotsuitable
forpregnantorbreastfeedingwoman
NottobesoldtominorsKeepinadark
coolplaceKeepawayfrom children

Tesseract direct:
Contents: Propylene Glycol, Vegetabie
Glycerol, Water, Italian Flavours. Not suitable
for pregnant or breastfeeding woman.
Not to be sold to minors. Keep in a dark
cool place. Keep away from children.

Sample image: image description

Tesseract, Leptonica and Opencv was builded build from source. Has anyone encountered such problems? Any ideas hove to fix it?

Tesseract accuracy

I try use Tesseract from Opencv, but the recognition accuracy is terrible compared to the direct use Tesseract.

Test code:

cv::Ptr<text::OCRTesseract> ocr = text::OCRTesseract::create(ssTessdata.c_str(), ssLang.c_str());
ocr->run(CurImg, ssRecText);

tesseract::TessBaseAPI *ocr1 = new tesseract::TessBaseAPI();
if (ocr1->Init(ssTessdata.c_str(), ssLang.c_str()) == -1) {//catch err}
ocr1->SetImage(CurImg.data, CurImg.cols, CurImg.rows, CurImg.channels(), static_cast<int>(CurImg.step));
ssRecText.append("\n\nTesseract direct:\n");
ssRecText.append(string(ocr1->GetUTF8Text()));

Sample output:

ContentsPropyleneGlycolVegetabie
GlycerolWaterItalian FlavoursNotsuitable
forpregnantorbreastfeedingwoman
NottobesoldtominorsKeepinadark
coolplaceKeepawayfrom children

Tesseract direct:
Contents: Propylene Glycol, Vegetabie
Glycerol, Water, Italian Flavours. Not suitable
for pregnant or breastfeeding woman.
Not to be sold to minors. Keep in a dark
cool place. Keep away from children.

Sample image: image description

Tesseract, Leptonica and Opencv was build from source. source.

Has anyone encountered such problems? problem? Any ideas hove to fix it?

Tesseract accuracy

I try use Tesseract from Opencv, but the recognition accuracy is terrible compared to the direct use Tesseract.

Test code:

cv::Ptr<text::OCRTesseract> ocr = text::OCRTesseract::create(ssTessdata.c_str(), ssLang.c_str());
ocr->run(CurImg, ssRecText);

tesseract::TessBaseAPI *ocr1 = new tesseract::TessBaseAPI();
if (ocr1->Init(ssTessdata.c_str(), ssLang.c_str()) == -1) {//catch err}
ocr1->SetImage(CurImg.data, CurImg.cols, CurImg.rows, CurImg.channels(), static_cast<int>(CurImg.step));
ssRecText.append("\n\nTesseract direct:\n");
ssRecText.append(string(ocr1->GetUTF8Text()));

Sample output:

ContentsPropyleneGlycolVegetabie
GlycerolWaterItalian FlavoursNotsuitable
forpregnantorbreastfeedingwoman
NottobesoldtominorsKeepinadark
coolplaceKeepawayfrom children

Tesseract direct:
Contents: Propylene Glycol, Vegetabie
Glycerol, Water, Italian Flavours. Not suitable
for pregnant or breastfeeding woman.
Not to be sold to minors. Keep in a dark
cool place. Keep away from children.

Sample image: image description

Tesseract, Leptonica and Opencv was build from source.

Has anyone encountered such problem? Any ideas hove how to fix it?