What are some good resources for an Arabic OCR-in-the-wild dataset?
Hello there, I've recently started working on an OCR-in-the-wild algorithm using neural networks. My requirements are as follows: Arabic text, natural images (not scans, etc.).
My goal is to detect whether an image contains text and, if it does, to extract that text.
I need a large dataset for this. If one already exists, that would be great; otherwise, I would appreciate help thinking of reasonable methods to create such a dataset on my own.
Thank you very much, A Dylan
On Ubuntu, if you type
sudo apt-get install tesseract-ocr
and then press Tab (before pressing Enter), you can see the range of language models available for the Tesseract OCR system, including tesseract-ocr-ara for Arabic.
Hey, I tried it, but I got an error when running the following command:
tesseract photo.jpeg out -l ara
(I installed the language package.) The error is:
I guess you will need to raise this as an issue on the Tesseract GitHub to get better support!
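Before filing an issue, it may be worth confirming that Tesseract can actually see the Arabic model. The following is only a rough sketch in Python, not something from the thread: it assumes the tesseract binary is on your PATH and relies on the `tesseract --list-langs` option. The helper names (parse_langs, installed_langs, ocr_command, run_arabic_ocr) are made up for illustration, and the exact wording of the listing header may differ between Tesseract versions.

```python
import shutil
import subprocess


def parse_langs(listing: str) -> list[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def installed_langs() -> list[str]:
    """Ask the local tesseract binary which language models it can find."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    proc = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True
    )
    # Assumption: depending on the version, the listing may be written to
    # stdout or to stderr, so both streams are parsed.
    return parse_langs(proc.stdout + "\n" + proc.stderr)


def ocr_command(image: str, out_base: str, lang: str = "ara") -> list[str]:
    """Build the same command used above: tesseract <image> <out_base> -l <lang>."""
    return ["tesseract", image, out_base, "-l", lang]


def run_arabic_ocr(image: str, out_base: str) -> None:
    """Run OCR only if the 'ara' model is visible; otherwise print a hint."""
    if "ara" in installed_langs():
        subprocess.run(ocr_command(image, out_base), check=True)
    else:
        print("Arabic model not found; try: sudo apt-get install tesseract-ocr-ara")
```

Calling run_arabic_ocr("photo.jpeg", "out") should then either write out.txt or tell you that the language model is missing. If "ara" is listed and the command still fails, the full error output is what the Tesseract maintainers will want to see.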