Tesseract Version: v5.0.0-alpha.20200328 (pre-built binaries)
Platform: Windows 64bit
Hi, I am using this command:
tesseract my_image my_output_path --oem 0 --dpi 96 --psm 6 digits
to perform OCR on an image that contains vertical digits. It works with --oem 1, but with --oem 0 the following error appears:
Error: Tesseract (legacy) engine requested, but components are not present in C:\Program Files\Tesseract-OCR/tessdata/eng.traineddata!!
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
I have C:\Program Files\Tesseract-OCR in PATH and C:\Program Files\Tesseract-OCR/tessdata/ in TESSDATA_PREFIX.
The command:
tesseract --list-langs
does list English (eng):
ara-amiri-3000
brah
digits
digits1
digits_comma
digits_layer
digitsall_layer
dotslayer
eng
engmorse
engrestrict_best
engrestrict_best_int
fas-minus-float
fas-plus-float
fas-script-float
fas-tune-float
frk
mya430000
osd
san-chandas-float
san-siddhanta-float
and I have the file eng.traineddata in tessdata.
I expected this to work. Maybe I can't use it this way?
I just need to know whether I can use this command, and how. Thank you so much!
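One way to check whether eng.traineddata actually contains the legacy components that --oem 0 asks for is to list what is packed inside the file. The sketch below is only an illustration in Python, not part of the original report. It assumes that the combine_tessdata tool shipped with Tesseract is on PATH, that `combine_tessdata -d` prints one `index:name:size=...` line per component, and that the `inttemp` component marks a legacy-capable file. All function names are hypothetical.

```python
import subprocess

# Assumption: a traineddata file usable with --oem 0 contains an "inttemp" component.
LEGACY_MARKER = "inttemp"


def component_names(listing):
    """Collect component names from lines shaped like '3:inttemp:size=..., offset=...'."""
    names = set()
    for line in listing.splitlines():
        parts = line.strip().split(":")
        # A component line starts with a numeric index followed by the name.
        if len(parts) >= 3 and parts[0].isdigit():
            names.add(parts[1])
    return names


def has_legacy_components(listing):
    """Return True if the listing mentions the assumed legacy marker component."""
    return LEGACY_MARKER in component_names(listing)


def list_components(traineddata_path):
    """Run `combine_tessdata -d <file>` and return everything it printed."""
    result = subprocess.run(
        ["combine_tessdata", "-d", traineddata_path],
        capture_output=True,
        text=True,
        check=False,
    )
    # The tool may write to either stream, so both are combined.
    return result.stdout + result.stderr
```

Calling `has_legacy_components(list_components(path))` on the installed eng.traineddata would then show whether only LSTM components (names such as `lstm`) are present.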
Please respect the guidelines for posting issues: use the Tesseract user forum for questions and support.
Make sure you read the docs first and search the forum, the issue tracker, and the internet a little bit.
First of all, please respect me, because I asked in a correct way. Yes, I found the solution by digging through other issues, but it was not obvious, as other users are asking the same thing. Maybe I should search the internet a little bit, but you should be clearer in the docs, because maybe it is not our problem. ;)
@anavc94 What's the solution?
I am facing the same issue.
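The thread never records the fix the original poster found. A likely cause, stated here as an assumption rather than a confirmed answer, is that the installed eng.traineddata contains only LSTM models (as the files in the tessdata_fast and tessdata_best repositories do), while --oem 0 needs a file that also carries legacy components, such as those in the main tessdata repository. A minimal Python sketch of rebuilding the original command against such a file follows. The directory C:\tessdata_legacy and the function name are hypothetical; `--tessdata-dir` is the standard Tesseract option for pointing at a tessdata directory.

```python
def build_legacy_command(image, output_base, tessdata_dir):
    """Build the original command, pointed at a legacy-capable tessdata directory."""
    return [
        "tesseract", image, output_base,
        "--tessdata-dir", tessdata_dir,  # hypothetical folder holding a legacy-capable eng.traineddata
        "--oem", "0",                    # legacy engine only
        "--dpi", "96",
        "--psm", "6",
        "digits",                        # config file name, as in the original command
    ]


cmd = build_legacy_command("my_image", "my_output_path", r"C:\tessdata_legacy")
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces, such as C:\Program Files\Tesseract-OCR.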
While @zdenop might be correct that the forum is the right place to check, it's not unusual for people to find a post like this on Google. So simply saying 'go check the forum' doesn't gain any moral high ground and isn't helping anyone. At least post the fix or a link to it; otherwise more people will keep replying here or creating new issues.
@wazcov, you are right. But the problem is simply that there are only a handful of people (all volunteers!) who try to fix issues in the code in their spare time. And there are thousands of people who work with the software, some of whom have problems. If everyone who has a problem asks here for help, the few volunteers will no longer be able to contribute code fixes or new features, and the project is dead. So 'go check the forum' is simply an appeal to try hard to find a solution yourself, or from a larger community, instead of blocking the whole project.
Therefore:
Please use the Tesseract user forum for questions.