Tesseract: Tesseract can't use traineddata

Created on 19 Mar 2020 · 5 Comments · Source: tesseract-ocr/tesseract

I trained Tesseract using tesstrain and got a traineddata file from it. I then copied that file to /usr/local/share/tessdata/

But when I tried to extract the text from the image I had used for training, I got this error:

Error: Tesseract (legacy) engine requested, but components are not present in /usr/local/share/tessdata/bar.traineddata!!
Failed loading language 'bar'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Here's the command I used:

tesseract data/bar-ground-truth/alexis_ruhe01_1852_0018_022.tif stdout -l bar

The tesseract version is:

tesseract 5.0.0-alpha-635-g90405
 leptonica-1.78.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.3.0
 Found AVX2
 Found AVX
 Found SSE
 Found OpenMP 201307

Here's the /usr/local/share/tessdata directory:

drwxr-xr-x 4 root root     4096 Mar 19 17:11 ./
drwxr-xr-x 8 root root     4096 Mar  5 16:48 ../
-rwxr-xr-x 1 root root     2364 Mar 19 17:11 bar.traineddata*
drwxr-xr-x 2 root root     4096 Mar  5 16:48 configs/
-rwxr-xr-x 1 root root 15400601 Mar  6 15:27 eng.traineddata*
-rw-r--r-- 1 root root      572 Mar  5 16:48 pdf.ttf
drwxr-xr-x 2 root root     4096 Mar  5 16:48 tessconfigs/

As you can see, bar.traineddata is in that directory, but Tesseract can't seem to use it.
What could be wrong with it?

All 5 comments

tesseract --list-langs gives

List of available languages (3):
bar
eng
foo

I've also tried the OCR engine mode options --oem 0 through 3, but none of them work.

Figured it out.

I had copied the traineddata at data/bar/bar.traineddata instead of the one generated at /data/bar.traineddata
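
The fix above can be sketched as a small shell helper. This is only an illustration under the layout described in this thread: the function name final_traineddata and its two arguments are hypothetical, and it assumes the final model sits directly in the tesstrain data directory while a same-named file sits one level deeper. The 2364-byte bar.traineddata in the directory listing suggests that the deeper file is not a complete model.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesstrain or tesseract).
# Assumed layout, taken from the comment above:
#   DATA_DIR/MODEL.traineddata        <- generated model, the one to install
#   DATA_DIR/MODEL/MODEL.traineddata  <- the file that caused the error here
# Prints the path of the top-level file, or fails if it does not exist.
final_traineddata() {
    data_dir=$1
    model=$2
    final="$data_dir/$model.traineddata"
    if [ ! -f "$final" ]; then
        echo "no traineddata found at $final" >&2
        return 1
    fi
    printf '%s\n' "$final"
}
```

A possible use, with the paths from this issue, would be `sudo cp "$(final_traineddata data bar)" /usr/local/share/tessdata/` followed by the same tesseract command as before.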

@royudev
I first tried copying /data/lang.traineddata to /usr/local/share/tessdata, however
it said it could not find LSTM dictionaries.
Then I tried copying /data/data/lang.traineddata, but it gave the same error as yours,
so can you please elaborate on how exactly you solved it?

"... however it said it could not find LSTM dictionaries."

If you did not give a wordlist during training then the lstm dictionary will not be there. It is not a required item and you should be able to use the traineddata for recognition without it.
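
One way to see what a traineddata file actually contains is `combine_tessdata -d FILE`, which lists its components. The sketch below is a hypothetical helper that reads such a listing on standard input and reports whether an LSTM model and the optional LSTM dictionaries are present. It assumes each entry is printed in the form `index:name:size=..., offset=...` and that the dictionary components are named lstm-punc-dawg, lstm-system-dawg and lstm-number-dawg; check the output of your own build before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: summarise a component listing read from stdin.
# Assumed input format (one entry per line), e.g. "17:lstm:size=100, offset=192".
report_lstm_components() {
    listing=$(cat)
    # The LSTM model itself is the component named exactly "lstm".
    if printf '%s\n' "$listing" | grep -q ':lstm:'; then
        echo "lstm model: present"
    else
        echo "lstm model: missing"
    fi
    # The dictionaries are optional, as noted in the comment above.
    if printf '%s\n' "$listing" | grep -Eq ':lstm-(punc|system|number)-dawg:'; then
        echo "lstm dictionaries: present"
    else
        echo "lstm dictionaries: missing (optional)"
    fi
}
```

For example: `combine_tessdata -d /usr/local/share/tessdata/bar.traineddata | report_lstm_components`. A file without the lstm component would be consistent with Tesseract falling back to the legacy engine and failing as shown in the original report.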

@Shreeshrii I haven't given a wordlist during training, but I still can't follow why it is showing me the errors. Can you please look at this error?
