When calling "api.GetUTF8Text()" I get a SIGSEGV, preceded by this log output:
Warning. Invalid resolution 0 dpi. Using 70 instead.
Error: Illegal Parameter specification!
A fatal error has been detected by the Java Runtime Environment:
"Fatal error encountered!" == nullptr:Error:Assert failed:in file globaloc.cpp, line 75
So it seems Tesseract is invoked but then has a problem with a parameter (which one?).
My config:
<dependency>
<groupId>org.bytedeco.javacpp-presets</groupId>
<artifactId>tesseract-platform</artifactId>
<version>4.0.0-beta.3-1.4.2</version>
</dependency>
bash/:~$ tesseract -v
tesseract 4.0.0-beta.1
leptonica-1.75.3
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX
Found SSE
OS: Ubuntu 18.04
Any help is highly appreciated. Thanks!
Does the sample code in the README file work?
https://github.com/bytedeco/javacpp-presets/tree/master/tesseract#sample-usage
Yeah, that's a known issue:
https://github.com/tesseract-ocr/tesseract/issues/1010#issuecomment-320521746
You'll need to set your locale to something else to make Tesseract happy.
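One way to try this from Java without touching the system configuration is to start the OCR process with a different locale in its environment. The following is only a minimal sketch: the class name `CLocaleLauncher` and the method `withCLocale` are hypothetical, and it assumes that the "C" locale is the one Tesseract is happy with, as the linked issue suggests. `LC_ALL` takes precedence over `LANG` and the other `LC_*` variables, so setting it alone should be enough for the child process.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JavaCPP or Tesseract.
final class CLocaleLauncher {

    // Prepares (but does not start) a child process whose environment
    // forces the "C" locale. Assumption: "C" is the locale Tesseract expects.
    static ProcessBuilder withCLocale(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        Map<String, String> env = pb.environment();
        env.put("LC_ALL", "C"); // overrides LANG and all other LC_* settings
        return pb.inheritIO();  // show the child's output on our console
    }
}
```

For example, `CLocaleLauncher.withCLocale(Arrays.asList("java", "-jar", "ocr-app.jar")).start()` would launch a (hypothetical) `ocr-app.jar` under the "C" locale. Note that this also changes the default locale and encoding the child JVM sees, which may or may not be acceptable.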
Thanks for directing me to the example, that was helpful.
It turned out that the example does run, and I could then narrow the problem down to this line:
api.SetPageSegMode(1);
Modes 0 and 1 lead to a core dump; 2 or higher works.
I am not sure this has anything to do with the locale (inside Tesseract), since I tried a few variants and the problem did not change. Additionally, the problem occurs both on my workstation (locale de_AT) and on the server (locale en_US).
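For reference, the integers passed to `api.SetPageSegMode(...)` correspond to Tesseract's `PageSegMode` enum, and values 0 and 1 are the two leading modes that involve orientation and script detection (OSD). Whether that is what triggers the crash is not established here; the sketch below only makes the numbers readable. The class `PageSegModes` and its methods are hypothetical helpers, not part of the JavaCPP presets.

```java
// Hypothetical lookup helper; names mirror Tesseract's PageSegMode enum,
// indexed by their numeric value.
final class PageSegModes {

    static final String[] NAMES = {
        "PSM_OSD_ONLY",               // 0
        "PSM_AUTO_OSD",               // 1
        "PSM_AUTO_ONLY",              // 2
        "PSM_AUTO",                   // 3
        "PSM_SINGLE_COLUMN",          // 4
        "PSM_SINGLE_BLOCK_VERT_TEXT", // 5
        "PSM_SINGLE_BLOCK",           // 6
        "PSM_SINGLE_LINE",            // 7
        "PSM_SINGLE_WORD",            // 8
        "PSM_CIRCLE_WORD",            // 9
        "PSM_SINGLE_CHAR",            // 10
        "PSM_SPARSE_TEXT",            // 11
        "PSM_SPARSE_TEXT_OSD",        // 12
        "PSM_RAW_LINE"                // 13
    };

    // Returns the enum name for a numeric mode, or throws for unknown values.
    static String nameOf(int mode) {
        if (mode < 0 || mode >= NAMES.length) {
            throw new IllegalArgumentException("Unknown page segmentation mode: " + mode);
        }
        return NAMES[mode];
    }

    // True for modes whose name indicates orientation and script detection.
    static boolean usesOsd(int mode) {
        return nameOf(mode).contains("OSD");
    }
}
```

With this, `PageSegModes.usesOsd(1)` is true and `PageSegModes.usesOsd(3)` is false, so a caller could, for instance, log a warning before passing an OSD mode to `api.SetPageSegMode(mode)`.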