Tesseract: Windows (x64) compiled executable (tesseract.exe) doesn't work in any other windows machine (x64)

Created on 24 Sep 2019  路  14Comments  路  Source: tesseract-ocr/tesseract

Environment

  • Tesseract Version: tesseract 5.0.0-alpha-375-g179c8
  • Platform: Windows Server 2008 R2, 64bit

Current Behavior:

I built tesseract executable from source using following this instruction

  1. git clone https://github.com/tesseract-ocr/tesseract tesseract
  2. Modify source code (this is to suit our application's purpose, certain that this is NOT the root cause of issue)
  3. cd tesseract
  4. cppan
  5. mkdir build && cd build
  6. cmake ..
  7. cmake --build . --config Release

After step 6, I see tesseract.exe under "tesseract\build\bin\Release" directory, and executing it without any issue (working perfectly fine).

However, if once I copy entire \Release directory to another Windows with x64 platform (Windows Server 2008, 2016, etc..) It doesn't work anywhere else. (see screenshots below)

I have installed "Microsoft Visual C++ 2015-2019 Redistributable (x64)", so dependency does not look like issue here.

Screenshot 1 (Installed runtime libraries)
2019-09-24 12_30_06-VirtualBoxVM

Screenshot 2 (Message box pop-up on different Windows machine)
2019-09-24 12_30_56-VirtualBoxVM

Screenshot 3 (Windows Event Viewer)
2019-09-24 12_32_38-VirtualBoxVM

Screenshot 4 (Windows Event Viewer) - FYI, disk is not on network, it's under C:
2019-09-24 12_32_23-VirtualBoxVM

Can anyone suggest me with a fix? is there anything I'm missing during compilation procedure?

Suggested Fix:

bug build_process

Most helpful comment

CMake builds are currently optimized for the build host architecture. So if you build on a PC with AVX2 hardware, the resulting binaries won't work (= will crash) on a PC without AVX2.

Either fix cmake/OptimizeForArchitecture.cmake or remove it from CMakeLists.txt.

All 14 comments

CMake builds are currently optimized for the build host architecture. So if you build on a PC with AVX2 hardware, the resulting binaries won't work (= will crash) on a PC without AVX2.

Either fix cmake/OptimizeForArchitecture.cmake or remove it from CMakeLists.txt.

Builds using autoconf will work.

@stweil Thanks for your suggestions.
Regarding your recommendation CMakeLists.txt, I'm compiling without following section in CMakeLists.txt and it seems to work on other Windows machine.

# auto optimize
# include(OptimizeForArchitecture)
#AutodetectHostArchitecture()
# OptimizeForArchitecture()
#foreach(flag ${Vc_ARCHITECTURE_FLAGS})
#     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${flag}")
# endforeach()

When you mentioned build using autoconf, can you instruct me how to do this? I couldn't find autoconf usage under HowToBuild (Windows) instruction page.

With disabling optimization OCR process will be slow.
Autotools macros are common for linux and using them on Windows is very difficult.
Other option is to use cross-compiled windows build, but IMO it is not suitable for using tesseract API (library) on windows (VS Studio).

With disabling optimization OCR process will be slow.
Autotools macros are common for linux and using them on Windows is very difficult.
Other option is to use cross-compiled windows build, but IMO it is not suitable for using tesseract API (library) on windows (VS Studio).

Do you have any guess on what performance degradation would be without optimization ? (I will need to test performance anyways but was just curious if you can make any rough guess here)

Using cross-compiled windows build would be a problem for me, because I needed to changed the source code.

With image from issue 263 I got this results (tesseract 5.0.0-alpha-447-g52cf, Intel Core i7-6600U 2,60GHz 2.8 GHz; 8 GB ram; Windows 64 bit) - 5 runs:

With autooptimization: 26.177767169865813
Without autooptimization: 50.47452997387488

Update: I posted more details in #263

Now you can turn off cmake autooptimization with:
cmake .. -DAUTO_OPTIMIZE=OFF

@zdenop, did you specify CMAKE_BUILD_TYPE for your builds?

If not: please try latest Git master. It should be significantly faster.

@stweil: I used cmake&ninja&clang and this combinations needs to specify CMAKE_BUILD_TYPE. So yes, I specify it (to Release).

@zdenop Hi, thanks for testing the performance with optimization on/off.

Update: I posted more details in #263

In your post, you mentioned

  • tessdata_best
  • tessdata_fast
  • tessdata

Does this mean if I use eng.traineddata from each git, OCR engine will perform differently based on trained data file? In other words, if I swap exsiting eng.traineddata with the one from "tessdata_fast", would it increase performance?

After reading some doc, it looks like eng.traineddata has effect when using LSTM engine. I don't explicitly specify engine mode when executing. is LSTM used by default?

if I swap existing eng.traineddata with the one from "tessdata_fast", would it increase performance?

I am not sure how do you measure performance, but tessdata_fast model is faster version of tessdata_best. It has some limitations (e.g. you can do training only from best model, and best model provide better OCR result - at least in some cases).

is LSTM used by default?

>tesseract --help-oem
OCR Engine modes:
  0    Legacy engine only.
  1    Neural nets LSTM engine only.
  2    Legacy + LSTM engines.
  3    Default, based on what is available.

what is available => tessdata_best and tessdata_fast have only LSTM models (you will get error if you will use --oem 0: Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract), tessdata has also legacy model.

@zdenop Thanks, that makes sense.

Quick question regarding compilation (will create a new ticket if needed)

I'm using cmake to compile (shown below), which generate 64-bit executable. Is there any way to build 32-bit env executable?

:: Compile Tesseract
cd build\tesseract_src
cppan

cd ..
cmake tesseract_src
cmake --build . --config Release

Can I achieve all the same behavior above by using msbuild ? (Visual Studio compilation)

  1. IMO original topic is solved, so I will close this issue.
  2. Issue tracker is for bug and not for support. Please use forum or better - read cppan doc. It is external project.

Got it. Thanks for the help.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

clarkk picture clarkk  路  3Comments

YeisonVelez11 picture YeisonVelez11  路  5Comments

Shreeshrii picture Shreeshrii  路  4Comments

johnthagen picture johnthagen  路  6Comments

clarkk picture clarkk  路  6Comments