I ran into a strange problem. I built Tesseract for Windows with mingw32-cmake and for Linux with cmake, both from the latest code cloned from GitHub. The Windows build is about three times slower than the Linux build, and I don't know why.
The CMake build for Cygwin and MinGW does not enable OpenMP.
@egorpugin
Try building with autotools.
I tried to link Tesseract with OpenMP using the lines below, but it is still just as slow. How do I enable OpenMP correctly?
FIND_PACKAGE(OpenMP REQUIRED)
SET (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS} -MP")
SET (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS} -MP")
@egorpugin
?
MSVC builds support openmp. I do not test cygwin/mingw.
MSVC builds support openmp.
I know :-) , but the OP wants to build using CMake for MinGW with OpenMP...
Any chance to use autotools there?
Why can't OpenMP be enabled with CMake? Below are the DLL dependencies of the Tesseract DLL that I built using CMake with OpenMP enabled. @egorpugin I failed to compile Tesseract with autotools; I am trying to fix that.
[yu@localhost root]$ mingw-objdump -p bin/libtesseract40.dll | grep "DLL Name:"
DLL Name: WS2_32.dll
DLL Name: libgcc_s_sjlj-1.dll
DLL Name: libgomp-1.dll
DLL Name: KERNEL32.dll
DLL Name: libleptonica-1.76.1.dll
DLL Name: msvcrt.dll
DLL Name: libstdc++-6.dll
DLL Name: USER32.dll
I don't know how to turn on openmp for gcc (mingw).
Try this:
find_package(OpenMP REQUIRED)
if(OpenMP_CXX_FOUND)
  # Add the compiler's OpenMP flag (-fopenmp for GCC/MinGW) to the C++ flags.
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
endif()
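An alternative to editing the global flags, as a minimal sketch: with CMake 3.9 or newer, FindOpenMP provides the imported target `OpenMP::OpenMP_CXX`, which carries both the compile flag and the runtime library. The target name `libtesseract` here is an assumption about the project's CMakeLists.txt; substitute whatever the library target is actually called.

```cmake
# Sketch only: requires CMake >= 3.9 for the OpenMP::OpenMP_CXX imported target.
find_package(OpenMP)
if(OpenMP_CXX_FOUND)
  # "libtesseract" is an assumed target name; replace it with the real one.
  # Use the same target_link_libraries signature (keyword or plain) that the
  # project already uses for this target, since CMake rejects mixing them.
  target_link_libraries(libtesseract PRIVATE OpenMP::OpenMP_CXX)
else()
  message(WARNING "OpenMP not found; building without OpenMP support")
endif()
```

Linking against the OpenMP runtime does not by itself prove that the parallel code paths are compiled in, so the resulting speed still has to be measured.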
Have you installed mingw32-pthreads-w32?
Yes, but it still has no effect.
@egorpugin Does the Tesseract you built run at roughly the same speed on Linux and Windows?
Did not test.
@napasa, could you please provide more details on the performance test which you have run? Which image and command line did you use? Which traineddata did you use? Did you use multithreading? What were the results? Did you also try https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v4.0.0-beta.1.20180608.exe?
The test results for the same file under Windows 7 with differently compiled versions of Tesseract are as follows. The traineddata I used is chi_sim from tessdata_fast. The commit:b1f7990d9b versions were compiled by me on Fedora using MinGW with the command "mingw{bit}-cmake .. && mingw{bit}-make".
mannheim 64bit package:9s
tesseract v4.0.0-beta.1.20180608
leptonica-1.76.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
commit:b1f7990d9b 64bit:47s
tesseract 4.0.0-beta.1
leptonica-1.74.4
libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.1) : libpng 1.6.29 : libtiff 4.0.8 : zlib 1.2.8 : libwebp 1.0.0
Found AVX2
Found AVX
Found SSE
commit:b1f7990d9b 32bit:57s
tesseract 4.0.0-beta.1
leptonica-1.76.1
libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
@stweil
mannheim 64bit package:9s
Autotools was used to build this package. Autotools activates OpenMP correctly for MinGW.
Tesseract with OpenMP uses up to 4 threads, so a factor in the range 3 to 4 is to be expected between an executable without OpenMP (slow) and with OpenMP (3.x times faster).
9 seconds for a page of text looks like a good value and shows that Tesseract on Windows is normally fast.
@napasa: did you solve this issue? Is there something we need to or can add to the CMake configuration?
Hi,
I compiled tesseract-4.0.0 under Windows following wiki "Compiling" section, using cppan and "vcpkg install tesseract:x64-windows-static".
But my app, which calls Tesseract, is already multi-threaded, and I would like to disable OpenMP. I see a lot of context switches in perfmon. My app is ~35% slower with tesseract-4.0.0 than with tesseract 3.02, which causes far fewer context switches.
So I would like to compile and test Tesseract without OpenMP. I have already tried setting the environment variable OMP_THREAD_LIMIT=1, but without success: each tesseract process still uses up to 3 CPU cores.
Does anyone know how to set --disable-openmp with vcpkg? I think it is running .\configure automatically.
Thanks in advance
OK, I had success. I do not know exactly which change disabled OpenMP, but I commented out "/openmp" in the CMakeLists.txt and cppan.yml files.
Now tesseract-4.0.0 uses only 1 CPU core and my multithreaded app runs 33% faster, with fewer context switches, a bit faster than with tesseract 3.02. And with much better OCR results!
I do not know why setting OMP_THREAD_LIMIT=1 did not work.
I hope this helps.
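Instead of commenting out the flag by hand, the same effect could be made switchable. The following is only a sketch: `USE_OPENMP` is a hypothetical option name chosen for illustration, not an existing Tesseract build option, and the block assumes CMake 3.9 or newer.

```cmake
# Sketch only: USE_OPENMP is a hypothetical option, not part of Tesseract's build.
option(USE_OPENMP "Build with OpenMP multithreading" ON)

if(USE_OPENMP)
  find_package(OpenMP)
  if(OpenMP_CXX_FOUND)
    # Appends the compiler-specific OpenMP flag reported by FindOpenMP.
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
  endif()
endif()
```

With such an option in place, an application that is already multi-threaded could configure with `cmake -DUSE_OPENMP=OFF ..` to get a single-threaded library.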