I have installed Tesseract using conda-forge:
tesseract 4.1.0
leptonica-1.78.0
libgif 5.1.4 : libjpeg 9d : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.1.0 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found SSE
I am using pytesseract (the Python wrapper) to build a solution on top of the conda-installed Tesseract. I am working on Linux 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1 (2019-04-12) x86_64 GNU/Linux with 7 cores.
Every time I run Tesseract, it uses all the cores. From the documentation, I understand that the parameter tessedit_parallelize (for example, tessedit_parallelize 0) controls whether work is spread across cores, and that the environment variable OMP_THREAD_LIMIT, set through os.environ['OMP_THREAD_LIMIT'], limits the number of threads.
Currently, Tesseract takes 3 s to produce a result for one image. I want to improve this, but I have not been able to change the value of tessedit_parallelize. I looked for this parameter in /usr/local/share/tessdata/configs but couldn't find it anywhere.
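One way to check whether this build knows the parameter at all, sketched under assumptions: the tesseract CLI has a --print-parameters option that lists its parameters. The helper names below (find_parameter, list_parameters) are hypothetical, and the parsing assumes each line of the listing starts with the parameter name followed by its value.

```python
import subprocess


def find_parameter(listing, name):
    """Return the value listed for `name`, or None if it is absent.

    Assumes each line looks like: <name> <value> <description...>
    """
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == name:
            return fields[1]
    return None


def list_parameters():
    """Capture the parameter listing printed by the tesseract CLI.

    Requires the tesseract binary on PATH; raises if the call fails.
    """
    result = subprocess.run(
        ["tesseract", "--print-parameters"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling find_parameter(list_parameters(), "tessedit_parallelize") should then show the current value if the parameter is available in this build.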
Can anyone please explain how to modify this parameter for Tesseract 4.1.0? Also, is there a way to pass this value from pytesseract, or to set it as a default?
Also, how do I set os.environ['OMP_THREAD_LIMIT'] to improve my latency?
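To make the question concrete, here is a minimal sketch of the kind of call I have in mind, assuming the CLI's -c VAR=VALUE option is the right way to override tessedit_parallelize and that OMP_THREAD_LIMIT is read from the environment of the tesseract process. The function names (thread_limited_env, tesseract_command, run_ocr) are hypothetical, and whether either setting actually reduces the 3 s latency is untested here.

```python
import os
import subprocess


def thread_limited_env(limit=1):
    """Return a copy of the current environment with OMP_THREAD_LIMIT set."""
    env = dict(os.environ)
    env["OMP_THREAD_LIMIT"] = str(limit)  # environment values must be strings
    return env


def tesseract_command(image_path, parallelize=0):
    """Build a CLI call that overrides tessedit_parallelize via -c."""
    return [
        "tesseract",
        image_path,
        "stdout",  # write the recognized text to standard output
        "-c",
        f"tessedit_parallelize={parallelize}",
    ]


def run_ocr(image_path, limit=1, parallelize=0):
    """Run tesseract on one image with the thread limit applied.

    Requires the tesseract binary on PATH; raises if the call fails.
    """
    result = subprocess.run(
        tesseract_command(image_path, parallelize),
        env=thread_limited_env(limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With pytesseract, I assume the equivalent would be setting os.environ['OMP_THREAD_LIMIT'] = '1' before the call and passing config='-c tessedit_parallelize=0' to pytesseract.image_to_string, since pytesseract launches tesseract as a subprocess.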