Tesseract: Unable to find tessedit_parallelize and OMP_THREAD_LIMIT in /usr/local/share/tessdata/configs

Created on 28 May 2020  ·  1 Comment  ·  Source: tesseract-ocr/tesseract

  • I have installed Tesseract using conda-forge:
    tesseract 4.1.0
    leptonica-1.78.0
    libgif 5.1.4 : libjpeg 9d : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.1.0 : libopenjp2 2.3.1
    Found AVX2
    Found AVX
    Found SSE
  • I am using pytesseract (the Python wrapper) to build a solution on top of the conda Tesseract install. I am working on Linux 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1 (2019-04-12) x86_64 GNU/Linux with 7 cores.
    Every time I run Tesseract, it uses all the cores. From the documentation, I understand that the parameter tessedit_parallelize (for example, tessedit_parallelize 0) controls the use of cores, and that os.environ['OMP_THREAD_LIMIT'] helps control the thread count.
    Currently, Tesseract takes 3 s to produce a result for one image. I want to improve this, but I am unable to change the value of tessedit_parallelize. I looked for this parameter in /usr/local/share/tessdata/configs but could not find it anywhere.
    Can anyone please explain how to modify this parameter for Tesseract 4.1.0? Also, is there a way to pass this value from pytesseract or to set it to a default value?
    Finally, how do I change the os.environ['OMP_THREAD_LIMIT'] value to improve my latency?
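For illustration, here is a minimal sketch of how the two settings mentioned above might be passed from Python. It assumes that pytesseract forwards its config string to the tesseract command line (where -c VAR=VALUE sets a parameter) and that the tesseract subprocess inherits os.environ, so OMP_THREAD_LIMIT has to be set before the call. The helper names (build_tesseract_config, limit_openmp_threads, ocr_image) are hypothetical, and whether these settings actually reduce latency for a given workload is not confirmed here.

```python
import os


def build_tesseract_config(parallelize: int = 0) -> str:
    """Build a config string that sets tessedit_parallelize via the -c flag.

    Hypothetical helper: the string is meant to be passed as the `config`
    argument of pytesseract.image_to_string.
    """
    return f"-c tessedit_parallelize={parallelize}"


def limit_openmp_threads(n_threads: int = 1) -> None:
    """Set OMP_THREAD_LIMIT for this process and any child processes.

    Must run before tesseract is launched, because the variable is read
    when the OpenMP runtime starts.
    """
    os.environ["OMP_THREAD_LIMIT"] = str(n_threads)


def ocr_image(path: str, n_threads: int = 1, parallelize: int = 0) -> str:
    """Run OCR on one image with a capped thread count (hypothetical wrapper)."""
    limit_openmp_threads(n_threads)

    # Imported lazily so the helpers above work without pytesseract installed.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(
        Image.open(path),
        config=build_tesseract_config(parallelize),
    )
```

A call such as ocr_image("page.png", n_threads=1) would then run Tesseract with a single OpenMP thread; "page.png" is a placeholder file name. Parameters like tessedit_parallelize need not appear in the files under tessdata/configs, since those files only set selected values; running tesseract --print-parameters should list the parameters that are available.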
question

>All comments

The right place to ask questions about Tesseract usage is our forum.
