Tesseract: TensorFlow and Tesseract

Created on 24 Apr 2017 · 13 Comments · Source: tesseract-ocr/tesseract

Tesseract 4.00 can be used with TensorFlow in some way.

@theraysmith,
This feature needs documentation. How is it used?

feature request

All 13 comments

I would like to know this as well.

Does anyone have a clue about this? TensorFlow is well optimized for CPUs, GPUs, and TPUs, especially since the release of the XLA compiler, so it would be nice to use Tesseract with it.

As far as I know, Tesseract currently cannot be linked with TensorFlow.

I'm using TF and Tesseract in the same project but separately, and would like to know the answer too.

You can use Tesseract and TensorFlow together in Python.
Use the pytesseract module.
https://github.com/madmaze/pytesseract

@SragAR How can I use it with TensorFlow? It's just a wrapper for Tesseract; I don't see anything implemented to work with TensorFlow.
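To make the distinction concrete, here is a minimal sketch of what "together" can mean with pytesseract: the two libraries run side by side in one script, with no integration between them. It assumes that some other component (for example a TensorFlow text detector, not shown) has already produced bounding boxes. The function name `ocr_crops` and its `recognize` parameter are hypothetical and exist only for this illustration; `pytesseract.image_to_string` and Pillow's `Image.crop` are the only library calls used.

```python
def ocr_crops(image, boxes, recognize=None):
    """Run OCR on each cropped region of a Pillow image.

    image     -- a PIL.Image.Image (or any object with a compatible crop())
    boxes     -- iterable of (left, upper, right, lower) pixel boxes,
                 assumed to come from a separate detector such as a
                 TensorFlow model
    recognize -- hypothetical hook: a callable mapping an image to text;
                 defaults to pytesseract.image_to_string
    """
    if recognize is None:
        # Imported lazily: pytesseract needs the tesseract binary installed.
        import pytesseract
        recognize = pytesseract.image_to_string
    # Tesseract only ever sees plain image crops; TensorFlow is not involved.
    return [recognize(image.crop(box)).strip() for box in boxes]
```

Nothing in this sketch makes Tesseract's own network run on TensorFlow, which is the point raised in the reply above.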

- Is the model in Tesseract exactly the same as this one, or a modified version of it?
https://github.com/tensorflow/models/tree/master/research/street

- I am looking for a way to do the training in TensorFlow, and maybe try a few other models such as CRNN.
If I use another model, is the easiest way to do this to use the TensorFlow C++ API directly?

Can anyone answer?

It's probably the same infrastructure code, though with a different network setup.
https://github.com/tensorflow/models/blob/master/research/street/python/vgsl_train.py
For Tesseract, see here:
https://github.com/tesseract-ocr/tesseract/wiki/VGSLSpecs

Somewhere in the issues or on the mailing list there is also very detailed information about the actual parameters used for each language during training.
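For readers unfamiliar with VGSL, a network is described by a single bracketed string: an input shape (batch, height, width, depth) followed by space-separated layer tokens. The sketch below only splits such a string into its parts. The helper `split_vgsl` is hypothetical and is not Tesseract's or TensorFlow's parser, the example spec is an assumption written in the style of the Tesseract training documentation, and nested parallel or series groups are not handled.

```python
def split_vgsl(spec):
    """Split a flat VGSL spec string into its input shape and layer tokens.

    Illustrative only: handles a single [...] series with no nested
    (...) or [...] groups.
    """
    body = spec.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a spec wrapped in [ ]")
    tokens = body[1:-1].split()
    if not tokens:
        raise ValueError("empty spec")
    # The first token is the input layer: batch,height,width,depth.
    batch, height, width, depth = (int(v) for v in tokens[0].split(","))
    return {
        "input": {"batch": batch, "height": height,
                  "width": width, "depth": depth},
        "layers": tokens[1:],
    }


if __name__ == "__main__":
    # Example spec (assumed): convolution, max-pool, several LSTMs, CTC output.
    parsed = split_vgsl(
        "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]"
    )
    print(parsed["input"])
    for layer in parsed["layers"]:
        print(layer)
```

Comparing the layer tokens used by the street model with those used for Tesseract's language models is one way to check the "different network setup" guess above.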

The Tesseract code has some conditional parts which depend on the macro INCLUDE_TENSORFLOW, so it is prepared to be compiled with TensorFlow. That requires a separate installation of Google's Protocol Buffers and TensorFlow, and much patience to hack the build environment until it works. After spending several hours on this, I have still not succeeded.

See also Ray's comment in the source code:

// TODO(rays) How to make this usable both in Google and open source?

As far as I know, it has a custom op to handle a variable-size height dimension. If you want to use it, you need to compile TensorFlow from source: last time I checked, you couldn't load a custom op with a precompiled TensorFlow (however, I am not sure this is still the case).
Compiling TF is not hard on Ubuntu, but on Windows it was giving some headaches, though it might have fewer issues with the current Bazel version. I haven't compiled the op, though.

Now Tesseract can be built with TensorFlow (see pull request #2461).
