Tesseract: Tesseract creates PDF documents with "GlyphLessFont" font using 4.0

Created on 12 Apr 2019  ·  12 Comments  ·  Source: tesseract-ocr/tesseract

Hi,

I am working with Tesseract, converting TIFF files to searchable PDFs. Some files are converting right, but some files have the "GlyphLessFont" font. Can I resolve this issue?

I am using Tesseract Open Source OCR Engine v4.0.0.20181030 with Leptonica Version of Tesseract.

input.zip
output.zip

Command I am using:
tesseract input.tiff output -l eng pdf

Please help if possible.
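
For batch conversions, the command from the post above could be wrapped in a small script. This is only a minimal sketch: the function names `build_tesseract_command` and `convert_to_searchable_pdf` are hypothetical helpers, and it assumes the `tesseract` executable is on the PATH and that the `eng` language data is installed. It passes exactly the arguments shown in the post and nothing else.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang> pdf"""
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "pdf"]


def convert_to_searchable_pdf(image_path, output_base, lang="eng"):
    """Run Tesseract and return the path of the PDF it is expected to write."""
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".pdf" to the output base name itself.
    return Path(f"{output_base}.pdf")
```

Calling `convert_to_searchable_pdf("input.tiff", "output")` would run the same command as in the post and return `output.pdf` as a path.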

All 12 comments

All PDF files created with Tesseract use the GlyphLessFont font.
So what do you mean by "Some files are converting right"?

My PDF reader is not able to read the GlyphLessFont font. Can I change this font to "TimesNewRomans"?

What do you mean by "not able to read GlyphlessFont Font"? What reader?

And no, you cannot replace the font: it is hardcoded in Tesseract. A searchable PDF means: show the original image with invisible text.

So I am closing this issue as working as designed.

But in some of the files I am getting "TimesNewRomans". In some places, I am getting the "GlyphLessFont" font.

In the attached file, "817) 877-8855" is converted with "TimesNewRomans" and the rest is converted with "GlyphLessFont".

input.zip
out.zip

I do not know what you mean by:

817) 877-8855 is converting with "TimesNewRomans"

E.g. Acrobat Reader states that ONLY the GlyphLessFont font is in this document, so TimesNewRomans cannot be used there:

image

Hi, I am using aspose.pdf to parse this file. I am getting "817) 877-8855" with "TimesNewRomans", and the remaining words are "GlyphLessFont".

Can you provide the "GlyphLessFont" file for Windows 10, so I can install it on my system?

So it is an issue with aspose.pdf and not Tesseract.
All fonts used in a PDF must be embedded in the PDF. There is no reason to install the font in order to read the PDF.
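
One rough way to cross-check which font names a PDF declares, independent of Acrobat Reader or aspose.pdf, is to scan the raw file for `/BaseFont` entries. The sketch below is an approximation, not a real PDF parser: `list_base_fonts` is a hypothetical helper, and it assumes the font dictionaries are stored uncompressed in the file. Fonts declared inside compressed object streams would be missed, so a proper PDF library is more reliable where one is available.

```python
import re

# Matches entries such as "/BaseFont /GlyphLessFont" in raw PDF bytes.
# The name ends at whitespace or at a PDF delimiter character.
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")


def list_base_fonts(pdf_bytes):
    """Return the sorted, de-duplicated /BaseFont names found in the raw bytes."""
    names = {m.group(1).decode("latin-1") for m in BASEFONT_RE.finditer(pdf_bytes)}
    return sorted(names)
```

For example, `list_base_fonts(open("output.pdf", "rb").read())` returns the declared base font names. If the diagnosis in this thread is right, a Tesseract-produced file should list only GlyphLessFont.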

Can you provide the "GlyphLessFont" font file for Windows 10?

GlyphLessFont is part of the Tesseract installation.

That is OK, but I need the font ("GlyphLessFont") file itself; that font file would be helpful for us.

https://github.com/tesseract-ocr/tesseract/raw/master/tessdata/pdf.ttf
https://github.com/tesseract-ocr/tesseract/blob/master/tessdata/pdf.ttf

Hi,

I am getting an error when I am trying to install this font.

Screenshot (32)
