Tesseract: The text is not recognized from a png

Created on 19 Jun 2015 · 21 Comments · Source: charlesw/tesseract

I have this image, but Tesseract doesn't recognize the text in it:

[Image: banner2]

The output after running Tesseract is:

Ammmz e um

Bzndmary Pbfﬁularamr
ugsmmm gmmm
Rzﬁaume P3yMuiR:6aua
Stams Pay hefare 20‘ arnsrzz

question

FlorinMax

All 21 comments

Not sure if you're already doing some preprocessing; however, this might help. Issue #115 describes some techniques which might be helpful. You can also enable the tessedit_write_images option (fixed by issue #160) to see exactly what image is being fed into Tesseract (Tesseract does some preprocessing itself). Finally, specific to your example, I'd do at least the following as a starter:

Resize image to be 300dpi
Set the region of interest to only the fields component\panel (assumes layout is fixed)
Consider updating the word and pattern dictionaries to support the possible form field values (see https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality#Dictionaries,_word_lists,_and_patterns).

charlesw on 21 Jun 2015

I tried to follow your steps:
I resized the image, cropped it (to a small part of it), applied a grayscale conversion and set the variables. I cannot set 'tessedit_write_images' to true: my method failed to retrieve the value for tessedit_write_images. So I am posting the code; maybe something is wrong in it.

The cropped image:
[Image: spscale]
After that, this is the result, which is still not good enough:
Amount
Beneﬁcwary
Dzscnmmn
Reﬁerenoz
scams

This is my code:
    public void CropImage()
    {
        Bitmap image = new Bitmap(localPath);
        // var rect = new Rectangle(130,10,125,70);
        var rect = new Rectangle(60,10,70,70);
        Bitmap imageCrop = image.Clone(rect, image.PixelFormat);
        imageCrop.SetResolution(300, 300);
        Graphics g = Graphics.FromImage(imageCrop);
        g.DrawImage(imageCrop, 0, 0);
        g.Dispose();
        imageCrop.Save(localPathCrop);
    }

    public void GrayScaleImage()
    {
        Bitmap c = new Bitmap(localPathCrop);
        Bitmap d;
        int rgb;

        for (int y = 0; y < c.Height; y++)
            for (int x = 0; x < c.Width; x++)
            {
                Color pixelColor = c.GetPixel(x, y);
                rgb = (int)((pixelColor.R + pixelColor.G + pixelColor.B)/3);
                c.SetPixel(x, y, Color.FromArgb(rgb, rgb, rgb));
            }
        d = c;
        d.Save(localPathGrayScale);
    }

    public void readOCR()
    {
        var pathToLangFolder = @"D:\Automation Tests\OCRTest\Tesseract-OCR";

        using (var engine = new TesseractEngine(pathToLangFolder, "eng", EngineMode.Default))
        {
            engine.SetVariable("load_system_dawg", false);
            engine.SetVariable("load_freq_dawg", false);
            engine.SetVariable("tessedit_write_imag", true);
            bool result ;

            if (engine.TryGetBoolVariable("tessedit_write_imag", out result))
            {
                Assert.AreEqual(false, result, "The values are not equal");
            }
            else
            {
                Assert.Fail("Failed to retrieve value for '{0}'.", "tessedit_write_imag");
            }

            using (Bitmap image = new Bitmap(localPathGrayScale))
            {
                using (var pix = PixConverter.ToPix(image))
                {
                    using (var page = engine.Process(pix))
                    {
                        Console.WriteLine(page.GetMeanConfidence() + " : " + page.GetText());
                    }
                }
            }
        }
    }

FlorinMax on 24 Jun 2015

With regard to converting the image to grayscale, the actual formula is 0.2126 * R + 0.7152 * G + 0.0722 * B (https://en.wikipedia.org/wiki/Grayscale); however, Pix exposes this functionality through Pix.ConvertRGBToGray (set all parameters to 0 to use the default values defined by Leptonica). I wouldn't bother with this, though, unless you're doing further processing that requires a grayscale image (like running it through a custom thresholding\binarization algorithm). Note that Tesseract already does this, and it's generally considered good enough for most cases.
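
To make the difference between the two formulas concrete, here is a minimal sketch comparing the weighted formula quoted above with the plain average used in the GrayScaleImage method posted earlier. The class and method names are hypothetical and exist only for illustration; they are not part of the Tesseract wrapper or of Leptonica.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class GrayscaleSketch
{
    // Weighted formula quoted in the comment above:
    // 0.2126 * R + 0.7152 * G + 0.0722 * B, rounded to the nearest integer.
    public static int WeightedGray(int r, int g, int b)
    {
        return (int)Math.Round(0.2126 * r + 0.7152 * g + 0.0722 * b);
    }

    // Plain average of the three channels, as in the posted GrayScaleImage method.
    public static int AverageGray(int r, int g, int b)
    {
        return (r + g + b) / 3;
    }
}
```

For pure green (0, 255, 0), the weighted formula gives 182 while the plain average gives 85, so the two conversions can produce visibly different images for coloured text or backgrounds.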

I'm also pretty sure you're not correctly resizing the image to 300dpi in your crop function. If you check the output, I think you'll find that it's actually the same size. What you'll need to do is use the source resolution and then work out a scaling factor from that. So assuming the source is 70 dpi (typical screen resolution), something like the following should work:

public static Bitmap ResizeImage(Bitmap src, Single targetResolution)
{
        if(targetResolution <= 0.0f) throw new ArgumentOutOfRangeException ("targetResolution", "The target resolution must be greater than zero.");

        if(src.HorizontalResolution <= 0.0f) throw new ArgumentOutOfRangeException ("src", "The src image doesn't specify a horizontal resolution.");

        if(src.VerticalResolution <= 0.0f) throw new ArgumentOutOfRangeException("src", "The src image doesn't specify a vertical resolution.");

        Single horizontalScale = targetResolution / src.HorizontalResolution;
        Single verticalScale = targetResolution / src.VerticalResolution;

        Bitmap result = new Bitmap(src.Width * horizontalScale , src.Height * verticalScale);
        b.SetResolution(targetResolution, targetResolution )
        using (Graphics g = Graphics.FromImage((Image)b))
        {
            g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
            g.DrawImage(src, 0, 0, result .Width  , result.Height);
        }
        return b;
}

Finally, the code below will always fail, as you've just set tessedit_write_images to true (also note it's tessedit_write_images, not tessedit_write_imag):

engine.SetVariable("tessedit_write_images", true);
if (engine.TryGetBoolVariable("tessedit_write_images", out result))
{
    Assert.AreEqual(false, result, "The values are not equal");
}

What you probably want is something like this:

engine.SetVariable("tessedit_write_images", true);
if (engine.TryGetBoolVariable("tessedit_write_images", out result))
{
    Assert.AreEqual(true, result, "The variable 'tessedit_write_images' should be enabled.");
}

charlesw on 24 Jun 2015

Note I've created an issue, #183, to support resizing\scaling Pix instances, as it's probably a common operation. No promises that it'll be implemented anytime soon.

charlesw on 24 Jun 2015

Regarding the ResizeImage method: I think that instead of b.SetResolution(targetResolution, targetResolution )
we should have src.SetResolution(targetResolution, targetResolution )? I also encountered an error on Bitmap result = new Bitmap(src.Width * horizontalScale , src.Height * verticalScale);
Argument 1 and 2 cannot convert from float to string

FlorinMax on 24 Jun 2015

Oops, sorry, it should have been result.SetResolution(targetResolution, targetResolution) and Bitmap result = new Bitmap((int)(src.Width * horizontalScale), (int)(src.Height * verticalScale))
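
The size calculation behind that correction can be sketched in isolation. The helper below is hypothetical (its name and signature are not from the library); it only shows the arithmetic and why the explicit (int) casts are needed, since the Bitmap(int, int) constructor does not accept float arguments.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ScaleMath
{
    // Computes the pixel dimensions an image needs so that content captured at
    // the source resolution keeps its physical size at targetDpi.
    public static void ScaledSize(int width, int height,
                                  float srcDpiX, float srcDpiY, float targetDpi,
                                  out int newWidth, out int newHeight)
    {
        if (targetDpi <= 0f) throw new ArgumentOutOfRangeException("targetDpi");
        if (srcDpiX <= 0f) throw new ArgumentOutOfRangeException("srcDpiX");
        if (srcDpiY <= 0f) throw new ArgumentOutOfRangeException("srcDpiY");

        // The products are floats; truncate them explicitly to whole pixels.
        newWidth = (int)(width * (targetDpi / srcDpiX));
        newHeight = (int)(height * (targetDpi / srcDpiY));
    }
}
```

As an example, a 100 x 50 pixel crop reported at 96 dpi and rescaled to 300 dpi comes out at 312 x 156 pixels.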

charlesw on 24 Jun 2015

After I made your changes, this is the result, but it is still not what I expected:
Amount
Beneﬁcxary
Descnvhun
Rekreno:
Status

FlorinMax on 24 Jun 2015

And this is the CropImage method after your changes:
    public void CropImage()
    {
        Bitmap image = new Bitmap(localPath);
        var rect = new Rectangle(60,10,70,70);
        Bitmap imageCrop = image.Clone(rect, image.PixelFormat);
        ResizeImage(imageCrop, 300);
        imageCrop.Save(localPathCrop);
    }
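
One thing worth checking in the snippet above: ResizeImage, as posted earlier in the thread, returns a new Bitmap rather than modifying its argument, so calling it without using the return value saves the unscaled crop. The following is a minimal sketch, not a definitive implementation, that applies the two corrections given earlier and saves the resized bitmap. It assumes System.Drawing is available (for example the .NET Framework on Windows); the class name and the extra parameters on CropImage are hypothetical.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

// Hypothetical wrapper class, for illustration only.
public static class CropAndScale
{
    // ResizeImage with the corrections from the thread applied:
    // int casts on the new size, and "result" used consistently instead of "b".
    public static Bitmap ResizeImage(Bitmap src, float targetResolution)
    {
        if (targetResolution <= 0.0f)
            throw new ArgumentOutOfRangeException("targetResolution");
        if (src.HorizontalResolution <= 0.0f || src.VerticalResolution <= 0.0f)
            throw new ArgumentOutOfRangeException("src");

        float horizontalScale = targetResolution / src.HorizontalResolution;
        float verticalScale = targetResolution / src.VerticalResolution;

        Bitmap result = new Bitmap((int)(src.Width * horizontalScale),
                                   (int)(src.Height * verticalScale));
        result.SetResolution(targetResolution, targetResolution);
        using (Graphics g = Graphics.FromImage(result))
        {
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.DrawImage(src, 0, 0, result.Width, result.Height);
        }
        return result;
    }

    // Crops rect out of the source image, rescales the crop and saves the
    // rescaled bitmap (not the original crop) to destPath.
    public static void CropImage(string sourcePath, string destPath,
                                 Rectangle rect, float targetResolution)
    {
        using (Bitmap image = new Bitmap(sourcePath))
        using (Bitmap crop = image.Clone(rect, image.PixelFormat))
        using (Bitmap resized = ResizeImage(crop, targetResolution))
        {
            resized.Save(destPath);
        }
    }
}
```

The using blocks also dispose the intermediate bitmaps, which the posted methods do not do.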

FlorinMax on 24 Jun 2015

Umm, sorry, I'm out of ideas; it might just not be a high enough quality image.
Maybe try Stack Overflow if you have not already?

charlesw on 25 Jun 2015

Until the issue is fixed, do you have any idea how to work around this?

FlorinMax on 25 Jun 2015

You know what's funny? Using the same Tesseract library in Java, it works fine. I don't have to crop the image, just scale it.

FlorinMax on 25 Jun 2015

What happens if you use the tesseract command line tool?

charlesw on 26 Jun 2015

I have some updates. After looking into the issue, I found what the problem was. Our method for resizing the image is not doing what we expect. Basically, the method doesn't resize the image; it draws it at the same resolution.
So, instead of:
//Bitmap result = new Bitmap((int)(src.Width * horizontalScale), (int)(src.Height * verticalScale));
//result.SetResolution(targetResolution, targetResolution);

I added:
int width = (int)(src.Width * horizontalScale);
int height = (int)(src.Height * verticalScale);
Bitmap result = new Bitmap(src, width, height);

After this, our image gets a higher resolution:
Dimensions 1334 x 375
Width 1334 pixels
Height 375 pixels
Bit depth 32
and I get all the text from the image.

As I said in the previous comments, in Java, using AffineTransform, I get an image with a better resolution:
Dimensions 640 x 180
Width 640 pixels
Height 180 pixels
Bit depth 24

When I try to obtain the same in VS, the text is not recognized completely, so I have to use the maximum targetResolution.

In conclusion, Tesseract was not the problem; our resize method was, and I think it is still not fully optimized.

FlorinMax on 26 Jun 2015

Okay, I'll see if I can find some time this weekend to expose the resize
functionality offered by Leptonica. That should solve these kinds of issues.

charlesw on 27 Jun 2015

I've added a new Scale method to Pix which should work better for the use case. Can you get the latest source code and try it out? You can build a NuGet package by double clicking the build.bat file.

charlesw on 27 Jun 2015

Where can I find the build.bat file? Do I have to uninstall the Tesseract OCR package from NuGet Packages and reinstall it?

FlorinMax on 30 Jun 2015

Anyway, the behavior of the library is very strange. Sometimes it recognizes all the characters and numbers, sometimes not. It has some difficulty recognizing numbers; for these, I have to play with (increase or decrease) targetResolution to get the text from the image. I saw that the date is the most difficult part of the image to recognize.

FlorinMax on 30 Jun 2015

No, just check out the source (develop branch) and run ~\build.bat. It will
generate a NuGet package that you can then use by adding it to a local NuGet
repo.

charlesw on 30 Jun 2015

Sorry, but I am not so familiar with this. Maybe you can give me more details...

FlorinMax on 30 Jun 2015

In Tesseract-master, I found a build.bat file. Is this the one?

FlorinMax on 30 Jun 2015

Yes, however you'll need to change the branch to develop. Master only
contains released code.

charlesw on 30 Jun 2015
