Tesseract: Issue 13593: tesseract-ocr/fuzzer-api: Undefined-shift in scaleGray2xLILineLow

Created on 8 Mar 2019 · 18 Comments · Source: tesseract-ocr/tesseract

OSS Fuzz reports an issue here:

scale1.c:2717:28: runtime error: left shift of 255 by 24 places cannot be represented in type 'l_int32' (aka 'int')

See https://github.com/DanBloomberg/leptonica/blob/master/src/scale1.c#L2717
https://github.com/DanBloomberg/leptonica/blob/1.77.0/src/scale1.c#L2717.
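
For context on the sanitizer message: 255 << 24 equals 4278190080, which is larger than the maximum 32-bit signed int (2147483647), and in C a left shift of a signed value whose result is not representable is undefined behavior. The following is only a minimal sketch of that arithmetic; the helper name shift_fits_in_int32 is made up for illustration and is not part of Leptonica or Tesseract.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not Leptonica code): returns 1 if (value << shift)
 * is representable as a signed 32-bit int, 0 otherwise.
 * The shift is done in 64 bits so the check itself is well defined. */
static int shift_fits_in_int32(int32_t value, unsigned shift)
{
    assert(value >= 0 && shift < 32);   /* only non-negative inputs are considered */
    int64_t wide = (int64_t)value << shift;
    return wide <= INT32_MAX;
}
```

With this helper, shift_fits_in_int32(127, 24) returns 1, while shift_fits_in_int32(255, 24) returns 0, which is the case the fuzzer hit.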

@DanBloomberg, this is Leptonica code, so OSS Fuzz triggers a problem via the Tesseract API.

bug

All 18 comments

Looking at the code, I don't see the issue.

All the words are uint32, all the 8 bit pixel values are int32, but each is masked with 0xFF.

So the line in question is just a sum of 4 numbers, each <= 255, and then divided by 4.

Looks like it was fuzzed on 1.77.0, not the current head. So the issue is really on line 2725. And I do see the problem, something I'd have never thought about.

I believe the simple fix is to make all the sval* pixel values uint32. Then when one is pushed << 24, it's still valid. Will check this out.
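
A rough sketch of the idea, not the actual patch: if the four 8-bit values are held in unsigned 32-bit variables, shifting one of them left by 24 is well defined for every value up to 255. The function name pack_four_gray and the byte order (leftmost pixel in the most significant byte) are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the Leptonica fix itself): packs four 8-bit gray
 * values into one 32-bit word, first pixel in the most significant byte.
 * Because the operands are uint32_t, (p0 << 24) cannot overflow for p0 <= 255. */
static uint32_t pack_four_gray(uint32_t p0, uint32_t p1, uint32_t p2, uint32_t p3)
{
    assert(p0 <= 255 && p1 <= 255 && p2 <= 255 && p3 <= 255);
    return (p0 << 24) | (p1 << 16) | (p2 << 8) | p3;
}
```

For example, pack_four_gray(255, 0, 0, 0) yields 0xFF000000, the value that cannot be represented when the same shift is done on a signed int.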

That's right, it is currently using 1.77.0. I wonder whether @guidovranken could add Leptonica to OSS Fuzz, too. Ah, I see that it already exists: https://github.com/google/oss-fuzz/tree/master/projects/leptonica. @guidovranken, is it possible to add Dan and me to that configuration, so we can see the issues for Leptonica found so far?

You can make a PR against google/oss-fuzz that adds you and Dan to auto_ccs in https://github.com/google/oss-fuzz/blob/master/projects/leptonica/project.yaml

But I'm not in charge of the leptonica project or oss-fuzz.

CC @Dor1s @jonathanmetzman

Fix is in at leptonica master.

Is there some place we should log this, or will the fuzzer eventually run on the fix and consider the issue cleared, or ...?

The oss-fuzz system will automatically detect the fix in the next build.

Really? It's in Leptonica, and I would have expected that an update of the configuration is needed. Maybe it can use Leptonica master instead of 1.77.0.

Right, it's using a static version now. I'll make a PR against the tesseract-ocr oss-fuzz project to use Leptonica master.

@stweil I'll send out a PR to add you as a cc here: https://github.com/google/oss-fuzz/blob/master/projects/leptonica/project.yaml . What email should I use?

Thanks, you can use the same e-mail address as for tesseract-ocr.

> Maybe it can use Leptonica master instead of 1.77.0.

Should tesseract also change dependency to the new/fixed version of leptonica?

Up to now there exists no Leptonica release with the fix (it's only in Git master).

So the question is how to get the fix into the major Linux distributions. I think this requires new Leptonica releases 1.74.5, 1.75.4, 1.76.1 and 1.77.1. Otherwise Debian and others would have to patch the code. CC @jbreiden as the maintainer.

@DanBloomberg : What is the plan for 1.78.0 release?

No specific plan. I can do it within the next three weeks.

AFAICT we do not _need_ a release for this bug. I haven't run the fuzzer, but I can't believe there is any memory corruption. The only thing the bug can do is make a few bad pixels in an 8 bpp grayscale image when that particular fast 2x upscaler is used.

You are right, it's not a security problem. Up to now OSS-Fuzz only detected one real security issue (#2298). But of course for OCR a few bad pixels might result in wrong text recognition, so it is good that this is fixed now.

OSS-Fuzz has verified the fix in Leptonica, so this issue can be closed. See details at https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13593 (will be visible for everyone in 30 days).

Actually, I can't see it ... perhaps one needs to be on the auto_ccs list at
oss-fuzz/projects/tesseract-ocr?

but it's likely not important for me to be on it.

On Sun, Mar 10, 2019 at 10:39 AM Stefan Weil notifications@github.com
wrote:

OSS-Fuzz has verified the fix in Leptonica, so this issue can be closed.
See details here
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13593.


https://github.com/google/oss-fuzz/pull/2230 will add @stweil and @DanBloomberg to leptonica auto_ccs.
