Tesseract: Error in boxClipToRectangle: box outside rectangle

Created on 14 Sep 2016 · 8 Comments · Source: tesseract-ocr/tesseract

Hi there, I've got some specific images that produce the following output on Linux:

Tesseract Open Source OCR Engine v3.05.00dev with Leptonica
Error in boxClipToRectangle: box outside rectangle
Error in pixScanForForeground: invalid box

The pictures are OCRed successfully by Tesseract (though without great results). The biggest problem for me, however, is that in OCRopus they don't even get OCRed.

example5
ghoby30c

Any ideas?

bug

All 8 comments

Error in boxClipToRectangle: box outside rectangle
Error in pixScanForForeground: invalid box

Add a white/black frame to the image and no error messages will appear.

convert  427-1.jpg  -bordercolor White -border 10x10 427-1b.jpg

Strange behaviour...
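
For anyone who needs to apply the border workaround above to many files before running OCR, here is a minimal Python sketch that wraps the same ImageMagick command. The helper names (`border_command`, `add_border`, `bordered_name`) are made up for illustration; only the `convert` flags come from the comment above. It assumes ImageMagick's `convert` is on the PATH (newer ImageMagick installs may expose it as `magick` instead).

```python
import subprocess
from pathlib import Path


def border_command(src, dst, border=10, color="White"):
    """Build the argv for the ImageMagick call quoted in the thread.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "convert", str(src),
        "-bordercolor", color,
        "-border", f"{border}x{border}",
        str(dst),
    ]


def bordered_name(src):
    """Derive an output name like 427-1.jpg -> 427-1b.jpg (naming is an assumption)."""
    path = Path(src)
    return path.with_name(path.stem + "b" + path.suffix)


def add_border(src, border=10, color="White"):
    """Run ImageMagick to write a bordered copy next to the source image."""
    dst = bordered_name(src)
    # check=True raises CalledProcessError if convert exits with an error.
    subprocess.run(border_command(src, dst, border, color), check=True)
    return dst
```

Calling `add_border("427-1.jpg")` would write `427-1b.jpg`, which can then be passed to Tesseract in place of the original. This only hides the messages; it does not address the underlying cause discussed further down.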

The biggest problem for me, however, is that in OCRopus they don't even get OCRed.

This place is for bug reports about Tesseract, not OCRopus.

@amitdo I'm getting the same issue just with Tesseract. I'm guessing OCRopus is using Tesseract and that's why he made the issue here.

I'm guessing OCRopus is using Tesseract

Ocropy (and clstm) do not use Tesseract. A VERY OLD version of Ocropus (0.4) did use Tesseract.

Similar issues #468 #1601

These error messages are produced by Leptonica.

They are triggered by a call to pixClipBoxToForeground()

https://github.com/DanBloomberg/leptonica/blob/bbe289cf3f0fe368d5b9eac64df2ccd6e9b05c56/src/pix5.c#L1956

https://github.com/tesseract-ocr/tesseract/search?q=pixClipBoxToForeground

@stweil, this seems like a bug in Tesseract, maybe you can explore it and find its cause.

https://github.com/tesseract-ocr/tesseract/search?q=pixClipBoxToForeground

I noticed that Tesseract does not check the return value from Leptonica's functions (l_ok).

@stweil, this seems like a bug in Tesseract, maybe you can explore it and find its cause.

It's caused by a box with a width or height of 0, but as always in Tesseract it is difficult to find the right fix.
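
To make the last two comments concrete, here is a small pure-Python model of clipping a box to an image rectangle. It is not Leptonica's code: the `Box` type and both functions are hypothetical, and the "outside" test is an assumption about how such a check is typically written. It is only meant to show how a zero-width or zero-height box can be reported as lying outside the rectangle, and why the caller should test the result before using it.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def clip_box_to_rectangle(box: Box, width: int, height: int) -> Optional[Box]:
    """Clip box to the rectangle (0, 0, width, height).

    Returns None when nothing is left, standing in for an error return.
    Simplified model, assumed for illustration only.
    """
    # A box whose right/bottom edge is not past 0, or whose origin is past
    # the far edge, has no overlap. With w == 0 and x == 0 this test fires.
    if (box.x >= width or box.y >= height
            or box.x + box.w <= 0 or box.y + box.h <= 0):
        return None
    x0, y0 = max(box.x, 0), max(box.y, 0)
    x1 = min(box.x + box.w, width)
    y1 = min(box.y + box.h, height)
    # A degenerate box inside the image still clips to zero area.
    if x1 <= x0 or y1 <= y0:
        return None
    return Box(x0, y0, x1 - x0, y1 - y0)


def clip_or_skip(box: Box, width: int, height: int) -> Optional[Box]:
    """Caller-side guard: reject empty boxes first, then check the result."""
    if box.w <= 0 or box.h <= 0:
        return None  # skip degenerate boxes instead of passing them on
    return clip_box_to_rectangle(box, width, height)
```

In this model, `Box(0, 0, 0, 5)` is rejected by the "outside" test even though its origin is inside the image, which matches the observation that a zero-sized box triggers the message. Whether a guard like `clip_or_skip` is the right fix inside Tesseract is exactly the open question in the comment above.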
