Tesseract: setRectangle maybe has an odd thing

Created on 27 Apr 2017 · 14 Comments · Source: tesseract-ocr/tesseract

Today I wanted to restrict recognition to a rectangular area, but the parameters do not seem to behave as documented in baseapi.h:
void SetRectangle(int left, int top, int width, int height);
When I call SetRectangle(126, 40, 1152, 28), it recognizes the region (0, 0, 1152, 28) instead, and I don't know why.

I look forward to your reply. Thank you!

hanoilu

Most helpful comment

Yes, you can use version 3.x instead of version 4.0 if you really need the SetRectangle call. Alternatively, you can create an image corresponding to the rectangle you want to recognise, and recognise that instead. It won't be exactly equivalent, as the bounding boxes would be shifted compared to the ones in the original image, but the bounding boxes are easy to correct.

bpotard on 2 May 2017
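
The sub-image workaround described above can be sketched without any OCR library. This is a minimal illustration, not code from the thread: `GrayImage` and `CropRegion` are hypothetical names, and the image is assumed to be 8-bit grayscale stored row by row with no padding between rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: an 8-bit grayscale image stored row by row.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // expected size: width * height
};

// Copy the rectangle (left, top, width, height) out of `src` into a new image.
// Assumption: the rectangle lies fully inside the source image.
GrayImage CropRegion(const GrayImage& src, int left, int top, int width, int height) {
    assert(left >= 0 && top >= 0 && width > 0 && height > 0);
    assert(left + width <= src.width && top + height <= src.height);

    GrayImage out;
    out.width = width;
    out.height = height;
    out.pixels.reserve(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        // Offset of the first pixel of this row inside the source buffer.
        const std::size_t start =
            static_cast<std::size_t>(top + row) * src.width + left;
        const std::uint8_t* first = src.pixels.data() + start;
        out.pixels.insert(out.pixels.end(), first, first + width);
    }
    return out;
}
```

The cropped buffer could then be handed to the raw-buffer overload `SetImage(const unsigned char* imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line)` with `bytes_per_pixel = 1` and `bytes_per_line = width`. When the image is already a Leptonica `Pix`, `pixClipRectangle` performs a comparable crop.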

All 14 comments

hanoilu on 27 Apr 2017

Hello,

SetRectangle appears to be broken in v4, cf: https://github.com/sirfz/tesserocr/issues/26

In the meantime, you are probably better off creating a sub-image yourself and performing OCR on it.

bpotard on 28 Apr 2017

@bpotard
Thank you very much! I'll try it.

hanoilu on 28 Apr 2017

A few more details, with a minor change to the base API that calls SetRectangle just after loading an image:

diff --git a/api/baseapi.cpp b/api/baseapi.cpp
index 8b2ef07..a63d9f6 100644
--- a/api/baseapi.cpp
+++ b/api/baseapi.cpp
@@ -1204,6 +1204,7 @@ bool TessBaseAPI::ProcessPage(Pix* pix, int page_index, const char* filename,
   PERF_COUNT_START("ProcessPage")
   SetInputName(filename);
   SetImage(pix);
+  SetRectangle(36, 92, 544, 30);
   bool failed = false;

   if (tesseract_->tessedit_pageseg_mode == PSM_AUTO_ONLY) {

Then run tesseract on testing/phototest.tif.

With branch 3.04:

$ tesseract -psm 6 tesseract/testing/phototest.tif stdout
TIFFFetchNormalTag: Warning, ASCII value for tag "Photoshop" does not end in null byte. Forcing it to be null.
Page 1
This is a lot of 12 point text to test the

TIFFFetchNormalTag: Warning, ASCII value for tag "Photoshop" does not end in null byte. Forcing it to be null.

With master branch:

$ tesseract -psm 6 tesseract/testing/phototest.tif stdout
Page 1
s

leptonica-1.74.1 was used in both cases, both on clean debian/jessie64 VMs with identical configurations. ./configure was run with no options. Without the SetRectangle call, both tesseract versions generate perfect output.

bpotard on 28 Apr 2017

@bpotard So you mean version 4.0 has a bug? Can we use the rectangle ("box") functionality from another version instead of 4.0?

hanoilu on 2 May 2017

@bpotard And how do I free the memory? With delete [] utf8text? I ran tesseract on more than 500 images, but it tells me there is not enough memory.

hanoilu on 2 May 2017
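
On the memory question: the comment on `GetUTF8Text` in baseapi.h states that the returned buffer must be freed with `delete []`, so `delete [] utf8text` is the matching call. The sketch below shows one way to make that release automatic. `FakeGetUTF8Text` is a hypothetical stand-in that only mimics the `new[]` allocation, so that the example runs without Tesseract, and `TakeText` is likewise an illustrative name.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for TessBaseAPI::GetUTF8Text(): it returns a
// new[]-allocated, NUL-terminated buffer that the caller owns.
char* FakeGetUTF8Text(const char* recognised) {
    const std::size_t n = std::strlen(recognised) + 1;
    char* buffer = new char[n];
    std::memcpy(buffer, recognised, n);
    return buffer;
}

// Take ownership of a new[]-allocated buffer and return its contents.
// std::unique_ptr<char[]> calls delete[] when it goes out of scope,
// including on early return or exception.
std::string TakeText(char* raw) {
    std::unique_ptr<char[]> owner(raw);
    return owner ? std::string(owner.get()) : std::string();
}
```

With the real API the call would look like `std::string text = TakeText(api->GetUTF8Text());`. When processing many images in a loop, each `Pix` obtained from `pixRead` also needs a matching `pixDestroy(&image)`, and `api->End()` releases the engine's own memory once it is no longer needed.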

it won't be exactly equivalent as the bounding boxes would be shifted compared to the ones in the original image,

@bpotard can you elaborate on why the bounding boxes would be shifted?

abieler on 14 Feb 2018

@abieler Because the bounding boxes of each element (paragraph, word, etc.) would be relative to the "new" sub-image you have created rather than the original image, while setRectangle would normally return bounding boxes relative to the original image. So if you need to know precisely where the recognised text comes from in the original image, you need an additional step: shift the returned bounding boxes back into the original coordinate space. That is admittedly not very hard to do: you just need to add the coordinates of the top left corner of your extracted sub-image to all bounding boxes.

bpotard on 14 Feb 2018
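
The shift described above amounts to adding the crop's top left corner to every coordinate. A minimal sketch, assuming boxes are given as (left, top, right, bottom) in sub-image pixels, which is the form `ResultIterator::BoundingBox` reports; `Box` and `ToOriginalCoords` are hypothetical names introduced for illustration.

```cpp
#include <vector>

// Hypothetical box type: pixel coordinates relative to some image.
struct Box {
    int left;
    int top;
    int right;
    int bottom;
};

// Map a box from sub-image coordinates back to the original image, given
// the top left corner (crop_left, crop_top) where the sub-image was cut.
Box ToOriginalCoords(const Box& b, int crop_left, int crop_top) {
    return Box{b.left + crop_left, b.top + crop_top,
               b.right + crop_left, b.bottom + crop_top};
}

// Convenience overload: shift a whole list of boxes.
std::vector<Box> ToOriginalCoords(const std::vector<Box>& boxes,
                                  int crop_left, int crop_top) {
    std::vector<Box> shifted;
    shifted.reserve(boxes.size());
    for (const Box& b : boxes) {
        shifted.push_back(ToOriginalCoords(b, crop_left, crop_top));
    }
    return shifted;
}
```

For example, a word found at (10, 5, 50, 25) inside a sub-image cut at (36, 92) lies at (46, 97, 86, 117) in the original image.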

Thanks @bpotard! I just started using the API and setRectangle, and found that the OCR quality is very sensitive to the size of the bounding box: 1 px more or less on the y axis makes a huge difference, even though nothing about those regions looks different by eye. Is there a "best practice" for how many pixels there should be between the last pixel row of the characters and the bounding box? Say the text height is 30 px; then the bounding box would be 40 px, adding 5 extra pixels on each side (which seems to work OK in my case). Sorry, I should not abuse GitHub issues for this kind of question...

abieler on 14 Feb 2018
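
The padding heuristic mentioned above (a few extra pixels on each side of the text) can be sketched as a small helper. This only illustrates the poster's rule of thumb and is not an established best practice; `Rect` and `PadAndClamp` are hypothetical names, and the rectangle uses the same (left, top, width, height) convention as SetRectangle.

```cpp
#include <algorithm>

// Hypothetical rectangle type using (left, top, width, height).
struct Rect {
    int left;
    int top;
    int width;
    int height;
};

// Grow `r` by `padding` pixels on every side, then clamp the result so it
// stays inside an image of size image_width x image_height.
Rect PadAndClamp(const Rect& r, int padding, int image_width, int image_height) {
    const int left = std::max(0, r.left - padding);
    const int top = std::max(0, r.top - padding);
    const int right = std::min(image_width, r.left + r.width + padding);
    const int bottom = std::min(image_height, r.top + r.height + padding);
    return Rect{left, top, right - left, bottom - top};
}
```

With 5 px of padding, a 30 px tall text line becomes a 40 px tall region, as in the example above. Near the image border the clamping makes the region smaller than that.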

If you are using the master branch, SetRectangle is probably still broken and will not work; the bug has not been fixed as far as I know. If you really need the functionality, either use the 3.x branch of tesseract, or create your own sub-images and process them as whole images using the normal API. Do not use SetRectangle in tesseract 4.x.

Alternatively, you can try to figure out where the bug in SetRectangle comes from and fix it :-)

bpotard on 14 Feb 2018

Sorry, that was my mistake, I actually am using sub-images and not setRectangle...

abieler on 14 Feb 2018

Can you please provide a test case that demonstrates your problem? I cannot reproduce it with:

Pix *image = pixRead("/usr/src/testing/phototest.tif");
api->SetImage(image);
api->SetRectangle(36, 126, 582, 31);
outText = api->GetUTF8Text();
printf("Region text:\n%s\n", outText);

and the result is:

Region text:
ocr code and see if it works on all types

Which is exactly what e.g. GIMP shows for this area. Or am I missing something?

zdenop on 29 Sep 2018

Closed as not reproducible with current code.

zdenop on 7 Oct 2018
