Tesseract: Application using TessBaseAPI crashing due to copying issues

Created on 4 May 2017  路  11Comments  路  Source: tesseract-ocr/tesseract

Well, I'm trying to implement an application using tesseract API, but so far I couldn't get it working as it crashes always when trying to clean-up all the memory.

After a few hours of dissasembly I've come to the conclusion that the problem is due to TessBaseAPI not having copy-constructor or assignation operator.

I've got a class called OcrEngine, this one creates an instance of TessBaseAPI (let's call it instance T_A), and when I copy an instance of OcrEngine, OcrEngine copy-constructor is called which consequently calls compiler-generated TessBaseAPI copy-constructor, generating a new TessBaseAPI (let's call it instance T_B). As of compiler-generated copy-constructors performs a "fool-copy", both instance T_A and T_B shares the same attributes. And when T_A is destroyed, values of T_B point to non-existant memory regions.


So, if i'm not wrong a simply solution to this problem might be to implement copy-constructors and assignation operators for TessBaseAPI class and the ones instanced by it.

Meanwhile when I make a copy of my class OcrEngine, I just create a new instance of TessBaseAPI and try to replicate all the original TessBaseAPI object configuration.

bug feature request

All 11 comments

Bottom line - Don't copy it.

I don't think that's the solution. The reason why I want to use copy constructor is to avoid allocating sparse memory regions. Moreover the main reason why I need to copy then is to avoid having to reconfigure the class many times in case I'd had a OCR thread pool...

In my opinion I don't see why copy constructors/assignation operators are evil. I mean the main idea is to use then but there are lots of situations where you need it. I just think is a dumb reasoning...

Moreover, If only TessBaseAPI were using reference-counting for all its internal objects, I wouldn't need to implement these methods, but that's not the case..

@stweil, I think copying the api object should be disabled with C++11 =delete.

@stweil, another try...

Please review pull request #874 #2721.

Copying is now no longer possible, so I close this issue. An implementation with reference counting can be implemented later. Personally I think there are other things with a higher priority before that.

The removal of the copy constructor breaks tesserocr which currently uses it, maybe here. I still don't know how to fix that. @sirfz?

You can try to ask about the needed changes in tesserocr here:

https://groups.google.com/group/cython-users

The solution was surprisingly easy: just delete unneeded code: https://github.com/sirfz/tesserocr/pull/200.

Was this page helpful?
0 / 5 - 0 ratings