Where did you find `snum`?
Tesseract installed from Homebrew on macOS 10.14.4.
There is reference to it here: https://github.com/Homebrew/homebrew-core/pull/36786
$ tesseract --version
tesseract 4.0.0
leptonica-1.78.0
libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.2 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found SSE
$ tesseract --list-langs
List of available languages (3):
eng
osd
snum
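To spot an unexpected entry like this programmatically, the listing above can be parsed and compared against the languages you expect. This is a minimal sketch: the `EXPECTED` set and both function names are illustrative assumptions, not part of Tesseract.

```python
# Assumption: the languages this installation is expected to provide.
EXPECTED = {"eng", "osd"}

def parse_list_langs(output):
    """Return the language names from `tesseract --list-langs` output."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages (N):" header.
        if not line or line.startswith("List of available"):
            continue
        langs.append(line)
    return langs

def unexpected_langs(output, expected=EXPECTED):
    """Return installed languages that are not in the expected set."""
    return [lang for lang in parse_list_langs(output) if lang not in expected]

# Sample text copied from the listing in this comment.
SAMPLE = """List of available languages (3):
eng
osd
snum
"""

print(unexpected_langs(SAMPLE))  # prints ['snum']
```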
You should contact its packager for an explanation. This issue tracker is only for projects and files provided by the Tesseract team (see https://github.com/tesseract-ocr).
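Thanks, just knowing that it wasn't a real language was enough to help me figure out what was going on. I was able to find out that snum = Serial Number. It looks like a third-party model for serial number identification.
https://memex.jpl.nasa.gov/GHCI16.pdf
https://github.com/varenc/homebrew-core/blob/251f7b8d16ee286d80de02e19882a350439a59d0/Formula/tesseract.rb#L39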
resource "snum" do
  url "https://github.com/USCDataScience/counterfeit-electronics-tesseract/raw/319a6eeacff181dad5c02f3e7a3aff804eaadeca/Training%20Tesseract/snum.traineddata"
  sha256 "36f772980ff17c66a767f584a0d80bf2302a1afa585c01a226c1863afcea1392"
end
Hopefully someone else will find this issue in the future and it will help them.
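The resource block above pins the model file to a SHA-256 digest. A minimal sketch of checking a local copy of snum.traineddata against that digest follows. The function names are illustrative, and the file's location on your machine is an assumption that depends on your installation.

```python
import hashlib

# Digest copied from the resource block quoted in this thread.
EXPECTED_SHA256 = "36f772980ff17c66a767f584a0d80bf2302a1afa585c01a226c1863afcea1392"

def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_formula(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` has the digest pinned in the formula."""
    return sha256_of(path) == expected
```

For example, `matches_formula("/path/to/tessdata/snum.traineddata")` would return True only if the file is byte-for-byte the one the formula downloads. The path shown is a placeholder.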
@johnthagen Thanks for posting the info.
snum seems to be trained for the legacy (base) Tesseract engine, so with Tesseract 4 it might need to be used with --oem 0.
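As a rough sketch of what that invocation could look like from Python, the helper below builds the command line with `-l snum --oem 0`. The function names and the image path are hypothetical, and whether the snum model actually loads under the legacy engine is the untested assumption stated in the comment above.

```python
import subprocess

def build_command(image_path, lang="snum", oem=0):
    """Build a tesseract command line for the given image.

    "stdout" as the output base tells tesseract to print the recognized
    text to standard output instead of writing a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang, "--oem", str(oem)]

def recognize(image_path, lang="snum", oem=0):
    """Run tesseract and return the recognized text.

    Raises subprocess.CalledProcessError if tesseract exits with an error,
    for example when the model cannot be used with the selected engine.
    """
    result = subprocess.run(
        build_command(image_path, lang, oem),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `recognize("label.png")` would run `tesseract label.png stdout -l snum --oem 0`. The file name label.png is a placeholder.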
Thanks for the info @Shreeshrii!