Hi all,
I'm trying to run the ./bin/run-ldc93s1.sh script. I have installed all Python dependencies (TensorFlow, NumPy, etc.), extracted the pre-built DeepSpeech binary (?) from the 'native_client.tar.xz' file into the repository's 'native_client' directory, and specified it (see below), but I'm running into trouble. Here's a sample output:
./bin/run-ldc93s1.sh
WARNING: libdeepspeech failed to load, resorting to deprecated code
I STARTING Optimization
I Training of Epoch 0 - loss: 332.397491
I Training of Epoch 1 - loss: 278.272827
I Training of Epoch 2 - loss: 185.577194
I Training of Epoch 3 - loss: 177.880112
I Training of Epoch 4 - loss: 207.362778
I FINISHED Optimization - training time: 0:00:09
Loading the LM will be faster if you build a binary file.
Reading data/lm/lm.binary
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
terminate called after throwing an instance of 'lm::FormatLoadException'
what(): native_client/kenlm/lm/read_arpa.cc:65 in void lm::ReadARPACounts(util::FilePiece&, std::vector
first non-empty line was "version https://git-lfs.github.com/spec/v1" not \data. Byte: 43
Aborted (core dumped)
Any ideas what is wrong here?
Thanks,
Seb
Hi Seb. You don't have an lm.binary file!
Yours is just a text file.
Check the DeepSpeech issues to find out how to create an lm.binary file with the KenLM tools.
No, you don't need to create your own lm.binary, the problem is you haven't installed Git LFS properly. Make sure Git LFS is installed properly before you clone the repository.
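For anyone hitting the same error: the line quoted in the exception ("version https://git-lfs.github.com/spec/v1") is the first line of a Git LFS pointer file, which is what ends up on disk when the repository is cloned without Git LFS. Below is a minimal sketch for checking this before running the script. The helper name `is_lfs_pointer` is made up for illustration and is not part of DeepSpeech, and checking data/lm/trie as well assumes it is also stored via Git LFS, as suggested in this thread.

```python
from pathlib import Path

# First bytes of a Git LFS pointer file (taken from the error message above).
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts with the Git LFS pointer signature.

    Hypothetical helper for illustration; not part of DeepSpeech or Git LFS.
    """
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_SIGNATURE))
    return head == LFS_SIGNATURE


if __name__ == "__main__":
    # Default language model paths mentioned in this thread.
    for candidate in (Path("data/lm/lm.binary"), Path("data/lm/trie")):
        if not candidate.exists():
            print(f"{candidate}: missing")
        elif is_lfs_pointer(candidate):
            print(f"{candidate}: Git LFS pointer, the real file was not downloaded")
        else:
            print(f"{candidate}: looks like real data ({candidate.stat().st_size} bytes)")
```

If a file is reported as a pointer, installing Git LFS and then recloning (or running `git lfs pull` inside the existing clone) should replace it with the real content.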
Oops!
Well, my language is French, so I need the complete setup...
SebastianScherer88, it should be easier for you.
@reuben: just to clarify:
decoder_library_path - points to the extracted content of the native_client tar
lm_binary_path - points to the KenLM language model that you created and that is part of the repo, to be downloaded while cloning using Git LFS
lm_trie_path - points to another (?) language model that you created and that is part of the repo, to be downloaded while cloning using Git LFS
--decoder_library_path should point to the libctc_decoder_with_kenlm.so file that is in the native_client archive.
--lm_binary_path should point to data/lm/lm.binary (that's the default value, so you don't need to change it)
--lm_trie_path should point to data/lm/trie (that's the default value, so you don't need to change it)
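To make the three flags above concrete, here is a rough sketch that checks the paths exist and assembles the corresponding arguments. The function names are hypothetical, and the decoder location used in the example (native_client/libctc_decoder_with_kenlm.so) is an assumption based on where the archive was extracted in this thread; adjust it to your own layout.

```python
import os


def decoder_flags(decoder_library_path,
                  lm_binary_path="data/lm/lm.binary",
                  lm_trie_path="data/lm/trie"):
    """Build the three decoder-related flags discussed above as a list of strings.

    Hypothetical helper; the defaults are the default values quoted in this thread.
    """
    return [
        f"--decoder_library_path={decoder_library_path}",
        f"--lm_binary_path={lm_binary_path}",
        f"--lm_trie_path={lm_trie_path}",
    ]


def missing_paths(*paths):
    """Return the subset of `paths` that are not existing regular files."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    # Assumed location of the decoder library after extracting native_client.tar.xz.
    decoder = "native_client/libctc_decoder_with_kenlm.so"
    missing = missing_paths(decoder, "data/lm/lm.binary", "data/lm/trie")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print(" ".join(decoder_flags(decoder)))
```

Since the last two flags already default to the paths shown, in practice only --decoder_library_path should need to be passed explicitly.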
OK, thanks a lot. I'll just try recloning with Git LFS set up properly and check back if things go wrong. Feel free to close, thanks again!
Update: It's working, case closed! :)
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.