Wav2letter: How to use wav2letter@anywhere in production?

Created on 20 Nov 2020 · 5 Comments · Source: flashlight/wav2letter

Question

How do I use wav2letter@anywhere as a library?

Is there an example of how to deploy this in production? The pipeline example showcases w2l inference, but I assume it is not what you would want to use when building, say, an API.

I'm having trouble figuring this out myself since I'm new to CMake.

nihiluis

All 5 comments

Same question. I can't find a clear guide on how to create and train a model and then use wav2letter to build an API.

vovkapultik on 20 Nov 2020

For an API I'd use AudioToWords.h, for example:
https://github.com/facebookresearch/flashlight/blob/36b20581d1c5ed4a8b69f7859223b690de0c3723/flashlight/app/asr/experimental/inference/inference/examples/AudioToWords.h#L22

MultithreadedStreamingASRExample.cpp shows an example of its use:
https://github.com/facebookresearch/flashlight/blob/36b20581d1c5ed4a8b69f7859223b690de0c3723/flashlight/app/asr/experimental/inference/inference/examples/MultithreadedStreamingASRExample.cpp

For training, train a TDS+CTC model as described at:
https://github.com/facebookresearch/wav2letter/tree/master/recipes/sota/2019/librispeech#tds-ctc-training

and then use the converter:
https://github.com/facebookresearch/flashlight/blob/36b20581d1c5ed4a8b69f7859223b690de0c3723/flashlight/app/asr/experimental/tools/StreamingTDSModelConverter.cpp

to convert it from a training module to an inference module.

avidov on 20 Nov 2020

OK, but how do I consume wav2letter@anywhere as a library? For now I just want to add it to my executable, something like pip install torch. The build structure is fairly complicated, since the inference part depends on the wav2letter libraries but not on wav2letter itself.

nihiluis on 21 Nov 2020

@nihiluis: There is no pip install equivalent for wav2letter++. It is a collection of C++ utilities that you would have to put together yourself to create, say, a streaming REST API. Take a look at the inference examples to see how some of these utilities have been combined into streaming ASR examples.
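
To make "putting the utilities together" a bit more concrete, here is a minimal sketch of the glue an API handler might need. It assumes only that the transcription step can be expressed as a function that reads audio from one stream and writes words to another, which is roughly the shape the helpers in AudioToWords.h appear to have. Everything named here (StreamTranscriber, TranscriptionService, fakeTranscriber) is hypothetical and not part of wav2letter; the stand-in transcriber only counts bytes so the sketch compiles without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical callable shape (an assumption, not a wav2letter type):
// read audio from one stream, write recognized words to another.
using StreamTranscriber = std::function<void(std::istream&, std::ostream&)>;

// Hypothetical wrapper that an HTTP handler could call once per request.
class TranscriptionService {
 public:
  explicit TranscriptionService(StreamTranscriber transcriber)
      : transcriber_(std::move(transcriber)) {
    assert(transcriber_ && "a callable transcriber is required");
  }

  // Takes one request body (raw audio bytes) and returns the transcript text.
  std::string transcribe(const std::string& audioBytes) const {
    std::istringstream audio(audioBytes, std::ios::in | std::ios::binary);
    std::ostringstream words;
    transcriber_(audio, words);
    return words.str();
  }

 private:
  StreamTranscriber transcriber_;
};

// Stand-in for the real model call: it only reports how many bytes it read.
inline void fakeTranscriber(std::istream& audio, std::ostream& words) {
  std::size_t byteCount = 0;
  char byte = 0;
  while (audio.get(byte)) {
    ++byteCount;
  }
  words << "received " << byteCount << " bytes";
}
```

In a real build, fakeTranscriber would be replaced by a lambda that captures the loaded acoustic model and decoder and forwards both streams to the streaming helper from AudioToWords.h. Its exact argument list is not reproduced here, so check the header at the commit linked above.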