Do you intend to eventually support a decoder capable of streaming recognition? Given the claimed speed of the library it would definitely make sense to apply it in a real-time environment.
I'm also curious how much work that would take: the CNN-based architecture looks like it could be adapted to streaming more easily than attention-based architectures.
@pzelasko - You are right. With CNN architectures it becomes easy to support a streaming decoder. We have already implemented one, but it is not ready to be open-sourced right now (it is tied to internal FB tools for streaming utilities).
This is in our future plans but we can't guarantee any timelines.
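To illustrate why convolutional layers lend themselves to streaming, here is a minimal sketch, not wav2letter's actual decoder or API. It assumes a single causal 1-D convolution; the names `StreamingConv1d` and `offline_conv1d` are hypothetical. The layer only needs to remember the last `kernel_size - 1` input samples between chunks, so chunked processing reproduces the offline output exactly.

```python
from typing import List, Sequence


class StreamingConv1d:
    """Hypothetical causal 1-D convolution that consumes input chunk by chunk."""

    def __init__(self, kernel: Sequence[float]) -> None:
        self.kernel = list(kernel)
        # Left context carried between chunks: the last (kernel_size - 1)
        # samples, initialised to zeros (i.e. zero padding on the left).
        self.history: List[float] = [0.0] * (len(self.kernel) - 1)

    def process(self, chunk: Sequence[float]) -> List[float]:
        k = len(self.kernel)
        buf = self.history + list(chunk)
        # One output per new input sample; each output sees only the
        # current sample and the k - 1 samples before it.
        out = [
            sum(self.kernel[j] * buf[i + j] for j in range(k))
            for i in range(len(chunk))
        ]
        # Keep the trailing k - 1 samples as context for the next chunk.
        self.history = buf[len(buf) - (k - 1):]
        return out


def offline_conv1d(kernel: Sequence[float], signal: Sequence[float]) -> List[float]:
    """Reference: run the same causal convolution over the whole signal at once."""
    return StreamingConv1d(kernel).process(signal)


if __name__ == "__main__":
    kernel = [0.5, -1.0, 2.0]
    signal = [float(x) for x in range(10)]

    layer = StreamingConv1d(kernel)
    streamed: List[float] = []
    for start in range(0, len(signal), 3):  # feed 3 samples at a time
        streamed += layer.process(signal[start:start + 3])

    print(streamed == offline_conv1d(kernel, signal))
```

An attention layer that attends over the whole utterance has no such fixed-size state, which is the contrast drawn in the question above.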
@vineelpratap I'm really interested in being able to do streaming inference. Would you be open to a PR implementing it? Or would that potentially be wasted effort if FB's internal decoder comes out soon?
@realdoug We will open-source the initial version of the online decoder in a week. After it is released, we will be happy to discuss any API changes you might need to make it useful for your use case.
Hey @vineelpratap, are you still planning to release it? :)
cc @xuqiantong
@pzelasko — the online decoder is available as of https://github.com/facebookresearch/wav2letter/commit/eda528dd38a509e35cb96ddca492095ec88da1cc.