Fairseq: how do I use the features extracted from wav2vec in wav2letter++?

Created on 27 Sep 2019 · 11 Comments · Source: pytorch/fairseq

Hi, I trained wav2vec and got the model parameters. Now, how do I use the xx.pt checkpoint to train wav2letter? I want to see the ASR results.

question

Most helpful comment

Can anybody help a bit here? We are kind of stuck! After extracting the embeddings from the downstream data, how do we now provide them to wav2letter++? There is no documentation available for that.
@alexeib @myleott

All 11 comments

I saved the features in Kaldi data format and trained my own ASR model rather than a wav2letter++ model.
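
For readers who want to follow this Kaldi route, here is a minimal sketch of dumping per-utterance feature matrices in Kaldi's text archive format (the utterance IDs and matrix shapes are made up for illustration; real wav2vec context features would be frames x 512):

```python
import numpy as np

def write_kaldi_text_ark(path, feats):
    """Write {utt_id: (frames, dims) ndarray} as a Kaldi text archive (ark,t)."""
    with open(path, "w") as f:
        for utt_id, mat in feats.items():
            f.write(f"{utt_id}  [\n")
            for i, row in enumerate(mat):
                vals = " ".join(f"{v:.6f}" for v in row)
                # Kaldi closes the matrix with " ]" on the last row's line
                suffix = " ]" if i == len(mat) - 1 else ""
                f.write(f"  {vals}{suffix}\n")

# Two fake feature matrices standing in for wav2vec context vectors
feats = {
    "utt1": np.random.randn(4, 8).astype(np.float32),
    "utt2": np.random.randn(6, 8).astype(np.float32),
}
write_kaldi_text_ark("feats.ark.txt", feats)
```

On the Kaldi side, copy-feats ark,t:feats.ark.txt ark:feats.ark converts this text archive into the binary form that the training tools expect.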

@leixiaoning can you provide some details about this please? thank you.

@leixiaoning did you figure it out? I too am stuck at the same point. Two questions, in fact:

  1. What are the "task waves" in PYTHONPATH=/path/to/fairseq python scripts/wav2vec_featurize.py --input /path/to/task/waves --output /path/to/output --model /model/path/checkpoint_best.pt --split train valid test ?
  2. How are the train, valid, and test splits fed to wav2letter++, and what is their output format?
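
On question 2: wav2letter++ is typically driven by per-split "list" files describing each sample. A minimal sketch of generating one such list, assuming the commonly documented layout of one line per sample (sample id, input path, duration in milliseconds, transcript); the ids, paths, and durations below are made up, and the exact format should be checked against the wav2letter data-preparation docs for your version:

```python
def write_w2l_list(path, samples):
    """samples: iterable of (sample_id, input_path, duration_ms, transcript)."""
    with open(path, "w") as f:
        for sid, fpath, dur, text in samples:
            # one whitespace-separated sample per line
            f.write(f"{sid} {fpath} {dur} {text}\n")

# Hypothetical examples; paths would point at extracted wav2vec features
samples = [
    ("train_0001", "/data/feats/train_0001.h5", 2530.0, "hello world"),
    ("train_0002", "/data/feats/train_0002.h5", 1810.0, "good morning"),
]
write_w2l_list("train.lst", samples)
```

A matching valid.lst and test.lst would be written the same way, one file per --split name used at feature-extraction time.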

I tried to train a speech model (DeepSpeech2) on LibriSpeech using the context representations (C) extracted from the pre-trained wav2vec model provided in the repo, but the model is not converging even after several epochs.
@alexeib any help on this?
Did you change the model architecture to make it work, or did you achieve state-of-the-art results just by replacing the spectrogram with the context representations, using the same architecture described in the DeepSpeech2 or wav2letter paper?

Hi @rajeevbaalwan!
Can you please share how you incorporated these embeddings into the DeepSpeech2 model?
I have been struggling with it for a long time.
Thanks in advance!

I need advice on this topic as well. Thanks.

@rajeevbaalwan @alexeib
Can anybody elaborate on this, please?

@leixiaoning @marcosmacedo check the issues of wav2letter: https://github.com/facebookresearch/wav2letter/issues/436
@maltium has a fork that accepts HDF5 as input: https://github.com/maltium/wav2letter/tree/feature/loading-from-hdf5
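
For that fork, the extracted features would need to end up in HDF5 files. A minimal sketch of the h5py mechanics, using made-up utterance IDs and one dataset per utterance; the exact layout the fork expects (one file per utterance vs. one dataset per utterance) should be checked against its README:

```python
import h5py
import numpy as np

# Fake context features for two utterances (frames x 512 in the real case)
feats = {
    "utt1": np.random.randn(120, 512).astype(np.float32),
    "utt2": np.random.randn(95, 512).astype(np.float32),
}

# Write each utterance's feature matrix as a named dataset
with h5py.File("wav2vec_feats.h5", "w") as f:
    for utt_id, mat in feats.items():
        f.create_dataset(utt_id, data=mat)
```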

Sorry, I just saw this. We just replaced the spectrogram features in wav2letter with the wav2vec ones. You can extract the features as shown in the examples doc and feed them into any ASR system you'd like, and it will work (e.g. we have also tried bi-LSTMs).
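
To make the bi-LSTM alternative concrete, here is a minimal PyTorch sketch of an acoustic model that consumes wav2vec context vectors (512-dim frames) in place of spectrogram frames. The layer sizes and vocabulary size are assumptions for illustration, not the configuration the fairseq authors used:

```python
import torch
import torch.nn as nn

class BiLSTMCTC(nn.Module):
    """Bi-LSTM acoustic model over 512-dim wav2vec context features."""
    def __init__(self, feat_dim=512, hidden=320, vocab=29):  # vocab size is a made-up assumption
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=3,
                            bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, vocab)  # includes the CTC blank symbol

    def forward(self, x):           # x: (batch, frames, feat_dim)
        out, _ = self.lstm(x)
        return self.proj(out)       # (batch, frames, vocab) logits for CTC loss

model = BiLSTMCTC()
feats = torch.randn(2, 100, 512)    # fake wav2vec context features
logits = model(feats)
```

Training would pair these per-frame logits with torch.nn.CTCLoss against character targets, exactly as with spectrogram inputs; only the input feature dimension changes.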

@alexeib could you share your wav2letter hyperparameters and learning rate, please?
