The paper "Unsupervised Cross-lingual Representation Learning for Speech Recognition" describes training the wav2vec2 model on multilingual data (Mozilla CommonVoice and other datasets), rather than only the English data used for the wav2vec2 models already available in fairseq. I notice that several of the authors overlap with this project (including @alexeib), and the paper mentions that the model was developed in fairseq. Would you consider making the pretrained multilingual models public?
Thank you for releasing the wav2vec2 models. I've tried finetuning them on other languages, and as expected from the cross-language experiments with monolingual models described in Table 1 of the XLSR paper, performance is worse on languages other than English, especially on lower-resource ones. A multilingual pretrained model would be very valuable to a wider community, in addition to the already valuable English models.
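For reference, here is a minimal sketch of how I load a released English checkpoint in fairseq before finetuning, assuming a local checkpoint path (`/path/to/wav2vec_small.pt` is a placeholder). It uses fairseq's standard `checkpoint_utils.load_model_ensemble_and_task` loader and runs the pretrained encoder in feature-extraction mode as a sanity check; the exact forward arguments may differ slightly between fairseq versions:

```python
import torch
import fairseq

# Placeholder path to a pretrained wav2vec2 checkpoint; adjust as needed.
ckpt_path = "/path/to/wav2vec_small.pt"

# fairseq's standard loader returns (models, config, task).
models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0]
model.eval()

# One second of dummy 16 kHz audio, batch size 1.
wav = torch.randn(1, 16000)

with torch.no_grad():
    # features_only=True returns contextualized encoder representations
    # without computing the pretraining (contrastive) objective.
    out = model(wav, features_only=True, mask=False)

print(out["x"].shape)  # (batch, frames, hidden_dim)
```

If the sanity check passes, the same checkpoint path can be passed to the finetuning configs under `examples/wav2vec` to train a CTC head on the target language.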
this should be coming soon
Can't wait to use that model!