Espnet: Question: Train a pre-trained TTS transformer phoneme based with new data

Created on 7 May 2020 · 4 comments · Source: espnet/espnet

I'm trying to continue training a pre-trained, phoneme-based (not character-based) Transformer TTS model on new data, and realized that I could not do so because the new data contains phonemes that did not exist in the original dataset. I'm using the LJSpeech dataset with transformer.v3.single.

My question is: is it possible to create a new model with the new inputs, then somehow load the weights corresponding to the inputs that were already trained and leave the new inputs randomly initialized (or something like that)? This way the model could, in theory, train on both the new and the old phonemes.

Question TTS

marcosnetopires

All 4 comments

It is possible by hacking the source code to correctly extend the phoneme set and the corresponding matrix elements, but we don't support such a function.
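For readers wondering what "extending the phoneme set and corresponding matrix elements" could look like, here is a minimal sketch in plain Python. It is not ESPnet code: the function name `extend_embedding`, the toy vocabulary, and the random initialization scale are all assumptions made for illustration. Rows of the old embedding table are copied for phonemes that already existed, and new phonemes get randomly initialized rows. With a real checkpoint, the same copy would be applied to the rows of the model's embedding weight tensor.

```python
import random


def extend_embedding(old_rows, old_vocab, new_vocab, dim, seed=0):
    """Build an embedding table for new_vocab, reusing rows of known tokens.

    old_rows:  list of embedding rows, aligned with old_vocab
    new_vocab: token list for the new model (may reorder or add tokens)
    Returns (new_rows, reused_tokens).
    """
    rng = random.Random(seed)
    old_index = {tok: i for i, tok in enumerate(old_vocab)}
    new_rows, reused = [], []
    for tok in new_vocab:
        if tok in old_index:
            # Known phoneme: copy the trained row.
            new_rows.append(list(old_rows[old_index[tok]]))
            reused.append(tok)
        else:
            # New phoneme: random row (the scale here is an assumption).
            new_rows.append([rng.gauss(0.0, dim ** -0.5) for _ in range(dim)])
    return new_rows, reused


# Toy data, hypothetical: 4 old phonemes, 2 new ones, embedding size 3.
old_vocab = ["<unk>", "AH0", "K", "T"]
old_rows = [[float(i)] * 3 for i in range(len(old_vocab))]
new_vocab = ["<unk>", "AH0", "K", "T", "ZH", "OY1"]

new_rows, reused = extend_embedding(old_rows, old_vocab, new_vocab, dim=3)
print(len(new_rows), reused)
```

Matching rows by token rather than by position means the copy still works if the new dictionary orders the phonemes differently from the old one.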

sw005320 on 8 May 2020

This may be slightly different from what you want to do, but we support partial loading of a pretrained model.

See https://github.com/espnet/espnet/issues/1889#issuecomment-621245702.

By using this option, we can load the pretrained model except for the embedding part.
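The idea of loading everything except the embedding can be sketched as a filter over parameter names. This is only an illustration, not the ESPnet option referred to above: the function `select_loadable`, the parameter names, and the shapes are hypothetical. Parameters are represented here by their shapes; with real checkpoints one would compare tensor shapes in the same way and then load only the selected entries.

```python
def select_loadable(pretrained_shapes, model_shapes,
                    exclude_prefixes=("encoder.embed",)):
    """Split pretrained parameter names into those to load and those to skip.

    A parameter is skipped if its name starts with an excluded prefix,
    if the new model has no parameter of that name, or if shapes differ.
    """
    load, skip = [], []
    for name, shape in pretrained_shapes.items():
        excluded = name.startswith(tuple(exclude_prefixes))
        if excluded or model_shapes.get(name) != shape:
            skip.append(name)
        else:
            load.append(name)
    return load, skip


# Hypothetical names and shapes: the new model has 46 phonemes instead of 40.
pretrained_shapes = {
    "encoder.embed.0.weight": (40, 8),
    "encoder.layer0.weight": (8, 8),
    "decoder.layer0.weight": (8, 8),
}
model_shapes = {
    "encoder.embed.0.weight": (46, 8),
    "encoder.layer0.weight": (8, 8),
    "decoder.layer0.weight": (8, 8),
}

load, skip = select_loadable(pretrained_shapes, model_shapes)
print("load:", load)
print("skip:", skip)
```

The skipped embedding keeps its fresh initialization in the new model and is learned from scratch during training.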

kan-bayashi on 8 May 2020

I'll check it out! thanks

marcosnetopires on 8 May 2020

I was finally able to retrain a phoneme-based model by doing something along the lines of editing the run.sh file to load the dictionary file and not change it in stage 2 (based on #1887).
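When the original dictionary is reused unchanged, it can help to check which phonemes in the new data are missing from it. The sketch below assumes a dictionary file with one `token id` pair per line and space-separated phoneme transcripts; both the format and the helper names (`load_dict`, `find_unseen`) are assumptions for illustration, not taken from run.sh.

```python
def load_dict(lines):
    """Parse 'token id' lines into a token -> id mapping."""
    token_to_id = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        token, idx = line.split()
        token_to_id[token] = int(idx)
    return token_to_id


def find_unseen(token_to_id, utterances):
    """Return the sorted set of tokens that are not in the dictionary."""
    unseen = set()
    for utt in utterances:
        unseen.update(tok for tok in utt.split() if tok not in token_to_id)
    return sorted(unseen)


# Toy dictionary and transcripts, hypothetical.
dict_lines = ["<unk> 1", "AH0 2", "K 3", "T 4"]
token_to_id = load_dict(dict_lines)
unseen = find_unseen(token_to_id, ["K AH0 T", "ZH AH0 T"])
print(unseen)
```

An empty result would suggest the pretrained dictionary already covers the new data; any tokens listed would otherwise not have their own embedding.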

marcosnetopires on 21 May 2020
