Espnet: Question: Train a pre-trained phoneme-based TTS transformer with new data

Created on 7 May 2020  ·  4 Comments  ·  Source: espnet/espnet

I'm trying to train a pre-trained Transformer TTS model based on phonemes (not characters), but I could not do so because the new data contains phonemes that did not exist in the original dataset. I'm using the LJSpeech dataset with transformer.v3.single.

My question is: is it possible to create a new model with the new inputs, then somehow load the weights corresponding to the inputs that were already trained and leave the new inputs randomly initialized (or something like that)? This way, the model could in theory train with both new and old phonemes.

Question TTS

All 4 comments

It is possible by hacking the source code to correctly extend the phoneme set and the corresponding matrix elements, but we don't support such a function.
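To make the idea of "extending the phoneme set and the corresponding matrix elements" concrete, here is a minimal sketch in plain Python. It is not ESPnet code: the function name `extend_embedding`, the 0-based contiguous indices, and the sample phonemes are all assumptions made for illustration. In a real checkpoint the table would be a tensor under some parameter name, which is not shown here.

```python
import random


def extend_embedding(old_vocab, old_rows, new_vocab, seed=0, scale=0.1):
    """Build an embedding table for new_vocab, reusing rows of known phonemes.

    old_vocab and new_vocab map token -> row index (assumed 0-based and
    contiguous); old_rows holds one equal-length list of floats per old token.
    """
    rng = random.Random(seed)
    dim = len(old_rows[0])
    new_rows = [None] * len(new_vocab)
    for token, new_idx in new_vocab.items():
        if token in old_vocab:
            # Known phoneme: copy the trained vector to its new position.
            new_rows[new_idx] = list(old_rows[old_vocab[token]])
        else:
            # Unseen phoneme: start from a small random vector.
            new_rows[new_idx] = [rng.gauss(0.0, scale) for _ in range(dim)]
    return new_rows


# Hypothetical toy data: three trained phonemes plus one new phoneme "ZH".
old_vocab = {"<unk>": 0, "AH0": 1, "K": 2}
old_rows = [[0.0, 0.0], [0.5, -0.5], [1.0, 2.0]]
new_vocab = {"<unk>": 0, "AH0": 1, "K": 2, "ZH": 3}
new_rows = extend_embedding(old_vocab, old_rows, new_vocab)
```

Rows for phonemes that already existed keep their trained values, while the row for the new phoneme is randomly initialized and would be learned during further training.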

Maybe slightly different from what you want to do, but we support partial loading of a pretrained model.

See https://github.com/espnet/espnet/issues/1889#issuecomment-621245702.

By using this option, you can load the pretrained model except for the embedding part.
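The general idea behind loading everything except the embedding can be sketched as filtering the pretrained parameters by name before loading them. This is only an illustration of the technique, not ESPnet's implementation of the option; the helper `drop_prefixes` and the parameter names are hypothetical.

```python
def drop_prefixes(state_dict, excluded_prefixes):
    """Return a copy of state_dict without keys that start with an excluded prefix."""
    return {
        key: value
        for key, value in state_dict.items()
        if not any(key.startswith(prefix) for prefix in excluded_prefixes)
    }


# Hypothetical parameter names and placeholder values, for illustration only.
pretrained = {
    "encoder.embed.weight": "old embedding",
    "encoder.layer0.weight": "encoder weights",
    "decoder.layer0.weight": "decoder weights",
}
to_load = drop_prefixes(pretrained, ["encoder.embed."])

# With PyTorch, a filtered dict like this can be passed to
# model.load_state_dict(to_load, strict=False), which leaves the skipped
# parameters at their fresh initialization.
```

The embedding is then trained from scratch on the new phoneme set, while the rest of the network starts from the pretrained weights.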

I'll check it out! Thanks.

I was finally able to retrain a phoneme-based model by doing something along the lines of editing the run.sh file to load the dictionary file and not change it in stage 2 (based on #1887).
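When a fixed dictionary is reused like this, it can help to check which phonemes in the new data have no entry of their own in it. The sketch below assumes the dictionary file has one `token index` pair per line; the helper names and sample phonemes are hypothetical.

```python
def parse_dict(lines):
    """Parse 'token index' lines into a token -> index mapping."""
    vocab = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        token, index = line.rsplit(maxsplit=1)
        vocab[token] = int(index)
    return vocab


def unknown_tokens(vocab, utterances):
    """Return the sorted tokens that occur in utterances but not in vocab."""
    seen = {token for utterance in utterances for token in utterance.split()}
    return sorted(seen - set(vocab))


# Hypothetical dictionary contents and phoneme sequences.
dict_lines = ["<unk> 1", "AH0 2", "K 3", "T 4"]
vocab = parse_dict(dict_lines)
missing = unknown_tokens(vocab, ["K AH0 T", "ZH AH0"])
```

Here `missing` lists the phonemes that the reused dictionary does not cover, which shows how much of the new data falls outside the pretrained phoneme set.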
