Fairseq: Resume training by specifying the model you'd like to resume from using `--restore-file <path to checkpoint>`.

Created on 27 Sep 2019 · 7 comments · Source: pytorch/fairseq

Yes, you can resume training by specifying the model you'd like to resume from using --restore-file <path to checkpoint>.
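For example, a minimal sketch of such an invocation (the data directory, checkpoint path, and architecture here are placeholders, not taken from the thread):

```shell
# Resume training from a previously saved checkpoint.
# data-bin/my-dataset and the checkpoint path are hypothetical examples.
fairseq-train data-bin/my-dataset \
    --arch lstm \
    --restore-file checkpoints/checkpoint_last.pt \
    --save-dir checkpoints
```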

_Originally posted by @lematt1991 in https://github.com/pytorch/fairseq/issues/1182#issuecomment-535507612_

The first model was trained with the LSTM architecture, and the second was also LSTM, started with the `--restore-file` option. The two runs used separate data files (same language pair).
Error: architecture mismatch.


All 7 comments

Your vocabulary size probably changed; see the "size mismatch" in the error.

Yes, the vocab size is different because they are two different datasets, not the same one.

It's not going to be possible to restore from a checkpoint when the vocabulary size differs: the input/output embedding matrices will be the wrong size. This is not a bug in the code. You need to decide how you want to handle this; most likely you want to re-process your second dataset with the same dictionary as the first.
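One way to do that re-processing is to pass the first run's dictionary files to `fairseq-preprocess` when binarizing the second dataset. A sketch, assuming hypothetical directory names and language codes:

```shell
# Binarize the second dataset while reusing the dictionaries produced
# when the first dataset was preprocessed (all paths are hypothetical).
fairseq-preprocess \
    --source-lang src --target-lang tgt \
    --trainpref data2/train --validpref data2/valid \
    --srcdict data-bin1/dict.src.txt \
    --tgtdict data-bin1/dict.tgt.txt \
    --destdir data-bin2
```

With `--srcdict`/`--tgtdict` supplied, `fairseq-preprocess` skips building new dictionaries, so both binarized datasets share the same token-to-index mapping and the embedding matrices in the restored checkpoint keep their shape.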

Hi @huihuifan
Do you suggest modifying the preprocess code right here to load a different dictionary (in this case, the dictionary from the first dataset)?

@aastha19 what did you end up doing?

Hi @echan00

I made a common dictionary for both the datasets and then used it to train the models separately.

Thanks @aastha19 would you mind showing me how you used the same dictionary in two separate trainings?

@echan00
The bin files were created separately, and a common dictionary was also created.
This common dictionary was then placed in the bin directories of both datasets, replacing the dictionaries generated during preprocessing.
Somehow it worked for me!
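That workflow can also be sketched with standard `fairseq-preprocess` flags instead of copying dictionary files by hand: build the dictionaries from the concatenated corpora once, then binarize each dataset against them. All paths and language codes below are placeholders:

```shell
# 1. Build a shared dictionary by preprocessing the concatenated corpora
#    (this run's binarized output is only needed for its dict files).
cat data1/train.src data2/train.src > combined/train.src
cat data1/train.tgt data2/train.tgt > combined/train.tgt
fairseq-preprocess --source-lang src --target-lang tgt \
    --trainpref combined/train --destdir data-bin-common

# 2. Binarize each dataset separately with the shared dictionaries,
#    so both runs see identical vocabularies.
for d in data1 data2; do
    fairseq-preprocess --source-lang src --target-lang tgt \
        --trainpref $d/train --validpref $d/valid \
        --srcdict data-bin-common/dict.src.txt \
        --tgtdict data-bin-common/dict.tgt.txt \
        --destdir data-bin-$d
done
```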
