Hi,
I got an error when generating with a trained multilingual model. I hope you can help me understand what went wrong and how to fix it. Some context: I'm basically trying to use the multilingual architecture as a multitask model to combine different datasets for a monolingual task (each task is a "language" pair).
The command used for training a many-to-one model (i.e. shared decoder) is:
CUDA_VISIBLE_DEVICES=1,2,3 fairseq-train "${data_dir}/bin" \
--ddp-backend=no_c10d \
--task multilingual_translation --lang-pairs orig-simp,complex-simp,long-simp \
--arch multilingual_transformer \
--share-decoders --share-decoder-input-output-embed \
--encoder-embed-path "${glove}" --encoder-embed-dim 300 --encoder-ffn-embed-dim 300 \
--decoder-embed-path "${glove}" --decoder-embed-dim 300 --decoder-ffn-embed-dim 300 \
--encoder-attention-heads 5 --decoder-attention-heads 5 \
--encoder-layers 4 --decoder-layers 4 \
--optimizer adam --adam-betas '(0.9, 0.98)' \
--lr 0.0005 --lr-scheduler inverse_sqrt --min-lr '1e-09' \
--label-smoothing 0.1 --dropout 0.3 --weight-decay 0.0001 \
--criterion label_smoothed_cross_entropy --max-update 10000 \
--warmup-updates 4000 --warmup-init-lr '1e-07' \
--max-tokens 4000 --update-freq 4 \
--save-dir "${model_dir}" --tensorboard-logdir "${log_dir}"
Training proceeds without problems. Now, I want to generate the output for the 'test' subset of one of the "language" pairs (orig-simp) that the model was trained on.
fairseq-generate "${data_dir}/bin" \
--path "${model_dir}/${checkpoint_name}.pt" \
--lang-pairs orig-simp,complex-simp,long-simp \
--task multilingual_translation --source-lang orig --target-lang simp \
--batch-size 128 --beam 5 --remove-bpe=sentencepiece \
--gen-subset test > "${experiment_dir}/outputs/${output_name}.out"
After running the command I get the following error:
/experiments/falva/tools/fairseq/fairseq/models/fairseq_model.py:280: UserWarning: FairseqModel is deprecated, please use FairseqEncoderDecoderModel or BaseFairseqModel instead
for key in self.keys
Traceback (most recent call last):
File "/home/falva/anaconda3/envs/mtl4ts/bin/fairseq-generate", line 11, in <module>
load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 190, in cli_main
main(args)
File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 47, in main
task=task,
File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 186, in load_model_ensemble_and_task
model.load_state_dict(state['model'], strict=True, args=args)
TypeError: load_state_dict() got an unexpected keyword argument 'args'
Could you help me understand what's going on?
Some additional and perhaps useful information and questions:
fairseq-preprocess. That's why I decided to use fairseq-generate instead of fairseq-interactive. I don't think this could be the source of the problem, right? Or is there a particular reason why, for the multilingual model, it's recommended to use interactive rather than generate as in the example you provide in your repo?
--encoder-langtok or --decoder-langtok arguments. To my understanding, --encoder-langtok comes into play if I wanted to train a one-to-many model (--encoder-langtok tgt). But, when would --decoder-langtok be necessary in your experience?
Thank you in advance for all the help.
+1
+1.
I ran the commands shown in https://github.com/pytorch/fairseq/tree/master/examples/translation#multilingual-translation exactly as they are written, but generation fails with the "missing --lang-pairs" error. I then added --lang-pairs de-en,fr-en, and got the error: TypeError: load_state_dict() got an unexpected keyword argument 'args'
Extra info: I even tried adding an "args" argument to the load_state_dict() method in fairseq/models/multilingual_transformer.py, but then it gives the error:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/bin/fairseq-interactive", line 11, in <module>
load_entry_point('fairseq', 'console_scripts', 'fairseq-interactive')()
File "/home/ubuntu/NMT_project/fairseq_cli/interactive.py", line 190, in cli_main
main(args)
File "/home/ubuntu/NMT_project/fairseq_cli/interactive.py", line 149, in main
translations = task.inference_step(generator, models, sample)
File "/home/ubuntu/NMT_project/fairseq/tasks/multilingual_translation.py", line 309, in inference_step
if self.args.decoder_langtok else self.target_dictionary.eos(),
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(*args, **kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 113, in generate
return self._generate(model, sample, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(*args, **kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 295, in _generate
tokens[:, :step + 1], encoder_outs, temperature=self.temperature,
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(*args, **kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 553, in forward_decoder
temperature=temperature,
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 583, in _decode_one
decoder_out = list(model.forward_decoder(
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 591, in __getattr__
type(self).__name__, name))
AttributeError: 'MultilingualTransformerModel' object has no attribute 'forward_decoder'
@pipibjc ?
The problem seems to come from commit dabbef467692ef4ffb7de8a01235876bd7320a93. Adding , args=None to load_state_dict in multilingual_transformer.py of your local checkout should fix it. I'll submit a fix soon. Specifically, it should be:
def load_state_dict(self, state_dict, strict=True, args=None):
    state_dict_subset = state_dict.copy()
    for k, _ in state_dict.items():
        assert k.startswith('models.')
        lang_pair = k.split('.')[1]
        if lang_pair not in self.models:
            del state_dict_subset[k]
    super().load_state_dict(state_dict_subset, strict=strict, args=args)
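For anyone who wants to see the failure mode in isolation, here is a minimal sketch in plain Python. The classes BaseModel, BrokenMultilingualModel and FixedMultilingualModel and the toy checkpoint dict are hypothetical stand-ins, not fairseq's actual classes, and no PyTorch is involved. It only illustrates two things: an override that lacks the args parameter raises the TypeError as soon as the caller passes args=..., and the filtering loop above keeps only the keys whose language pair the model knows about.

```python
class BaseModel:
    """Hypothetical stand-in for the parent class; it accepts `args`."""

    def __init__(self, lang_pairs):
        # One entry per language pair, mirroring the `self.models` lookup.
        self.models = {pair: {} for pair in lang_pairs}
        self.loaded_keys = []

    def load_state_dict(self, state_dict, strict=True, args=None):
        # Just record which keys would have been loaded.
        self.loaded_keys = sorted(state_dict)


class BrokenMultilingualModel(BaseModel):
    """Override without `args`: fails when the caller passes args=..."""

    def load_state_dict(self, state_dict, strict=True):
        super().load_state_dict(state_dict, strict=strict)


class FixedMultilingualModel(BaseModel):
    """Override that accepts `args` and forwards it to the parent."""

    def load_state_dict(self, state_dict, strict=True, args=None):
        state_dict_subset = state_dict.copy()
        for k in state_dict:
            assert k.startswith("models.")
            lang_pair = k.split(".")[1]
            # Drop parameters that belong to pairs this model does not have.
            if lang_pair not in self.models:
                del state_dict_subset[k]
        super().load_state_dict(state_dict_subset, strict=strict, args=args)


# Toy checkpoint: keys follow the "models.<lang_pair>.<param>" pattern.
checkpoint = {
    "models.orig-simp.encoder.weight": 1,
    "models.complex-simp.encoder.weight": 2,
    "models.de-en.encoder.weight": 3,
}

try:
    BrokenMultilingualModel(["orig-simp"]).load_state_dict(
        checkpoint, strict=True, args=None
    )
    error_message = None
except TypeError as exc:
    error_message = str(exc)

fixed = FixedMultilingualModel(["orig-simp", "complex-simp"])
fixed.load_state_dict(checkpoint, strict=True, args=None)

print(error_message)
print(fixed.loaded_keys)
```

With the fixed override, the de-en entry is filtered out and the other two keys reach the parent's load_state_dict. The exact wording of the TypeError message can differ slightly between Python versions.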
Should be fixed with 4c5934ac61354d9b6d164f7317905e4ac2ae1064