Fairseq: Error with mutilingual model and fairseq-generate

Created on 19 Nov 2019  路  5Comments  路  Source: pytorch/fairseq

Hi,
I got an error when generating with a trained multilingual model. I hope you can help me understand what went wrong and how to fix it. Some context: I'm basically trying to use the multilingual architecture as a multitask model to combine different datasets for a monolingual task (each task is a "language" pair).

The command used for training a many-to-one model (i.e. shared decoder) is:

CUDA_VISIBLE_DEVICES=1,2,3 fairseq-train "${data_dir}/bin" \
  --ddp-backend=no_c10d \
  --task multilingual_translation --lang-pairs orig-simp,complex-simp,long-simp \
  --arch multilingual_transformer \
  --share-decoders --share-decoder-input-output-embed \
  --encoder-embed-path "${glove}" --encoder-embed-dim 300 --encoder-ffn-embed-dim 300 \
  --decoder-embed-path "${glove}" --decoder-embed-dim 300 --decoder-ffn-embed-dim 300 \
  --encoder-attention-heads 5 --decoder-attention-heads 5 \
  --encoder-layers 4 --decoder-layers 4 \
  --optimizer adam --adam-betas '(0.9, 0.98)' \
  --lr 0.0005 --lr-scheduler inverse_sqrt --min-lr '1e-09' \
  --label-smoothing 0.1 --dropout 0.3 --weight-decay 0.0001 \
  --criterion label_smoothed_cross_entropy --max-update 10000 \
  --warmup-updates 4000 --warmup-init-lr '1e-07' \
  --max-tokens 4000 --update-freq 4 \
  --save-dir "${model_dir}" --tensorboard-logdir "${log_dir}" \

Training proceeds without problems. Now, I want to generate the output for the 'test' subset of one of the "language" pairs (orig-simp) that the model was trained on.

fairseq-generate "${data_dir}/bin" \
  --path "${model_dir}/${checkpoint_name}.pt" \
  --lang-pairs orig-simp,complex-simp,long-simp \
  --task multilingual_translation --source-lang orig --target-lang simp \
  --batch-size 128 --beam 5 --remove-bpe=sentencepiece \
  --gen-subset test > "${experiment_dir}/outputs/${output_name}.out"

After running the command I get the following error:

/experiments/falva/tools/fairseq/fairseq/models/fairseq_model.py:280: UserWarning: FairseqModel is deprecated, please use FairseqEncoderDecoderModel or BaseFairseqModel instead
  for key in self.keys
Traceback (most recent call last):
  File "/home/falva/anaconda3/envs/mtl4ts/bin/fairseq-generate", line 11, in <module>
    load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
  File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 190, in cli_main
    main(args)
  File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 47, in main
    task=task,
  File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
  File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 186, in load_model_ensemble_and_task
    model.load_state_dict(state['model'], strict=True, args=args)
TypeError: load_state_dict() got an unexpected keyword argument 'args'

Could you help me understand what's going on?

Some additional and perhaps useful information and questions:

  • For all 'language' pairs, all dataset splits (train/valid/test) were binarized before training using fairseq-preprocess. That's why I decided to use fairseq-generate instead of fairseq-interactive. I don't think this could be the source of the problem, right? Or is there a particular reason why, for the multilingual model, it's recommended to use interactive rather than generate as in the example you provide in your repo?
  • Since in this case I'm using a many-to-one model (just as in the example you provide), there is no need to use the --encoder-langtok or --decoder-langtok arguments. To my understanding, --encoder-langtok comes into play if I wanted to train a one-to-many model (--encoder-langtok tgt). But, when would --decoder-langtok be necessary in your experience?

Thank you in advance for all the help.

bug

Most helpful comment

+1.
I ran the commands shown in https://github.com/pytorch/fairseq/tree/master/examples/translation#multilingual-translation exactly as they are written, but during generation it gives the "missing --lang-pairs" error. I then add --lang-pairs de-en,fr-en, and it gives the error: TypeError: load_state_dict() got an unexpected keyword argument 'args'

Extra info: I even tried adding an "args" argument to the load_state_dict() method infairseq/models/multilingual_transformer.py, but then it gives the error:

 Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/bin/fairseq-interactive", line 11, in
load_entry_point('fairseq', 'console_scripts', 'fairseq-interactive')()
File "/home/ubuntu/NMT_project/fairseq_cli/interactive.py", line 190, in cli_main
main(args)
File "/home/ubuntu/NMT_project/fairseq_cli/interactive.py", line 149, in main
translations = task.inference_step(generator, models, sample)
File "/home/ubuntu/NMT_project/fairseq/tasks/multilingual_translation.py", line 309, in inference_step
if self.args.decoder_langtok else self.target_dictionary.eos(),
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(args, *kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 113, in generate
return self._generate(model, sample, *kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(
args, *kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 295, in _generate
tokens[:, :step + 1], encoder_outs, temperature=self.temperature,
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(
args, **kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 553, in forward_decoder
temperature=temperature,
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 583, in _decode_one
decoder_out = list(model.forward_decoder(
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 591, in __getattr__
type(self).__name__, name))
AttributeError: 'MultilingualTransformerModel' object has no attribute 'forward_decoder'

All 5 comments

+1

+1.
I ran the commands shown in https://github.com/pytorch/fairseq/tree/master/examples/translation#multilingual-translation exactly as they are written, but during generation it gives the "missing --lang-pairs" error. I then add --lang-pairs de-en,fr-en, and it gives the error: TypeError: load_state_dict() got an unexpected keyword argument 'args'

Extra info: I even tried adding an "args" argument to the load_state_dict() method infairseq/models/multilingual_transformer.py, but then it gives the error:

 Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/bin/fairseq-interactive", line 11, in
load_entry_point('fairseq', 'console_scripts', 'fairseq-interactive')()
File "/home/ubuntu/NMT_project/fairseq_cli/interactive.py", line 190, in cli_main
main(args)
File "/home/ubuntu/NMT_project/fairseq_cli/interactive.py", line 149, in main
translations = task.inference_step(generator, models, sample)
File "/home/ubuntu/NMT_project/fairseq/tasks/multilingual_translation.py", line 309, in inference_step
if self.args.decoder_langtok else self.target_dictionary.eos(),
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(args, *kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 113, in generate
return self._generate(model, sample, *kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(
args, *kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 295, in _generate
tokens[:, :step + 1], encoder_outs, temperature=self.temperature,
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(
args, **kwargs)
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 553, in forward_decoder
temperature=temperature,
File "/home/ubuntu/NMT_project/fairseq/sequence_generator.py", line 583, in _decode_one
decoder_out = list(model.forward_decoder(
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 591, in __getattr__
type(self).__name__, name))
AttributeError: 'MultilingualTransformerModel' object has no attribute 'forward_decoder'

@pipibjc ?

The problem seems to be dabbef467692ef4ffb7de8a01235876bd7320a93. If you can add , args=None to load_state_dict in multilingual_transformer.py of your local checkout it should fix it. I'll submit a fix soon. Specifically, it should be:

    def load_state_dict(self, state_dict, strict=True, args=None):
        state_dict_subset = state_dict.copy()
        for k, _ in state_dict.items():
            assert k.startswith('models.')
            lang_pair = k.split('.')[1]
            if lang_pair not in self.models:
                del state_dict_subset[k]
        super().load_state_dict(state_dict_subset, strict=strict, args=args)

Should be fixed with 4c5934ac61354d9b6d164f7317905e4ac2ae1064

Was this page helpful?
0 / 5 - 0 ratings