Fairseq: Getting an error while generating translations using the fairseq-generate CLI command.

Created on 11 Aug 2020 · 3 Comments · Source: pytorch/fairseq

🐛 Bug

I followed the MASS repository below to build an NMT model; it is built on fairseq==0.7.1.
https://github.com/gvskalyan/MASS/tree/master/MASS-fairseq

I trained the MASS pre-trained model and fine-tuned it on a DE --> EN translation task. Generating translations with the command below throws the following error.
Fairseq-Generate Command

MODEL='model_ckpt/fine_tune/checkpoint_best.pt'

!fairseq-generate '/content/drive/My Drive/MASS/data/processed' \
    -s de -t en \
    --user-dir mass \
    --langs de,en \
    --source-langs de --target-langs en \
    --mt_steps de-en \
    --gen-subset valid \
    --task xmasked_seq2seq \
    --path $MODEL \
    --beam 5 \
    --sacrebleu \
    --remove-bpe

Error received

Namespace(beam=5, cpu=False, criterion='cross_entropy', data='/content/drive/My Drive/MASS/data/processed', dataset_impl='cached', diverse_beam_groups=-1, diverse_beam_strength=0.5, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='valid', langs='de,en', lazy_load=False, left_pad_source='True', left_pad_target='False', lenpen=1, lm_bias=False, log_format=None, log_interval=1000, lr_scheduler='fixed', lr_shrink=0.1, mass_steps='', match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=None, max_source_positions=1024, max_target_positions=1024, max_tokens=12000, memory_efficient_fp16=False, memt_steps='', min_len=1, min_loss_scale=0.0001, model_overrides='{}', momentum=0.99, mt_steps='de-en', nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=False, no_repeat_ngram_size=0, num_shards=1, num_workers=0, optimizer='nag', path='model_ckpt/fine_tune/checkpoint_best.pt', prefix_size=0, print_alignment=False, quiet=False, raw_text=False, reload_checkpoint=None, remove_bpe='@@ ', replace_unk=None, required_batch_size_multiple=8, results_path=None, sacrebleu=True, sampling=False, sampling_topk=-1, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang='de', source_langs='de', target_lang='en', target_langs='en', task='xmasked_seq2seq', tbmf_wrapper=False, temperature=1.0, tensorboard_logdir='', threshold_loss_scale=None, unkpen=0, unnormalized=False, user_dir='mass', valid_lang_pairs='', warmup_updates=0, weight_decay=0.0, word_mask=0.25, word_mask_keep_rand='0.1,0.1,0.8')
| [de] dictionary: 32813 types
| [en] dictionary: 32698 types
| bilingual valid de-en.de: 22139 examples
| bilingual valid de-en.en: 22139 examples
| loading model(s) from model_ckpt/fine_tune/checkpoint_best.pt
tcmalloc: large alloc 2082308096 bytes == 0x15beca000 @  0x7f2594e6b1e7 0x7f25828ea5b6 0x7f25911cd470 0x7f2590eaab20 0x566ddc 0x50a783 0x50c1f4 0x507f24 0x509c50 0x50a64d 0x50c1f4 0x507f24 0x509202 0x5a4d81 0x5a50d8 0x4e01be 0x50a7b1 0x50c1f4 0x507f24 0x588fac 0x59fe1e 0x50d596 0x507f24 0x509c50 0x50a64d 0x50cfd6 0x507f24 0x509c50 0x50a64d 0x50c1f4 0x509918
tcmalloc: large alloc 2082308096 bytes == 0x1d88a2000 @  0x7f2594e6b1e7 0x7f25828ea5b6 0x7f25911cd470 0x7f2590eaab20 0x566ddc 0x50a783 0x50c1f4 0x507f24 0x509c50 0x50a64d 0x50c1f4 0x507f24 0x509202 0x5a4d81 0x5a50d8 0x4e01be 0x50a7b1 0x50c1f4 0x507f24 0x588fac 0x59fe1e 0x50d596 0x507f24 0x509c50 0x50a64d 0x50cfd6 0x507f24 0x509c50 0x50a64d 0x50c1f4 0x509918
  0% 0/40 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/generate.py", line 188, in cli_main
    main(args)
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/generate.py", line 106, in main
    hypos = task.inference_step(generator, models, sample, prefix_tokens)
  File "/content/drive/My Drive/MASS/mass/xmasked_seq2seq.py", line 434, in inference_step
    prefix_tokens=prefix_tokens,
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/sequence_generator.py", line 397, in generate
    scores.view(bsz, beam_size, -1)[:, :, :step],
  File "/usr/local/lib/python3.6/dist-packages/fairseq/search.py", line 83, in step
    torch.div(self.indices_buf, vocab_size, out=self.beams_buf)
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

Kindly help me understand the issue, as I am not sure why the code fails at line 434 of the Python script below.
xmasked_seq2seq.py.txt

Also, what value does the prefix_tokens variable accept at line 434?

Environment

  • fairseq Version (e.g., 1.0 or master): 0.7.0
  • PyTorch Version (e.g., 1.0) 1.6.0+cu101
  • OS (e.g., Linux):
  • How you installed fairseq (pip, source): pip
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

bug needs triage

Most helpful comment

Just update the code in search.py from 'torch.div(self.indices_buf, vocab_size, out=self.beams_buf)' to 'torch.floor_divide(self.indices_buf, vocab_size, out=self.beams_buf)'.

It will work then.
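For illustration, here is a minimal, self-contained sketch of what the patched line computes. In fairseq's beam search, a top-k is taken over the flattened (beam x vocab) score matrix, and floor division by the vocabulary size recovers which beam each flat index came from; the tensor values below are made up for the example:

```python
import torch

# Hypothetical flat indices from a top-k over (beam * vocab) scores.
indices = torch.tensor([3, 7, 12, 18])
vocab_size = 5

# On PyTorch >= 1.5, torch.div on integer tensors no longer performs
# integer division; floor division recovers the beam index instead.
beams = torch.floor_divide(indices, vocab_size)
print(beams.tolist())  # [0, 1, 2, 3]
```

On newer PyTorch releases, torch.div(indices, vocab_size, rounding_mode='floor') is the preferred spelling, but torch.floor_divide is what the patched fairseq line uses.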

All 3 comments

The "tcmalloc: large alloc " trace seems to be suggesting that there was an issue allocating memory and it died as a result. The python stacktrace is likely bogus since the error is resulting deep within native code.

I am getting the same error when using fairseq-interactive:

fairseq-interactive \
    --path checkpoints/transformer/checkpoint_best.pt checkpoints/transformer/ \
    --source-lang in --target-lang out --skip-invalid-size-inputs-valid-test

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-interactive", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 190, in cli_main
    main(args)
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 149, in main
    translations = task.inference_step(generator, models, sample)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/tasks/fairseq_task.py", line 265, in inference_step
    return generator.generate(models, sample, prefix_tokens=prefix_tokens)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/sequence_generator.py", line 113, in generate
    return self._generate(model, sample, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/sequence_generator.py", line 379, in _generate
    scores.view(bsz, beam_size, -1)[:, :, :step],
  File "/usr/local/lib/python3.6/dist-packages/fairseq/search.py", line 81, in step
    torch.div(self.indices_buf, vocab_size, out=self.beams_buf)
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

Just update the code in search.py from 'torch.div(self.indices_buf, vocab_size, out=self.beams_buf)' to 'torch.floor_divide(self.indices_buf, vocab_size, out=self.beams_buf)'.

It will work then.
