Transformers: Long BERT TypeError: forward() takes from 2 to 4 positional arguments but 7 were given

Created on 14 Jul 2020 · 4 comments · Source: huggingface/transformers

I'm having an issue with the pre-training of a BERT-like model. I used the following function twice: first with bert-base-multilingual-cased, and then with a similar version that is more efficient for long documents, which uses the LongformerSelfAttention class to turn the standard BERT into a Long BERT.

import math
import logging

from transformers import TextDataset, DataCollatorForLanguageModeling, Trainer

logger = logging.getLogger(__name__)


def pretrain_and_evaluate(args, model, tokenizer, eval_only, model_path):
    val_dataset = TextDataset(tokenizer=tokenizer,
                              file_path=args.val_datapath,
                              block_size=tokenizer.max_len)
    if eval_only:
        train_dataset = val_dataset
    else:
        logger.info(f'Loading and tokenizing training data is usually slow: {args.train_datapath}')
        train_dataset = TextDataset(tokenizer=tokenizer,
                                    file_path=args.train_datapath,
                                    block_size=tokenizer.max_len)

    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
    trainer = Trainer(model=model, args=args, data_collator=data_collator,
                      train_dataset=train_dataset, eval_dataset=val_dataset, prediction_loss_only=True,)

    eval_loss = trainer.evaluate()
    eval_loss = eval_loss['eval_loss']
    logger.info(f'Initial eval bpc: {eval_loss/math.log(2)}')

    if not eval_only:
        trainer.train(model_path=model_path)
        trainer.save_model()

        eval_loss = trainer.evaluate()
        eval_loss = eval_loss['eval_loss']
        logger.info(f'Eval bpc after pretraining: {eval_loss/math.log(2)}')

With bert-base-multilingual-cased it works well; the model and tokenizer passed as arguments to the function are, respectively:

model = BertForMaskedLM.from_pretrained('bert-base-multilingual-cased')
tokenizer = BertTokenizerFast.from_pretrained('bert-base-multilingual-cased')
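
For context, the conversion to the Long version roughly follows the pattern of the Longformer conversion notebook. The simplified sketch below is not my exact script: the attribute names assume the standard BertModel layout, convert_bert_to_long is a placeholder name, and max_pos / attention_window are illustrative defaults.

from transformers.modeling_longformer import LongformerSelfAttention

def convert_bert_to_long(model, tokenizer, max_pos=4096, attention_window=512):
    config = model.config

    # extend the learned position embeddings by tiling the original ones
    # (assumes max_pos is a multiple of the original maximum length)
    current_max_pos, embed_size = model.bert.embeddings.position_embeddings.weight.shape
    new_pos_embed = model.bert.embeddings.position_embeddings.weight.new_empty(max_pos, embed_size)
    k = 0
    while k < max_pos:
        new_pos_embed[k:k + current_max_pos] = model.bert.embeddings.position_embeddings.weight
        k += current_max_pos
    model.bert.embeddings.position_embeddings.weight.data = new_pos_embed
    config.max_position_embeddings = max_pos
    tokenizer.model_max_length = max_pos  # tokenizer.max_len in older releases

    # replace each layer's self-attention with LongformerSelfAttention,
    # reusing the pretrained query/key/value projections
    config.attention_window = [attention_window] * config.num_hidden_layers
    for i, layer in enumerate(model.bert.encoder.layer):
        longformer_self_attn = LongformerSelfAttention(config, layer_id=i)
        longformer_self_attn.query = layer.attention.self.query
        longformer_self_attn.key = layer.attention.self.key
        longformer_self_attn.value = layer.attention.self.value
        # some transformers versions also have separate global-attention projections
        if hasattr(longformer_self_attn, 'query_global'):
            longformer_self_attn.query_global = layer.attention.self.query
            longformer_self_attn.key_global = layer.attention.self.key
            longformer_self_attn.value_global = layer.attention.self.value
        layer.attention.self = longformer_self_attn

    return model, tokenizer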

But with the modified version of BERT this error occurs:

Traceback (most recent call last):
  File "convert_bert_to_long_bert.py", line 172, in <module>
    pretrain_and_evaluate(training_args, model, tokenizer, eval_only=False, model_path=training_args.output_dir)
  File "convert_bert_to_long_bert.py", line 86, in pretrain_and_evaluate
    eval_loss = trainer.evaluate()
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/trainer.py", line 748, in evaluate
    output = self._prediction_loop(eval_dataloader, description="Evaluation")
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/trainer.py", line 829, in _prediction_loop
    outputs = model(**inputs)
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/modeling_bert.py", line 1098, in forward
    return_tuple=return_tuple,
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/modeling_bert.py", line 799, in forward
    return_tuple=return_tuple,
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/modeling_bert.py", line 460, in forward
    output_attentions,
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/modeling_bert.py", line 391, in forward
    hidden_states, attention_mask, head_mask, output_attentions=output_attentions,
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/transformers/modeling_bert.py", line 335, in forward
    hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions,
  File "/Users/user/Library/Python/3.7/lib/python/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() takes from 2 to 4 positional arguments but 7 were given

I made only a few modifications to a working script that obtains a Long version of RoBERTa from the RoBERTa base model. What could be the mistake?

wontfix

All 4 comments

Update: I have downgraded transformers to version transformers==2.11.0 and it seems to work, although for now I have only used small datasets for testing. I will update this issue in case anyone is interested.

The code in Longformer has changed quite a bit. I think a simple remedy to make your code work with the current version of Longformer is to add **kwargs to every forward function in modeling_longformer.py that you copied into your notebook. This way it can handle an arbitrary number of input arguments and the above error should not occur.
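
For example, a minimal sketch of that change, assuming the copied class subclasses LongformerSelfAttention; BertLongSelfAttention is just a placeholder name, and the keyword names below may differ between transformers versions:

from transformers.modeling_longformer import LongformerSelfAttention

class BertLongSelfAttention(LongformerSelfAttention):
    def forward(self, hidden_states, attention_mask=None, head_mask=None,
                encoder_hidden_states=None, encoder_attention_mask=None,
                output_attentions=False, **kwargs):
        # BertAttention passes head_mask, encoder_hidden_states and
        # encoder_attention_mask positionally; accept and ignore whatever
        # LongformerSelfAttention.forward does not expect.
        return super().forward(hidden_states,
                               attention_mask=attention_mask,
                               output_attentions=output_attentions)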

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

EDIT: To begin pre-training, make sure you LOAD the saved model exactly the way the notebook does BEFORE pre-training! Don't try and use the model straightaway!
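
In other words, something along these lines; the output path and the BertLongForMaskedLM class name are placeholders for whatever your notebook actually defines:

model_path = 'bert-base-multilingual-cased-4096'  # placeholder output directory
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)

# reload from disk before calling pretrain_and_evaluate, instead of reusing
# the in-memory converted model directly
tokenizer = BertTokenizerFast.from_pretrained(model_path)
model = BertLongForMaskedLM.from_pretrained(model_path)  # custom Long BERT class from the notebook
pretrain_and_evaluate(training_args, model, tokenizer, eval_only=False, model_path=training_args.output_dir)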
