I am using RobertaForSequenceClassification, and when I try to load the 'roberta-base' model with this code on Google Colab:
```
from transformers import RobertaForSequenceClassification, RobertaConfig

config = RobertaConfig()
model = RobertaForSequenceClassification.from_pretrained("roberta-base", config=config)
model
```
I get the following error:
```
RuntimeError: Error(s) in loading state_dict for RobertaForSequenceClassification:
size mismatch for roberta.embeddings.word_embeddings.weight: copying a param with shape torch.Size([50265, 768]) from checkpoint, the shape in current model is torch.Size([30522, 768]).
size mismatch for roberta.embeddings.position_embeddings.weight: copying a param with shape torch.Size([514, 768]) from checkpoint, the shape in current model is torch.Size([512, 768]).
size mismatch for roberta.embeddings.token_type_embeddings.weight: copying a param with shape torch.Size([1, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
```
Maybe related to #1340
Hi! You're initializing RoBERTa with a blank configuration, which falls back to very BERT-like defaults. BERT has different attributes than RoBERTa (a different vocabulary size, positional embedding size, etc.), so this indeed results in an error.
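As a quick sanity check (not part of the original reply, and the exact defaults depend on your transformers version), you can compare the blank config with the one that ships with the checkpoint:
```
from transformers import RobertaConfig

default_config = RobertaConfig()  # blank config: library defaults
pretrained_config = RobertaConfig.from_pretrained("roberta-base")  # config matching the checkpoint

# These are the attributes behind the size mismatches reported above.
for attr in ("vocab_size", "max_position_embeddings", "type_vocab_size"):
    print(attr, getattr(default_config, attr), "vs", getattr(pretrained_config, attr))
# roberta-base expects vocab_size=50265, max_position_embeddings=514 and
# type_vocab_size=1, so weights of those shapes cannot be copied into a
# model built from the blank (BERT-sized) defaults.
```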
To instantiate RoBERTa you can simply do:
```
model = RobertaForSequenceClassification.from_pretrained("roberta-base")
```
If you want a configuration object so that you can change attributes, such as outputting the hidden states, you can do it like this:
```
config = RobertaConfig.from_pretrained("roberta-base", output_hidden_states=True)
model = RobertaForSequenceClassification.from_pretrained("roberta-base", config=config)
```
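For completeness, here is a small usage sketch (my addition, assuming a recent transformers version where models return ModelOutput objects) showing that the hidden states are indeed returned:
```
from transformers import RobertaConfig, RobertaForSequenceClassification, RobertaTokenizer

config = RobertaConfig.from_pretrained("roberta-base", output_hidden_states=True)
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", config=config)

inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)             # (batch_size, num_labels)
print(len(outputs.hidden_states))       # embedding layer + 12 transformer layers = 13
print(outputs.hidden_states[-1].shape)  # (batch_size, sequence_length, hidden_size)
```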
Hi @LysandreJik ,
Thanks a lot for the clarification; this is indeed much clearer. I tried the code again and it is working.