Flair: Prediction bug for NER models trained with TransformerWordEmbeddings

Created on 6 Jul 2020 · 5 Comments · Source: flairNLP/flair

Describe the bug
I trained an NER model with TransformerWordEmbeddings using a custom RoBERTa model. Training completed successfully, but when loading the NER model for prediction,

model = SequenceTagger.load('resources/taggers/roberta/final-model.pt')
....

model.predict(sentence)

the following error occurred:

````

AttributeError Traceback (most recent call last)
in
8
9 # predict tags and print
---> 10 model.predict(sentence)
11
12 print(sentence.to_tagged_string())

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/flair/models/sequence_tagger_model.py in predict(self, sentences, mini_batch_size, all_tag_prob, verbose, label_name, return_loss, embedding_storage_mode)
364 continue
365
--> 366 feature = self.forward(batch)
367
368 if return_loss:

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/flair/models/sequence_tagger_model.py in forward(self, sentences)
602 def forward(self, sentences: List[Sentence]):
603
--> 604 self.embeddings.embed(sentences)
605
606 names = self.embeddings.get_names()

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/flair/embeddings/token.py in embed(self, sentences, static_embeddings)
69
70 for embedding in self.embeddings:
---> 71 embedding.embed(sentences)
72
73 @property

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/flair/embeddings/base.py in embed(self, sentences)
59
60 if not everything_embedded or not self.static_embeddings:
---> 61 self._add_embeddings_internal(sentences)
62
63 return sentences

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/flair/embeddings/token.py in _add_embeddings_internal(self, sentences)
895 # embed each micro-batch
896 for batch in sentence_batches:
--> 897 self._add_embeddings_to_sentences(batch)
898
899 return sentences

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/flair/embeddings/token.py in _add_embeddings_to_sentences(self, sentences)
1007
1008 # put encoded batch through transformer model to get all hidden states of all encoder layers
-> 1009 hidden_states = self.model(input_ids, attention_mask=mask)[-1]
1010 # make the tuple a tensor; makes working with it easier.
1011 hidden_states = torch.stack(hidden_states)

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
545 result = self._slow_forward(*input, **kwargs)
546 else:
--> 547 result = self.forward(*input, **kwargs)
548 for hook in self._forward_hooks.values():
549 hook_result = hook(self, input, result)

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/transformers/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, output_attentions, output_hidden_states)
760 encoder_attention_mask=encoder_extended_attention_mask,
761 output_attentions=output_attentions,
--> 762 output_hidden_states=output_hidden_states,
763 )
764 sequence_output = encoder_outputs[0]

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions, output_hidden_states)
414 all_hidden_states = all_hidden_states + (hidden_states,)
415
--> 416 if getattr(self.config, "gradient_checkpointing", False):
417
418 def create_custom_forward(module):

~/miniconda3/envs/pytorch_p37/lib/python3.7/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
589 return modules[name]
590 raise AttributeError("'{}' object has no attribute '{}'".format(
--> 591 type(self).__name__, name))
592
593 def __setattr__(self, name, value):

AttributeError: 'BertEncoder' object has no attribute 'config'
````

Expected behavior
NER tagged sentence

Environment (please complete the following information):

OS: Linux
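
The environment details above were left mostly unfilled. Since the replies in this thread point at a library version change, the installed package versions are probably the most useful thing to report. A minimal sketch for collecting them, assuming Python 3.8+ (for `importlib.metadata`); the helper name `installed_versions` is made up for illustration:

```python
import platform
from importlib import metadata


def installed_versions(packages=("flair", "transformers", "torch")):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version(), "on", platform.system())
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```

Running this in both the training and the prediction environment makes a version mismatch between the two easy to spot.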

bug

seyyaw

All 5 comments

Hi @seyyaw, did you update the transformers version in the meantime? We saw a similar problem, which can be fixed by reloading the transformer embeddings and saving the model again:

import flair
from flair.models import SequenceTagger
from transformers import AutoConfig, AutoModel

# load your tagger
tagger = SequenceTagger.load('resources/taggers/roberta/final-model.pt')

# get name of embedding
transformer_model_name = '-'.join(tagger.embeddings.name.split('-')[2:])
print(transformer_model_name)

# reload transformer embedding
config = AutoConfig.from_pretrained(transformer_model_name, output_hidden_states=True)
tagger.embeddings.model = AutoModel.from_pretrained(transformer_model_name, config=config)

# to be safe, send tagger to flair device
tagger.to(flair.device)

# save the updated model
tagger.save('resources/taggers/roberta/final-model-updated.pt')
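
The name-parsing line in the snippet above can look opaque. It appears to rely on the embedding name having the form `transformer-word-<model>`: dropping the first two dash-separated parts and re-joining the rest recovers the model identifier even when that identifier contains dashes itself. A small sketch with plain strings (the prefix is an assumption about the Flair version in use, and the function name and example names are illustrative only):

```python
def transformer_name_from_embedding_name(embedding_name):
    """Drop the assumed 'transformer-word-' prefix: remove the first two dash-separated parts."""
    return "-".join(embedding_name.split("-")[2:])


print(transformer_name_from_embedding_name("transformer-word-roberta-base"))
# roberta-base
print(transformer_name_from_embedding_name("transformer-word-/PATH/TO/MyRoBERTa"))
# /PATH/TO/MyRoBERTa
```

For a custom model trained from a local directory, the recovered name would be that local path, so the path presumably needs to exist on the machine where the fix is run.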

alanakbik on 6 Jul 2020

Thank you very much, @alanakbik. I built my model after updating transformers to the latest release. Now it works!!
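
A plausible explanation for why retraining under the updated library helps, sketched under the assumption that the saved tagger file contains a pickled copy of the transformer module: unpickling restores an object's stored attributes without re-running `__init__`, so an attribute that a newer library version sets in `__init__` (here `config` on `BertEncoder`) is missing on objects saved by an older version. The `Encoder` class below is a toy stand-in, not the transformers class:

```python
import pickle


class Encoder:
    """Stand-in for the 'old' library version: __init__ does not store the config."""

    def __init__(self, config):
        self.num_layers = config["num_layers"]


# Save an object created with the old class definition.
blob = pickle.dumps(Encoder({"num_layers": 12}))


class Encoder:  # noqa: F811 - redefinition simulates upgrading the library
    """Stand-in for the 'new' version: __init__ keeps the config and forward() reads it."""

    def __init__(self, config):
        self.config = config
        self.num_layers = config["num_layers"]

    def forward(self):
        return self.config.get("gradient_checkpointing", False)


# Unpickling looks the class up by name and restores the saved attributes; __init__ is not called.
restored = pickle.loads(blob)
try:
    restored.forward()
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
# 'Encoder' object has no attribute 'config'
```

Under that assumption, a model trained and saved with the same transformers version that is later used for prediction would not hit this mismatch.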

seyyaw on 10 Jul 2020

Hi @seyyaw, can you please tell me how you used the RoBERTa word embeddings?
I have trained a language model for my language from scratch using Hugging Face, and I have a repo containing config.json, merges.txt, optimizer.pt, pytorch_model.bin, scheduler.pt, training_args.bin, and vocab.json,
which I got after training the language model.
How should I use this as word embeddings to train an NER model and then run prediction?

himanshudce on 4 Aug 2020

Hi @himanshudce. I followed one of the code examples from Flair. Below is how I used the RoBERTa model that I trained from scratch:

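from flair.embeddings import TransformerWordEmbeddings

# point `model` at the directory that holds the custom RoBERTa checkpoint
embedding_types = [
    TransformerWordEmbeddings(model='/PATH/TO/MyRoBERTa', layers='-1,-2,-3,-4', fine_tune=True, batch_size=8, use_scalar_mix=False)
]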
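
Here `model` points at a local directory rather than a hub model name. As an optional sanity check before training, the sketch below verifies that such a directory holds the files a PyTorch RoBERTa checkpoint is normally loaded from: `config.json` and `pytorch_model.bin` for the model, `vocab.json` and `merges.txt` for the byte-level BPE tokenizer. The helper name `missing_model_files` is hypothetical. Files such as `optimizer.pt`, `scheduler.pt`, and `training_args.bin` are training state and should not be needed for loading.

```python
from pathlib import Path

# Files expected in a PyTorch RoBERTa checkpoint directory (assumed layout).
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "vocab.json", "merges.txt")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("/PATH/TO/MyRoBERTa")
    print("missing files:", missing or "none")
```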

seyyaw on 4 Aug 2020

Hey, thanks. I actually used something similar, and I also tried the same setup, but I am getting this warning:
"Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation."
and then this IndexError:
"
Traceback (most recent call last):
File "FLAIR/flair_POS_trainer.py", line 64, in <module>
trainer.train('FLAIR/resources/taggers/flairposbert',
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/trainers/trainer.py", line 349, in train
loss = self.model.forward_loss(batch_step)
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/models/sequence_tagger_model.py", line 599, in forward_loss
features = self.forward(data_points)
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/models/sequence_tagger_model.py", line 604, in forward
self.embeddings.embed(sentences)
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/embeddings/token.py", line 71, in embed
embedding.embed(sentences)
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/embeddings/base.py", line 61, in embed
self._add_embeddings_internal(sentences)
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/embeddings/token.py", line 897, in _add_embeddings_internal
self._add_embeddings_to_sentences(batch)
File "/home/himanshu/.local/lib/python3.8/site-packages/flair/embeddings/token.py", line 1009, in _add_embeddings_to_sentences
hidden_states = self.model(input_ids, attention_mask=mask)[-1]
File "/home/himanshu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/himanshu/.local/lib/python3.8/site-packages/transformers/modeling_bert.py", line 752, in forward
embedding_output = self.embeddings(
File "/home/himanshu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/himanshu/.local/lib/python3.8/site-packages/transformers/modeling_roberta.py", line 67, in forward
return super().forward(
File "/home/himanshu/.local/lib/python3.8/site-packages/transformers/modeling_bert.py", line 179, in forward
position_embeddings = self.position_embeddings(position_ids)
File "/home/himanshu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/himanshu/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 112, in forward
return F.embedding(
File "/home/himanshu/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 1724, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
"
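
The last frames show the lookup in `position_embeddings` failing, which means some position id is at least as large as the position embedding table. One possible cause, not confirmed in this thread, is a sentence whose subtoken count exceeds what the model's `max_position_embeddings` allows; the truncation warning suggests the tokenizer had no length limit to enforce. RoBERTa numbers positions starting at `padding_idx + 1`, so a table of size `max_position_embeddings` fits only `max_position_embeddings - padding_idx - 1` tokens. The sketch below re-implements that numbering in plain Python to illustrate the arithmetic; the function names and the example values (`padding_idx=1`, table sizes 512 and 514) are assumptions for illustration:

```python
def roberta_position_ids(input_ids, padding_idx=1):
    """Mimic RoBERTa's position numbering: non-pad tokens count up from padding_idx + 1."""
    position_ids = []
    seen = 0
    for token_id in input_ids:
        if token_id == padding_idx:
            position_ids.append(padding_idx)  # padding tokens keep the padding position
        else:
            seen += 1
            position_ids.append(seen + padding_idx)
    return position_ids


def fits_position_table(input_ids, max_position_embeddings, padding_idx=1):
    """True if every position id is a valid row index of the position embedding table."""
    largest = max(roberta_position_ids(input_ids, padding_idx), default=padding_idx)
    return largest < max_position_embeddings


# 512 non-padding subtokens (the token ids themselves are arbitrary here).
tokens = [5] * 512
print(max(roberta_position_ids(tokens)))                         # 513
print(fits_position_table(tokens, max_position_embeddings=514))  # True
print(fits_position_table(tokens, max_position_embeddings=512))  # False
```

If this is indeed the cause, the places to look would be the `max_position_embeddings` value in the model's `config.json` and the length of the longest sentences in the corpus.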