Model I am using (Bert, XLNet....): Bert
Language I am using the model on (English, Chinese....): English
The problem arises when using: the official example script run_lm_finetuning.py
The task I am working on is: language model fine-tuning
Steps to reproduce the behavior:
Run run_lm_finetuning.py with tokens added to the vocabulary:
```python
new_vocab_list = ['token_1', 'token_2', 'token_3']
tokenizer.add_tokens(new_vocab_list)
logger.info("vocabulary size after adding: " + str(len(tokenizer)))
model.resize_token_embeddings(len(tokenizer))
logger.info("size of model.cls.predictions.bias: " + str(len(model.cls.predictions.bias)))
```
I have found the problem: in the BERT model, the class "BertLMPredictionHead" has two separate attributes, "decoder" and "bias". When new tokens are added, the call model.resize_token_embeddings(len(tokenizer)) only resizes "decoder" (and its own bias, if it has one; that bias is distinct from "BertLMPredictionHead.bias"). The attribute "BertLMPredictionHead.bias" itself is never resized, and this is what causes the error.
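For context, BertLMPredictionHead in modeling_bert.py looks roughly like this (paraphrased from the version I am running, so details may differ):

```python
class BertLMPredictionHead(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.transform = BertPredictionHeadTransform(config)
        # The decoder is tied to the input embeddings and has no bias of its own;
        # the output-only bias is kept as a separate, standalone parameter.
        self.decoder = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        self.bias = nn.Parameter(torch.zeros(config.vocab_size))

    def forward(self, hidden_states):
        hidden_states = self.transform(hidden_states)
        # This addition fails when decoder was resized but self.bias was not.
        hidden_states = self.decoder(hidden_states) + self.bias
        return hidden_states
```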
I have added the bias-resizing code to my copy of "modeling_bert.py", and if you want, I can open a pull request against your code. However, if I have misunderstood something, please let me know.
Thank you very much for your code base.
Hi, I've pushed a fix that was just merged in master. Could you please try and install from source:
```
pip install git+https://github.com/huggingface/transformers
```
and tell me if you face the same error?
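If you want to double-check that the source build is the one being imported, something along these lines should work:

```python
import transformers

print(transformers.__version__)  # should show the dev version after the source install
print(transformers.__file__)     # should point at the freshly installed package
```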
Having followed your reply from here (https://github.com/huggingface/transformers/issues/2513#issuecomment-574406370), it now works :)
I needed to update run_lm_finetuning.py to the latest GitHub branch - thanks :)
Hi @LysandreJik. Thank you for the update, but the error has not been solved, I'm afraid. The following is the error returned:
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/transformers/modeling_bert.py", line 889, in forward
prediction_scores = self.cls(sequence_output)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/transformers/modeling_bert.py", line 461, in forward
prediction_scores = self.predictions(sequence_output)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/sdcc/u/hvu/.conda/envs/torch/lib/python3.6/site-packages/transformers/modeling_bert.py", line 451, in forward
hidden_states = self.decoder(hidden_states) + self.bias
RuntimeError: The size of tensor a (31119) must match the size of tensor b (31116) at non-singleton dimension 2
I have solved the problem myself by adding the following code to the method `_tie_or_clone_weights(self, output_embeddings, input_embeddings)` in _modeling_utils.py_:
```python
# Resize the prediction-head bias as well if the model has a "cls" head,
# zero-padding it up to the new vocabulary size
if hasattr(self, "cls"):
    self.cls.predictions.bias.data = torch.nn.functional.pad(
        self.cls.predictions.bias.data,
        (0, self.config.vocab_size - self.cls.predictions.bias.shape[0]),
        "constant",
        0,
    )
```
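With that patch in place, a quick check along these lines (reusing the names from my snippet above) should confirm the sizes now agree after resizing:

```python
# Both the tied decoder weight and the standalone bias should match the tokenizer
assert model.cls.predictions.decoder.weight.shape[0] == len(tokenizer)
assert model.cls.predictions.bias.shape[0] == len(tokenizer)
```

Zero-padding seems reasonable here, since the new tokens start with no learned output bias, analogous to the freshly initialized embedding rows.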
@HuyVu0508 Try updating this file:
https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_bert.py
It should be somewhere like "/opt/conda/lib/python3.6/site-packages/transformers/modeling_bert.py".
Looks like this is probably a duplicate of #1730
Also, there is a temporary solution posted here:
https://github.com/huggingface/transformers/issues/1730#issuecomment-550081307
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.