Overview:
I am fine-tuning the pre-trained BERT model on a custom dataset. The dataset requires adding new tokens so that the tokenizer does not WordPiece-split them; these tokens have the form <1234> and 1234>, where 1234 can be any integer converted to a string.
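For concreteness, the token-adding step looks roughly like this (a minimal sketch, not my exact code; the checkpoint name is a placeholder):

```python
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Register the custom tokens so WordPiece keeps them whole.
new_tokens = ["<1234>", "1234>"]
tokenizer.add_tokens(new_tokens)

# Grow the embedding matrix to cover the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))
```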
I was able to get through the training step, but when evaluating the perplexity I get:
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
Model I am using (Bert, XLNet ...): Bert
Language I am using the model on (English, Chinese ...): English
The problem arises when using: the official example script examples/run_language_modeling.py.
The task I am working on is: fine-tuning a masked language model on my own custom dataset, with the new tokens described above added to the tokenizer.
Steps to reproduce the behavior (the invocation is sketched below):
1. Add the new tokens to the tokenizer and fine-tune BERT with run_language_modeling.py; training completes without error.
2. Run the evaluation step to compute perplexity.
3. The RuntimeError below is raised during evaluation.
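For reference, the run was along these lines (paths and the output directory are placeholders; the flags follow the example script's documented usage):

```bash
python run_language_modeling.py \
    --model_type=bert \
    --model_name_or_path=bert-base-uncased \
    --do_train --train_data_file=$TRAIN_FILE \
    --do_eval --eval_data_file=$EVAL_FILE \
    --mlm \
    --output_dir=output
```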
Error:
```
Exception has occurred: RuntimeError
Traceback (most recent call last):
  File "/data/nisoni/transformers/transformers/examples/run_language_modeling.py", line 918, in <module>
    main()
  File "/data/nisoni/transformers/transformers/examples/run_language_modeling.py", line 910, in main
    result = evaluate(args, model, tokenizer, prefix=prefix)
  File "/data/nisoni/transformers/transformers/examples/run_language_modeling.py", line 550, in evaluate
    outputs = model(inputs, masked_lm_labels=labels) if args.mlm else model(inputs, labels=labels)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/transformers/modeling_bert.py", line 987, in forward
    encoder_attention_mask=encoder_attention_mask,
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/transformers/modeling_bert.py", line 790, in forward
    encoder_attention_mask=encoder_extended_attention_mask,
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/transformers/modeling_bert.py", line 407, in forward
    hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/transformers/modeling_bert.py", line 368, in forward
    self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/transformers/modeling_bert.py", line 314, in forward
    hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/transformers/modeling_bert.py", line 216, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/data/nisoni/anaconda3/envs/trans/lib/python3.6/site-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
```
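For anyone else hitting this trace: my reading (an inference on my part, not something from the docs) is that the tokenizer assigns the added tokens IDs at or beyond the model's original vocabulary size, so the embedding lookup indexes past the end of the weight matrix. On GPU this surfaces downstream as an opaque cuBLAS error, while on CPU the same lookup fails with a clear IndexError, which is why debugging on CPU (below) was useful. A tiny illustration:

```python
import torch

# BERT's input embedding has rows only for the original vocab.
orig_vocab_size = 30522                  # bert-base-uncased
emb = torch.nn.Embedding(orig_vocab_size, 768)

# An added token receives an ID >= orig_vocab_size from the tokenizer.
ids = torch.tensor([[orig_vocab_size]])  # first out-of-range ID
emb(ids)  # IndexError on CPU; on GPU the failure shows up later as an
          # opaque device-side/cuBLAS error like the one above
```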
Expected behavior: a regular example run that reports a perplexity score, just as it does without the added tokens.
Environment info:
transformers version: 2.5.1

Tried debugging on CPU (an aside: this has an issue in itself when the --no_cuda flag is used; run_language_modeling.py needs to set args.n_gpu to 0).
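Something like this one-liner in the script's device-setup code would cover that aside (a sketch; I'm paraphrasing the surrounding code from memory):

```python
# Respect --no_cuda instead of unconditionally counting GPUs.
args.n_gpu = 0 if args.no_cuda else torch.cuda.device_count()
```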
Found the fix: model.resize_token_embeddings(len(tokenizer)) needs to be called after adding the tokens in the eval path as well, not only before training.
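Concretely, the change amounts to something like this in the evaluation setup (a sketch with illustrative names, not the exact diff):

```python
# Rebuild tokenizer and model for eval, then resize the embeddings
# so the added tokens' IDs have rows in the embedding matrix.
tokenizer = BertTokenizer.from_pretrained(args.model_name_or_path)
tokenizer.add_tokens(new_tokens)  # same tokens as during training

model = BertForMaskedLM.from_pretrained(args.model_name_or_path)
model.resize_token_embeddings(len(tokenizer))  # the missing call
model.to(args.device)
```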