Hi,
I often get this error:
File "/miniconda3/envs/brightwater/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 268, in forward
position_embeddings = self.position_embeddings(position_ids)
File "/miniconda3/envs/brightwater/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/miniconda3/envs/brightwater/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 117, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/miniconda3/envs/brightwater/lib/python3.6/site-packages/torch/nn/functional.py", line 1506, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range at ../aten/src/TH/generic/THTensorEvenMoreMath.cpp:193
It only happens for long texts; if I split a failing long text into smaller chunks, those chunks don't fail.
Is there a limitation on the length of the input text?
Yes, BERT's maximum input length is 512 tokens.
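For anyone hitting the same error, here is a minimal sketch of truncating the input before it reaches the model, assuming the pytorch_pretrained_bert BertTokenizer/BertModel API (the model name and the text are placeholders):

import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

text = "some very long document ..."  # placeholder
tokens = tokenizer.tokenize(text)

# Keep room for the [CLS] and [SEP] special tokens (512 positions total);
# anything longer indexes past the position embedding table and raises
# the "index out of range" RuntimeError above.
max_len = 512
tokens = ['[CLS]'] + tokens[:max_len - 2] + ['[SEP]']

input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    encoded_layers, pooled_output = model(input_ids)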
Thank you :)
Is there a way to bypass this limit? To increase the number of words?
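The limit itself is baked into the pretrained position embeddings, so it can't simply be raised. A common workaround (not part of the library itself, just a sketch) is to split the token sequence into overlapping windows of at most 512 positions, encode each window separately, and combine the results downstream, e.g. by pooling:

def chunk_tokens(tokens, window=510, stride=255):
    """Yield overlapping slices of at most `window` word-piece tokens,
    leaving room for [CLS] and [SEP] to be added to each slice."""
    for start in range(0, max(len(tokens) - window, 0) + 1, stride):
        yield tokens[start:start + window]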