Transformers: Segmentation fault when loading pretrained file

Created on 15 Jun 2020 · 7 Comments · Source: huggingface/transformers

When loading a model weights file, a segmentation fault occurs.

All 7 comments

I am having the same problem. My code was working about a week ago. Then I installed the library (pip install transformers) on a new machine, and now it crashes when I try to load any pre-trained model (e.g., BertModel.from_pretrained('bert-base-uncased')).
I tried downgrading to 2.9.1 and 2.10, and the problem persisted.

PyTorch version: 1.4.0
GPU type: 'Tesla V100-SXM2-16GB'

This is the full debug log output:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): s3.amazonaws.com:443
DEBUG:urllib3.connectionpool:https://s3.amazonaws.com:443 "HEAD /models.huggingface.co/bert/bert-base-uncased-config.json HTTP/1.1" 200 0
DEBUG:filelock:Attempting to acquire lock 140382636847512 on /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517.lock
INFO:filelock:Lock 140382636847512 acquired on /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517.lock
INFO:transformers.file_utils:https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json not found in cache or force_download set to True, downloading to /home/ec2-user/.cache/torch/transformers/tmpppid3hrz
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): s3.amazonaws.com:443
DEBUG:urllib3.connectionpool:https://s3.amazonaws.com:443 "GET /models.huggingface.co/bert/bert-base-uncased-config.json HTTP/1.1" 200 433
HBox(children=(FloatProgress(value=0.0, description='Downloading', max=433.0, style=ProgressStyle(description_…
INFO:transformers.file_utils:storing https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json in cache at /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
INFO:transformers.file_utils:creating metadata file for /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
DEBUG:filelock:Attempting to release lock 140382636847512 on /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517.lock
INFO:filelock:Lock 140382636847512 released on /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517.lock
INFO:transformers.configuration_utils:loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/ec2-user/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
INFO:transformers.configuration_utils:Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): cdn.huggingface.co:443
DEBUG:urllib3.connectionpool:https://cdn.huggingface.co:443 "HEAD /bert-base-uncased-pytorch_model.bin HTTP/1.1" 200 0
DEBUG:filelock:Attempting to acquire lock 140384545811816 on /home/ec2-user/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157.lock
INFO:filelock:Lock 140384545811816 acquired on /home/ec2-user/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157.lock
INFO:transformers.file_utils:https://cdn.huggingface.co/bert-base-uncased-pytorch_model.bin not found in cache or force_download set to True, downloading to /home/ec2-user/.cache/torch/transformers/tmp9qzw5qor
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): cdn.huggingface.co:443

DEBUG:urllib3.connectionpool:https://cdn.huggingface.co:443 "GET /bert-base-uncased-pytorch_model.bin HTTP/1.1" 200 440473133
HBox(children=(FloatProgress(value=0.0, description='Downloading', max=440473133.0, style=ProgressStyle(descri…
INFO:transformers.file_utils:storing https://cdn.huggingface.co/bert-base-uncased-pytorch_model.bin in cache at /home/ec2-user/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
INFO:transformers.file_utils:creating metadata file for /home/ec2-user/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
DEBUG:filelock:Attempting to release lock 140384545811816 on /home/ec2-user/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157.lock
INFO:filelock:Lock 140384545811816 released on /home/ec2-user/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157.lock
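
The log above stops right after the weights file is cached, with no Python traceback, which is typical when the crash happens inside native code. A minimal sketch of one way to get more information, assuming the standard-library faulthandler module is acceptable for debugging: run the failing call in a child interpreter started with -X faulthandler, so the parent survives the crash and can show the child's stderr. The helper name run_with_faulthandler and the LOAD_SNIPPET constant are hypothetical, chosen for illustration only.

```python
import subprocess
import sys


def run_with_faulthandler(code: str) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter with faulthandler enabled.

    If the child crashes in native code, faulthandler writes the Python
    stack of the crashing thread to stderr before the process dies.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


# The call reported as crashing in this thread (hypothetical constant name).
LOAD_SNIPPET = (
    "from transformers import BertModel\n"
    "BertModel.from_pretrained('bert-base-uncased')\n"
)

# Example usage (left commented out because it needs transformers installed):
# result = run_with_faulthandler(LOAD_SNIPPET)
# print("return code:", result.returncode)
# print(result.stderr)
```

On POSIX systems, a negative return code from subprocess means the child was killed by that signal number, so -11 on Linux would correspond to SIGSEGV. The last frames in the dumped stack usually point at the import or call that entered the crashing extension module.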

Thanks for your reply. How did you make your code work before?

So, I just realized that today I was using a different version of PyTorch (I am working with Amazon SageMaker and had to spin up a new instance).
I upgraded PyTorch to 1.5 and cudatoolkit to 10.2 (conda install pytorch torchvision cudatoolkit=10.2 -c pytorch) and it started working again. Hope that helps.

@akornilo There was something wrong with my PyTorch installation; switching to PyTorch 1.5 with CUDA 9.2 made it work. Thanks for your advice.

Glad you could resolve your issue! Feel free to reopen if you see the same issue down the road.

Downgrade sentencepiece to 0.1.91. This worked for me after being stuck on the same problem as you.
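
Both fixes reported in this thread come down to which versions of the packages are installed, so it may help to print them before and after changing anything. The following is a rough sketch using only the standard library (importlib.metadata, Python 3.8+); the function name installed_versions and the default package list are assumptions made for illustration.

```python
from importlib import metadata


def installed_versions(packages=("transformers", "torch", "sentencepiece")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
    # The downgrade suggested above would be, in pip syntax:
    #   pip install sentencepiece==0.1.91
```

Reading the metadata does not import the packages themselves, so the check should be safe to run even in an environment where importing them crashes.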
