We are using Azure ML pipelines to train our Transformers models. It had been working for a few weeks, but recently (I first noticed it a few days ago) we started getting a `Segmentation fault` when initializing a model.

I tried just loading the models locally this morning and hit the same issue. See the snippet below:
```python
config = config_class.from_pretrained(model_name, num_labels=10)
tokenizer = tokenizer_class.from_pretrained(model_name, do_lower_case=False)
model = model_class.from_pretrained("distilroberta-base", from_tf=False, config=config)
```
I also tried downloading the `*_model.bin` file and passing a local path instead of the model name, and got a `Segmentation fault` there as well. Using `bert-base-uncased` instead of `distilroberta-base` made no difference.
I am running on Ubuntu with the following package versions:

```
torch==1.3.0
tokenizers==0.0.11
transformers==2.4.1
```
UPDATE:
I hacked together some of the example scripts and they ran successfully, so I think the issue is that our code uses...

```python
"roberta": (RobertaConfig, RobertaForTokenClassification, RobertaTokenizer),
"mroberta": (RobertaConfig, RobertaForMultiLabelTokenClassification, RobertaTokenizer),  # our custom multilabel class
```

...instead of what the example scripts use:

```python
AutoConfig,
AutoModelForTokenClassification,
AutoTokenizer,
```
Was there a breaking change to the model files recently that would mean our use of the non-`Auto` classes is no longer supported?
UPDATE 2:
Our original code does not cause a `Segmentation fault` on Windows. Bumping to `torch==1.5.1` fixes the issue, but it's still unclear why.
I ran into the same issue, and upgrading to `torch==1.5.1` solved it for me as well.
Possibly related to https://github.com/huggingface/transformers/issues/4857
Downgrading to `sentencepiece==0.1.91` solved it.
I am using PyTorch 1.2.0 + transformers 3.0.0.
> Downgrading to `sentencepiece==0.1.91` solved it.
> I am using PyTorch 1.2.0 + transformers 3.0.0.

Also with PyTorch 1.4.0 + transformers 3.0.2.
Closing this as solved by #5418. Feel free to re-open if you still face an issue.
For me, either `sentencepiece==0.1.91` + `torch==1.3.1` + `transformers==2.4.1` or `torch==1.5.1` + `transformers==2.4.1` worked.
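Since every fix reported in this thread amounts to pinning exact package versions, a small stdlib-only helper (a hypothetical sketch, not from the thread; requires Python 3.8+ for `importlib.metadata`) can confirm an environment actually matches a set of pins before kicking off training:

```python
from importlib import metadata


def check_pins(pins):
    """Compare installed package versions against exact pins.

    Returns a dict mapping each mismatched package name to a
    (wanted, found) tuple; found is None if the package is not
    installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Pins from the working combination reported above.
pins = {"torch": "1.5.1", "transformers": "2.4.1"}
for name, (wanted, found) in check_pins(pins).items():
    print(f"{name}: wanted {wanted}, found {found}")
```

Running this at the top of a pipeline step surfaces a version drift (e.g. an image rebuild silently pulling a newer `torch`) as an explicit message instead of a bare `Segmentation fault`.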