Transformers: RuntimeError: unexpected EOF, expected 7491165 more bytes. The file might be corrupted.

Created on 11 Oct 2019 · 7 Comments · Source: huggingface/transformers

❓ Questions & Help

I tried a small chunk of code from the Readme.md:

import torch
from transformers import BertModel, BertTokenizer

MODELS = [(BertModel, BertTokenizer, 'bert-base-uncased')]
for model_class, tokenizer_class, pretrained_weights in MODELS:
    # Load pretrained model/tokenizer
    tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
    model = model_class.from_pretrained(pretrained_weights)
    # add_special_tokens=True takes care of adding [CLS], [SEP], <s>... tokens in the right way for each model.
    input_ids = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
    with torch.no_grad():
        last_hidden_states = model(input_ids)[0]

It gives me the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-3-6528fe9b0472> in <module>
      3     tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
----> 4     model = model_class.from_pretrained(pretrained_weights)

~/.conda/envs/transformers/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    343 
    344         if state_dict is None and not from_tf:
--> 345             state_dict = torch.load(resolved_archive_file, map_location='cpu')
    346 
    347         missing_keys = []

~/.conda/envs/transformers/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    424         if sys.version_info >= (3, 0) and 'encoding' not in pickle_load_args.keys():
    425             pickle_load_args['encoding'] = 'utf-8'
--> 426         return _load(f, map_location, pickle_module, **pickle_load_args)
    427     finally:
    428         if new_fd:

~/.conda/envs/transformers/lib/python3.7/site-packages/torch/serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    618     for key in deserialized_storage_keys:
    619         assert key in deserialized_objects
--> 620         deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
    621         if offset is not None:
    622             offset = f.tell()

RuntimeError: unexpected EOF, expected 7491165 more bytes. The file might be corrupted.

Haven't modified anything in the library.

Source

amankedia

All 7 comments

Hi! It seems to me that the file that was downloaded was corrupted, probably because of a lack of space or a network error. Could you try using from_pretrained with the force_download option?

LysandreJik on 11 Oct 2019
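
Since a lack of space is one of the suspected causes, it can help to check free disk space before downloading again. The following is a minimal sketch using only the standard library; the helper name free_space_mb is hypothetical and is not part of transformers or torch.

```python
import shutil
from pathlib import Path


def free_space_mb(path="~"):
    """Return the free space, in MiB, on the filesystem that contains path.

    Hypothetical helper for illustration; not part of any library.
    """
    resolved = str(Path(path).expanduser())
    usage = shutil.disk_usage(resolved)  # named tuple: total, used, free (bytes)
    return usage.free / (1024 * 1024)
```

Calling free_space_mb() on the home directory gives a rough idea of whether the drive that usually holds the download cache has room for the model file.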

That worked. Thanks!

amankedia on 14 Oct 2019

If you are using a Windows 10 machine, deleting vgg16-something in the folder C:\Users\UserName\.cache\torch\checkpoints would solve the problem.

prasadheeramani on 15 Jan 2020
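
The manual cleanup described above can be sketched as a small script. This is only an illustration: the helper name remove_cached_files is hypothetical, and the cache directory and file prefix must be supplied by you, since the actual location and file names can differ between machines and library versions.

```python
from pathlib import Path


def remove_cached_files(cache_dir, prefix):
    """Delete regular files in cache_dir whose names start with prefix.

    Hypothetical helper for illustration. Returns the sorted names of the
    files that were deleted, so the caller can see what was removed.
    """
    directory = Path(cache_dir).expanduser()
    removed = []
    for path in directory.iterdir():
        if path.is_file() and path.name.startswith(prefix):
            path.unlink()  # remove the possibly corrupted download
            removed.append(path.name)
    return sorted(removed)
```

After the file is removed, the next load downloads a fresh copy instead of reusing the truncated one.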

Using the force_download option also works for me.

iamxpy on 24 Jan 2020

LysandreJik wrote: "Hi! It seems to me that the file that was downloaded was corrupted, probably because of a lack of space or a network error. Could you try using from_pretrained with the force_download option?"

How or where do I use this in my code?

iitbombombay on 21 Apr 2020

Well, what's your code? from_pretrained should be the method you use to load models/configurations/tokenizers.

model = model_class.from_pretrained(pretrained_weights, force_download=True)

LysandreJik on 23 Apr 2020
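
To avoid downloading again on every run, the force_download call above can be used only as a fallback. The following is a minimal sketch, assuming the corrupted cache file surfaces as a RuntimeError as in the traceback in this issue; other versions may raise a different exception type. The helper name load_with_redownload is hypothetical and is not part of transformers.

```python
def load_with_redownload(model_class, pretrained_weights, **kwargs):
    """Load from the cache first; if loading fails, download again once.

    Hypothetical helper for illustration. model_class is any class that
    exposes from_pretrained(name, force_download=...), such as BertModel.
    """
    try:
        return model_class.from_pretrained(pretrained_weights, **kwargs)
    except RuntimeError:
        # Assumption: a truncated cached file raises RuntimeError
        # ("unexpected EOF ..."), as reported in this issue.
        return model_class.from_pretrained(
            pretrained_weights, force_download=True, **kwargs
        )
```

With the code from the question, this would be called as load_with_redownload(BertModel, 'bert-base-uncased'), and the same helper works for the tokenizer class.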

I want to run the mmdetection demo image_demo.py, but I get the error below.
I am using Google Colab with PyTorch 1.3.1.
Traceback (most recent call last):
  File "demo/image_demo.py", line 26, in <module>
    main()
  File "demo/image_demo.py", line 18, in main
    model = init_detector(args.config, args.checkpoint, device=args.device)
  File "/content/mmdetection/mmdet/apis/inference.py", line 35, in init_detector
    checkpoint = load_checkpoint(model, checkpoint)
  File "/root/mmcv/mmcv/runner/checkpoint.py", line 224, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location)
  File "/root/mmcv/mmcv/runner/checkpoint.py", line 200, in _load_checkpoint
    checkpoint = torch.load(filename, map_location=map_location)
  File "/content/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/content/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 620, in _load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: storage has wrong size: expected -4934180888905747925 got 64