Transformers: manually download models

Created on 22 Jul 2019  ·  9 Comments  ·  Source: huggingface/transformers

ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json' to download pretrained model configuration file.
ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin' to download pretrained weights.
ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

how can I point to these 2 files if I manually download these two to some path?

wontfix

Most helpful comment

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')

All 9 comments

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')
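A minimal sketch of that workflow, assuming a local directory you created yourself (the helper `missing_files` and the directory name `./bert-base-uncased-local` are illustrative, not part of the library):

```python
import os

# Filenames the library expects inside a local model directory, mapped to the
# S3 URLs from the error messages above (bert-base-uncased as an example).
REQUIRED_FILES = {
    "config.json": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json",
    "pytorch_model.bin": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
}

def missing_files(model_dir):
    """Return the required files not yet present in model_dir."""
    return [name for name in sorted(REQUIRED_FILES)
            if not os.path.isfile(os.path.join(model_dir, name))]

# After downloading and renaming both files into ./bert-base-uncased-local:
# from pytorch_transformers import BertModel
# model = BertModel.from_pretrained('./bert-base-uncased-local')
```

The point is that `from_pretrained` accepts a directory path in place of a model name, as long as the files inside use the canonical names.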

What if I try to run the GPT-2 example from the docs Quickstart:

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
...
model = GPT2LMHeadModel.from_pretrained('gpt2')

and get this

INFO:pytorch_transformers.file_utils:https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-vocab.json not found in cache, downloading to C:\Users\KHOVRI~1\AppData\Local\Temp\tmprm150emm
ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

Where should I put the vocab file, and where do I get the other files for GPT-2? I work behind a corporate proxy; is there a way to specify this proxy in some sort of config?
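On the proxy question: the library downloads through `requests`, which honors the standard proxy environment variables, so one option is to set them before loading anything (the proxy address below is hypothetical; newer releases may also accept a `proxies` keyword argument to `from_pretrained`):

```python
import os

# Hypothetical corporate proxy address -- replace with your own.
PROXY = "http://proxy.mycorp.local:8080"

# `requests` (used internally for all model/vocab downloads) picks these up.
os.environ["HTTP_PROXY"] = PROXY
os.environ["HTTPS_PROXY"] = PROXY

# from pytorch_transformers import GPT2Tokenizer, GPT2LMHeadModel
# tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# model = GPT2LMHeadModel.from_pretrained('gpt2')
```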

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json' to download pretrained model configuration file.
ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin' to download pretrained weights.
ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

How can I point to these two files if I manually download them to some path?

I also ran into this problem; my network is very slow and the download wouldn't finish. However, after retrying several times (about 10), it finally ran successfully without any error.

Same question. Thank you.

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')

For posterity: those who still get errors because of a missing vocab.txt despite doing the above can get it at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt and should also rename it to vocab.txt in the desired folder. This resolved my errors.
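A sketch of fetching the vocab file, following the URL pattern visible in the error logs above (the helper `vocab_url` is illustrative, and the pattern is only known to hold for the BERT checkpoints shown in this thread):

```python
# URL pattern taken from the error messages earlier in the thread.
S3_PREFIX = "https://s3.amazonaws.com/models.huggingface.co/bert/"

def vocab_url(model_name):
    """Build the S3 URL for a checkpoint's vocab file (hypothetical helper)."""
    return S3_PREFIX + model_name + "-vocab.txt"

# Download it into the same directory as config.json / pytorch_model.bin,
# renamed to vocab.txt, then load the tokenizer from that directory:
# import urllib.request
# urllib.request.urlretrieve(vocab_url("bert-base-uncased"),
#                            "path/to/your/directory/vocab.txt")
# tokenizer = BertTokenizer.from_pretrained('path/to/your/directory')
```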

Hi swayson,

model = BertModel.from_pretrained('path/to/your/directory')

Where do we need to add the above line of code to load the model?

You can find all the models here https://stackoverflow.com/a/64280935/251674

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')

so great!
