Transformers: manually download models

Created on 22 Jul 2019  ·  9 Comments  ·  Source: huggingface/transformers

ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json' to download pretrained model configuration file.
ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin' to download pretrained weights.
ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

how can I point to these 2 files if I manually download these two to some path?

wontfix

Most helpful comment

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')

All 9 comments

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')
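A minimal sketch of that workflow, assuming a local directory you created yourself (the helper `missing_files` and the directory name `./bert-base-uncased-local` are illustrative, not part of the library):

```python
import os

# Filenames the library expects inside a local model directory, mapped to the
# S3 URLs from the error messages above (bert-base-uncased as an example).
REQUIRED_FILES = {
    "config.json": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json",
    "pytorch_model.bin": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
}

def missing_files(model_dir):
    """Return the required files not yet present in model_dir."""
    return [name for name in sorted(REQUIRED_FILES)
            if not os.path.isfile(os.path.join(model_dir, name))]

# After downloading and renaming both files into ./bert-base-uncased-local:
# from pytorch_transformers import BertModel
# model = BertModel.from_pretrained('./bert-base-uncased-local')
```

The point is that `from_pretrained` accepts a directory path in place of a model name, as long as the files inside use the canonical names.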

What if I try to run the GPT-2 example from the docs Quickstart:

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
...
model = GPT2LMHeadModel.from_pretrained('gpt2')

and get this

INFO:pytorch_transformers.file_utils:https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-vocab.json not found in cache, downloading to C:\Users\KHOVRI~1\AppData\Local\Temp\tmprm150emm
ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

Where should I put the vocab file, and where do I get the other files for GPT-2? I work behind a corporate proxy; is there a way to specify this proxy in some sort of config?
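On the proxy question: the library downloads through `requests`, which honors the standard proxy environment variables, so one option is to set them before loading anything (the proxy address below is hypothetical; newer releases may also accept a `proxies` keyword argument to `from_pretrained`):

```python
import os

# Hypothetical corporate proxy address -- replace with your own.
PROXY = "http://proxy.mycorp.local:8080"

# `requests` (used internally for all model/vocab downloads) picks these up.
os.environ["HTTP_PROXY"] = PROXY
os.environ["HTTPS_PROXY"] = PROXY

# from pytorch_transformers import GPT2Tokenizer, GPT2LMHeadModel
# tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# model = GPT2LMHeadModel.from_pretrained('gpt2')
```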

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json' to download pretrained model configuration file.
ERROR:pytorch_transformers.modeling_utils:Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin' to download pretrained weights.
ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

How can I point to these two files if I manually download them to some path?

I also ran into this problem; my network is very slow and the download wouldn't finish. However, after retrying several times (about 10), it finally ran successfully without any error.

Same question. Thank you.

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')

For posterity: those who still get errors because of a missing vocab.txt despite doing the above can get it at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt and should also rename it to vocab.txt in the desired folder. This resolved my errors.
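A sketch of fetching the vocab file, following the URL pattern visible in the error logs above (the helper `vocab_url` is illustrative, and the pattern is only known to hold for the BERT checkpoints shown in this thread):

```python
# URL pattern taken from the error messages earlier in the thread.
S3_PREFIX = "https://s3.amazonaws.com/models.huggingface.co/bert/"

def vocab_url(model_name):
    """Build the S3 URL for a checkpoint's vocab file (hypothetical helper)."""
    return S3_PREFIX + model_name + "-vocab.txt"

# Download it into the same directory as config.json / pytorch_model.bin,
# renamed to vocab.txt, then load the tokenizer from that directory:
# import urllib.request
# urllib.request.urlretrieve(vocab_url("bert-base-uncased"),
#                            "path/to/your/directory/vocab.txt")
# tokenizer = BertTokenizer.from_pretrained('path/to/your/directory')
```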

Hi swayson,

model = BertModel.from_pretrained('path/to/your/directory')

Where do we need to add the above line of code to load the model?

You can find all the models here https://stackoverflow.com/a/64280935/251674

If you don't want to or cannot use the built-in download/caching method, you can download both files manually, save them in a directory, and rename them config.json and pytorch_model.bin respectively.

Then you can load the model using model = BertModel.from_pretrained('path/to/your/directory')

so great!
