Transformers: connection issue

Created on 20 Nov 2020 · 21 Comments · Source: huggingface/transformers

Hi
I am running seq2seq_trainer on TPUs and I always get this connection issue. Could you please have a look?
Since this is on TPUs, it is hard for me to debug.
Thanks
Best
Rabeeh

    2389961.mean    (11/20/2020 05:24:09 PM)        (Detached)
local_files_only=local_files_only,

File "/anaconda3/envs/torch-xla-1.7/lib/python3.6/site-packages/transformers/file_utils.py", line 955, in cached_path
local_files_only=local_files_only,
File "/anaconda3/envs/torch-xla-1.7/lib/python3.6/site-packages/transformers/file_utils.py", line 1125, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
Traceback (most recent call last):
File "/home/rabeeh//internship/seq2seq/xla_spawn.py", line 71, in <module>
main()
XLA label: %copy.32724.remat = f32[80,12,128,128]{3,2,1,0:T(8,128)} copy(f32[80,12,128,128]{2,3,1,0:T(8,128)} %bitcast.576)
Allocation type: HLO temp
==========================

Size: 60.00M
Shape: f32[80,12,128,128]{3,2,1,0:T(8,128)}
Unpadded size: 60.00M
XLA label: %copy.32711.remat = f32[80,12,128,128]{3,2,1,0:T(8,128)} copy(f32[80,12,128,128]{2,3,1,0:T(8,128)
0%| | 2/18060 [08:12<1234:22:09, 246.08s/it]Traceback (most recent call last):
File "/home/rabeeh//internship/seq2seq/xla_spawn.py", line 71, in <module>
main()
File "/home/rabeeh//internship/seq2seq/xla_spawn.py", line 67, in main
xmp.spawn(mod._mp_fn, args=(), nprocs=args.num_cores)
File "/anaconda3/envs/torch-xla-1.7/lib/python3.6/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 395, in spawn
start_method=start_method)
File "/anaconda3/envs/torch-xla-1.7/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/anaconda3/envs/torch-xla-1.7/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 112, in join
(error_index, exitcode)

rabeehk

👍2

All 21 comments

Having a similar issue while running Multi class classification model

sumyuck on 23 Nov 2020

@patrickvonplaten @sumyuck @sgugger

rabeehk on 23 Nov 2020

Hi
I am constantly getting this error. It looks like a bug to me, since it only appears some of the time. Could you please help me? These are expensive experiments that I am running on TPUs, and many runs fail because of this error. I would appreciate your help in fixing it.

The error I get: Exception in device=TPU:0: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
el/0 I1124 07:19:52.663760 424494 main shadow.py:87 > Traceback (most recent call last):
File "/root/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 330, in _mp_start_fn
_start_fn(index, pf_cfg, fn, args)
File "/root/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 324, in _start_fn
fn(gindex, args)
File "/workdir/seq2seq/finetune_t5_trainer.py", line 230, in _mp_fn
main()
File "/workdir/seq2seq/finetune_t5_trainer.py", line 71, in main
cache_dir=model_args.cache_dir,
File "/root/anaconda3/envs/pytorch/lib/python3.6/site-packages/transformers/configuration_utils.py", line 347, in from_pretrained
config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/root/anaconda3/envs/pytorch/lib/python3.6/site-packages/transformers/configuration_utils.py", line 388, in get_config_dict
local_files_only=local_files_only,
File "/root/anaconda3/envs/pytorch/lib/python3.6/site-packages/transformers/file_utils.py", line 955, in cached_path
local_files_only=local_files_only,
File "/root/anaconda3/envs/pytorch/lib/python3.6/site-packages/transformers/file_utils.py", line 1125, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."

rabeehkarimimahabadi on 24 Nov 2020

@sumyuck

rabeehkarimimahabadi on 24 Nov 2020

@thomwolf

rabeehkarimimahabadi on 24 Nov 2020

This is with transformers 3.5.1 and PyTorch 1.6 on a TPU v3-8, and I am using xla_spawn to launch the jobs. It looks like a general issue with the caching part.

rabeehkarimimahabadi on 24 Nov 2020
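
One way to take the network out of the picture when the files are already cached is to try a cache-only load first. The sketch below is a minimal illustration, assuming the loader accepts the `local_files_only` flag that appears in the tracebacks in this thread. `load_cache_first` and the `loader` argument are hypothetical names, and the exception types caught are an assumption about what a cache miss raises.

```python
def load_cache_first(loader, name, **kwargs):
    """Try the local cache first; only go to the network if that fails.

    `loader` is any callable that accepts a `local_files_only` keyword
    (assumed here to behave like a from_pretrained-style function).
    """
    try:
        # local_files_only=True asks the loader not to make network requests.
        return loader(name, local_files_only=True, **kwargs)
    except (OSError, ValueError):
        # Nothing usable in the cache: allow one normal (networked) attempt.
        return loader(name, local_files_only=False, **kwargs)
```

With transformers installed, `loader` could be something like `AutoConfig.from_pretrained`. On a TPU run, populating the cache once before launching xla_spawn should let every spawned process take the cache-only path, although that has not been verified against this setup.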

Same for me. I get this error when executing the following line:
tokenizer = LxmertTokenizer.from_pretrained('unc-nlp/lxmert-base-uncased')

File "/Users/xxx/anaconda3/envs/test/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1629, in from_pretrained
local_files_only=local_files_only,
File "/Users/xxx/anaconda3/envs/test/lib/python3.7/site-packages/transformers/file_utils.py", line 955, in cached_path
local_files_only=local_files_only,
File "/Users/xxx/anaconda3/envs/test/lib/python3.7/site-packages/transformers/file_utils.py", line 1125, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

alkeshpatel11 on 25 Nov 2020

To me this is not a connection issue. I do have a connection; the issue seems to be in the caching mechanism.

rabeehk on 25 Nov 2020

I am having the same issue too. I am pointing to the cache directory where pytorch is saving the models:
`from transformers import AutoModel, AutoTokenizer

cache_dir = '/home/me/.cache/torch/transformers/'
modelpath = "bert-base-uncased"

model = AutoModel.from_pretrained(modelpath, cache_dir=cache_dir)
tokenizer = AutoTokenizer.from_pretrained(modelpath, cache_dir=cache_dir)
`
And I am getting a connection error. pytorch: 1.7.0, transformers: 3.5.1.

nishkalavallabhi on 25 Nov 2020

Working on a fix, hopefully fixed for good today.

Meanwhile, as a workaround, retrying a couple of minutes later should do the trick.

julien-c on 25 Nov 2020

👍3
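
The retry workaround suggested above can be sketched as a small wrapper. This is only an illustration: `retry`, its parameters, and the wait times are hypothetical, and it assumes the failure surfaces as the `ValueError` shown in the tracebacks (or as an `OSError`).

```python
import time


def retry(load, attempts=4, first_wait=30.0, sleep=time.sleep):
    """Call load() until it succeeds, doubling the wait after each failure."""
    wait = first_wait
    for attempt in range(1, attempts + 1):
        try:
            return load()
        except (ValueError, OSError) as err:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            print(f"attempt {attempt} failed ({err}); retrying in {wait:.0f}s")
            sleep(wait)
            wait *= 2
```

Assuming transformers is installed, a call could look like `retry(lambda: AutoTokenizer.from_pretrained("bert-base-uncased"))`.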

I deleted the whole cache, redownloaded all models and ran again. It seems to be working as of now.

nishkalavallabhi on 25 Nov 2020
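
For reference, clearing the cache can be sketched as below. `clear_cache` is a hypothetical helper, and the directory to pass is whatever `cache_dir` is in use (for example the `/home/me/.cache/torch/transformers/` path mentioned earlier in the thread). It deletes everything in that directory, so the models are downloaded again on the next load.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir):
    """Delete everything inside cache_dir and return how many entries were removed."""
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return 0
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real sub-directory: remove it with all of its contents.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove just this entry.
            entry.unlink()
        removed += 1
    return removed
```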

Scaling of connectivity for model hosting should be way improved now. Please comment here if you still experience connectivity issues from now on.

Thanks!

julien-c on 25 Nov 2020

👍1

I am still getting this error with transformers 3.5.1 and torch 1.7.0 on Python 3.6.9. Please check. I have tried deleting the whole cache and installing transformers both with pip and from source, but I keep getting the same error.

AshishDuhan on 26 Nov 2020

@AshishDuhan Are you loading a model in particular? Do you have a code snippet that consistently fails for you?

julien-c on 26 Nov 2020

import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

# Source text to summarize (empty in the post).
src_text = [""""""]
model_name = 'google/pegasus-cnn_dailymail'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Both from_pretrained calls fetch files from the model hub unless they are cached.
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)

batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest').to(torch_device)
translated = model.generate(**batch)
tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
print('Summary:', tgt_text[0])

This is one of the models I am trying to load. I have tried other models too, and nothing works. Even the basic command fails with the following error:

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/pipelines.py", line 2828, in pipeline
framework = framework or get_framework(model)
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/pipelines.py", line 106, in get_framework
model = AutoModel.from_pretrained(model, revision=revision)
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/modeling_auto.py", line 636, in from_pretrained
pretrained_model_name_or_path, return_unused_kwargs=True, **kwargs
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/configuration_auto.py", line 333, in from_pretrained
config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/configuration_utils.py", line 388, in get_config_dict
local_files_only=local_files_only,
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/file_utils.py", line 955, in cached_path
local_files_only=local_files_only,
File "/opt/app/jupyter/environments/env_summarization/lib/python3.6/site-packages/transformers/file_utils.py", line 1125, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

AshishDuhan on 26 Nov 2020

Our connectivity has been good these past 24 hours so this might be a different (local) issue, @AshishDuhan.

Are you behind a proxy by any chance?

Does curl -i https://huggingface.co/google/pegasus-cnn_dailymail/resolve/main/config.json work from your machine? Can you try what you're doing from a machine in the cloud, like a Google Colab?

julien-c on 26 Nov 2020
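
The same check can be run from Python when curl is not available. This is a minimal sketch using only the standard library. `config_url`, `probe`, and `environment_proxies` are hypothetical helper names, and the URL pattern is simply the one from the curl command above.

```python
import urllib.error
import urllib.request


def config_url(model_id, revision="main"):
    """Build a config.json URL following the pattern quoted in the comment above."""
    return f"https://huggingface.co/{model_id}/resolve/{revision}/config.json"


def probe(url, timeout=10.0):
    """Send a HEAD request; return the HTTP status code or a failure description."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 403 or 404).
        return err.code
    except (urllib.error.URLError, OSError) as err:
        # No usable answer at all: DNS failure, refused connection, timeout, ...
        return f"failed: {err}"


def environment_proxies():
    """Return the proxy settings that urllib would pick up for this process."""
    return urllib.request.getproxies()
```

Calling `probe(config_url("google/pegasus-cnn_dailymail"))` returns the HTTP status code if the host is reachable, or a string describing the failure. A non-empty result from `environment_proxies()` suggests the process is going through a proxy.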

I am still facing the same issue:

Traceback (most recent call last):
File "Untitled.py", line 59, in <module>
tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
File "/project/6001557/akallada/digipath/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 310, in from_pretrained
config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
File "/project/6001557/akallada/digipath/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 341, in from_pretrained
config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/project/6001557/akallada/digipath/lib/python3.7/site-packages/transformers/configuration_utils.py", line 386, in get_config_dict
local_files_only=local_files_only,
File "/project/6001557/akallada/digipath/lib/python3.7/site-packages/transformers/file_utils.py", line 1007, in cached_path
local_files_only=local_files_only,
File "/project/6001557/akallada/digipath/lib/python3.7/site-packages/transformers/file_utils.py", line 1177, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

AishwaryaAllada on 1 Dec 2020

I'm having the same connection issue. I've tried with and without passing my proxies into BertModel.

ValueError Traceback (most recent call last)
in
1 from transformers import BertTokenizer, BertModel
----> 2 model = BertModel.from_pretrained("bert-base-uncased", **proxies)

~/opt/anaconda3/envs/milglue/lib/python3.8/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
865 if not isinstance(config, PretrainedConfig):
866 config_path = config if config is not None else pretrained_model_name_or_path
--> 867 config, model_kwargs = cls.config_class.from_pretrained(
868 config_path,
869 *model_args,

~/opt/anaconda3/envs/milglue/lib/python3.8/site-packages/transformers/configuration_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
345
346 """
--> 347 config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
348 return cls.from_dict(config_dict, **kwargs)
349

~/opt/anaconda3/envs/milglue/lib/python3.8/site-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
380 try:
381 # Load from URL or cache if already cached
--> 382 resolved_config_file = cached_path(
383 config_file,
384 cache_dir=cache_dir,

~/opt/anaconda3/envs/milglue/lib/python3.8/site-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, local_files_only)
946 if is_remote_url(url_or_filename):
947 # URL, so get it from the cache (downloading if necessary)
--> 948 output_path = get_from_cache(
949 url_or_filename,
950 cache_dir=cache_dir,

~/opt/anaconda3/envs/milglue/lib/python3.8/site-packages/transformers/file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, local_files_only)
1122 )
1123 else:
-> 1124 raise ValueError(
1125 "Connection error, and we cannot find the requested files in the cached path."
1126 " Please try again or make sure your Internet connection is on."

ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

cdo03c on 2 Dec 2020
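
One thing that stands out in the traceback above is the call `BertModel.from_pretrained("bert-base-uncased", **proxies)`. The `cached_path` signature shown there has a `proxies` parameter, so, assuming `from_pretrained` forwards a `proxies` keyword to it, the dictionary probably needs to be passed as `proxies=proxies` rather than unpacked. The stand-in function below (hypothetical, not the library's code, with a placeholder proxy address) only shows how the two call styles bind their arguments:

```python
def from_pretrained_stub(name, proxies=None, **kwargs):
    """Hypothetical stand-in that only reports how its arguments were bound."""
    return {"proxies": proxies, "extra_kwargs": kwargs}


# Placeholder proxy address, for illustration only.
proxies = {
    "http": "http://proxy.example:3128",
    "https": "http://proxy.example:3128",
}

# Unpacking: the dict keys become separate keyword arguments "http" and "https",
# so the `proxies` parameter itself stays at its default of None.
unpacked = from_pretrained_stub("bert-base-uncased", **proxies)

# Keyword: the whole dict is bound to the `proxies` parameter.
by_keyword = from_pretrained_stub("bert-base-uncased", proxies=proxies)
```

This is not necessarily the cause of the error in this case, but it would mean the proxy settings never reached the download code.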

Hard to say without seeing your full networking environment.

If you try to curl -I the URLs that you get on the arrow icons next to files in e.g. https://huggingface.co/bert-base-uncased/tree/main (or equivalent page for the model you try to download), what happens?

julien-c on 2 Dec 2020

It happened to me too. Is there any fix for that?

gokcesurenkok on 4 Dec 2020

Is it transient or permanent (i.e., if you relaunch the command, does it happen again)? You need to give us some more details for us to be able to help you troubleshoot.

julien-c on 4 Dec 2020
