Transformers: Uploading models using transformers-cli fails

Created on 30 Sep 2020 · 12 Comments · Source: huggingface/transformers

Environment info

  • transformers version: 3.0.2
  • Platform: Linux-4.15.0-112-generic-x86_64-with-glibc2.10
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.6.0 (False)
  • Tensorflow version (GPU?): 2.3.0 (False)
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

Model Cards: @julien-c
T5: @patrickvonplaten

Information

Model I am using: T5

The problem arises when using:

  • [X] the official example scripts: (give details below)
  • [ ] my own modified scripts: (give details below)

The task I am working on is:

  • [ ] an official GLUE/SQUaD task: (give the name)
  • [X] my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

Command:
transformers-cli upload ./prot_t5_xl_bfd/ --organization Rostlab

Error:

About to upload file /mnt/lsf-nas-1/lsf/job/repo/elnaggar/prot-transformers/models/transformers/prot_t5_xl_bfd/pytorch_model.bin to S3 under filename prot_t5_xl_bfd/pytorch_model.bin and namespace Rostlab
Proceed? [Y/n] y                                                                                                                                                                                          
Uploading... This might take a while if files are large                                                                                                                                                   
  0%|▌                                                                                                                                               | 48242688/11276091454 [00:02<14:55, 12534308.31it/s]
Traceback (most recent call last):                                                                                                                                                                        
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen                                               
    httplib_response = self._make_request(                                                                                                                                                                
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request                                         
    conn.request(method, url, **httplib_request_kw)                                                                                                                                                       
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1255, in request                                                                       
    self._send_request(method, url, body, headers, encode_chunked)                                                                                                                                        
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1301, in _send_request                                                                 
    self.endheaders(body, encode_chunked=encode_chunked)                                                                                                                                                  
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1250, in endheaders                                                                    
    self._send_output(message_body, encode_chunked=encode_chunked)                                                                                                                                        
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1049, in _send_output                                                                  
    self.send(chunk)                                                                                                                                                                                      
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 971, in send                                                                           
    self.sock.sendall(data)                                                                                                                                                                               
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/ssl.py", line 1204, in sendall                                                                               
    v = self.send(byte_view[count:])                                                                                                                                                                      
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/ssl.py", line 1173, in send                                                                                  
    return self._sslobj.write(data)                                                                                                                                                                       
BrokenPipeError: [Errno 32] Broken pipe        

Traceback (most recent call last):                                                                                                                                                                        
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/requests/adapters.py", line 439, in send                                                       
    resp = conn.urlopen(                                                                                                                                                                                  
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen                                               
    retries = retries.increment(                                                                                                                                                                          
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/util/retry.py", line 403, in increment                                                 
    raise six.reraise(type(error), error, _stacktrace)                                                                                                                                                    
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/packages/six.py", line 734, in reraise                                                 
    raise value.with_traceback(tb)                                                                                                                                                                        
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen                                               
    httplib_response = self._make_request(                                                                                                                                                                
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request                                         
    conn.request(method, url, **httplib_request_kw)                                                                                                                                                       
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1255, in request                                                                       
    self._send_request(method, url, body, headers, encode_chunked)                                                                                                                                        
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1301, in _send_request                                                                 
    self.endheaders(body, encode_chunked=encode_chunked)                                                                                                                                                  
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 1049, in _send_output
    self.send(chunk)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/http/client.py", line 971, in send
    self.sock.sendall(data)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/ssl.py", line 1204, in sendall
    v = self.send(byte_view[count:])
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/ssl.py", line 1173, in send
    return self._sslobj.write(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/bin/transformers-cli", line 8, in <module>
    sys.exit(main())
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/transformers/commands/transformers_cli.py", line 33, in main
    service.run()
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/transformers/commands/user.py", line 232, in run
    access_url = self._api.presign_and_upload(
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/transformers/hf_api.py", line 167, in presign_and_upload
    r = requests.put(urls.write, data=data, headers={"content-type": urls.type})
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/requests/api.py", line 134, in put
    return request('put', url, data=data, **kwargs)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/mnt/lsf-nas-1/lsf/job/repo/elnaggar/anaconda3/envs/transformers_covid/lib/python3.8/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))                                                                                                                                                          

Expected behavior

I am trying to upload our T5-3B model using transformers-cli, but it always fails with a BrokenPipeError.
Small files such as the configuration files upload fine, but the upload fails for the large model weight files.
I have tried two different machines and both give the same error.
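
For context, the traceback shows that the CLI pushes each file with a single requests.put to a presigned URL (hf_api.py, line 167), so the whole ~11 GB transfer has to survive one uninterrupted connection. Below is a rough sketch of the kind of retry wrapper such a PUT would need, assuming a presigned URL were available; presigned_url here is a placeholder, not something the CLI actually exposes:

import requests

def put_with_retries(presigned_url, path, content_type, max_retries=3):
    # A single-part presigned upload cannot be resumed, so every retry
    # has to re-send the file from the beginning.
    for attempt in range(1, max_retries + 1):
        try:
            with open(path, "rb") as f:
                r = requests.put(
                    presigned_url,
                    data=f,  # streamed from disk rather than built in memory
                    headers={"content-type": content_type},
                )
            r.raise_for_status()
            return r
        except requests.exceptions.RequestException:
            if attempt == max_retries:
                raise

Even with retries, every failure restarts the full 11 GB transfer, which is presumably why only the small configuration files make it through.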

All 12 comments

Yes, this is a known issue with our current system that will be fixed in ~1 month.

In the meantime, if you can upload the files to a different S3 bucket, I can cp them from there to your account on ours. Would you be able to do this?
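
For anyone reading along who does have S3 access: boto3's managed transfer splits a large file into parts and retries each part independently, which is much more robust for an 11 GB checkpoint than a single PUT. A minimal sketch, with the bucket name as a placeholder and credentials taken from the usual AWS environment:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # switch to multipart above 64 MB
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=4,
)
s3.upload_file(
    "prot_t5_xl_bfd/pytorch_model.bin",  # local checkpoint from this issue
    "some-bucket",                       # placeholder bucket name
    "prot_t5_xl_bfd/pytorch_model.bin",  # object key in the bucket
    Config=config,
)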

I don't have access to S3. However, I uploaded the model to my Dropbox:
https://www.dropbox.com/sh/0e7weo5l6g1uvqi/AADBZN_vuawdR3YOUOzZRo8Pa?dl=0

Would it be possible to download it from the Dropbox folder and upload it on your side?

Super, I'll take care of it!

Perfect, thanks a lot @patrickvonplaten for your help.
This solves my issue 😄

I will test the model to make sure everything is working as expected.

Should we close this issue since my immediate problem is solved, or should we leave it open until the transformers-cli upload problem itself is fixed?

I will leave it to you.

Let's leave it open :-)

Hi! I'm having an issue uploading a model as well. I've tried several different iterations of the CLI command to get it to work. I'm following the instructions from the model sharing docs.

Here's the info about my setup:

  • transformers version: 3.3.1
  • Platform: Ubuntu (it's a Google Cloud Platform VM)
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.4.0 (True)
  • Tensorflow version (GPU?): 2.3.1 (True)
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

First, I tried transformers-cli upload distilbert-for-food-extraction, as it says to do in the docs. This fails because for some reason the directory is not found, even though ls distilbert-for-food-extraction confirms that the directory and its files exist in this location.

(hf-nlp) charlenechambliss@charlene-gpu:~/.cache/food-ner/models$ transformers-cli upload chambliss/distilbert-for-food-extraction
2020-10-10 21:43:16.899194: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "/home/charlenechambliss/anaconda3/envs/hf-nlp/bin/transformers-cli", line 8, in <module>
    sys.exit(main())
  File "/home/charlenechambliss/anaconda3/envs/hf-nlp/lib/python3.8/site-packages/transformers/commands/transformers_cli.py", line 33, in main
    service.run()
  File "/home/charlenechambliss/anaconda3/envs/hf-nlp/lib/python3.8/site-packages/transformers/commands/user.py", line 197, in run
    files = self.walk_dir(rel_path)
  File "/home/charlenechambliss/anaconda3/envs/hf-nlp/lib/python3.8/site-packages/transformers/commands/user.py", line 180, in walk_dir
    entries: List[os.DirEntry] = list(os.scandir(rel_path))
FileNotFoundError: [Errno 2] No such file or directory: 'distilbert-for-food-extraction'

Then I tried nesting it under a directory matching my HuggingFace username, so now the path is chambliss/distilbert-for-food-extraction. Attempting the upload again seems to result in 3 out of 6 files being uploaded, then the process is aborted. Here is the full output I'm getting:

(hf-nlp) charlenechambliss@charlene-gpu:~/.cache/food-ner/models$ transformers-cli upload chambliss
2020-10-10 21:43:28.932647: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
About to upload file /home/charlenechambliss/.cache/food-ner/models/chambliss/distilbert-for-food-extraction/special_tokens_map.json to S3 under filename chambliss/distilbert-for-food-extraction/special_tokens_map.json and namespace chambliss
About to upload file /home/charlenechambliss/.cache/food-ner/models/chambliss/distilbert-for-food-extraction/vocab.txt to S3 under filename chambliss/distilbert-for-food-extraction/vocab.txt and namespace chambliss
About to upload file /home/charlenechambliss/.cache/food-ner/models/chambliss/distilbert-for-food-extraction/pytorch_model.bin to S3 under filename chambliss/distilbert-for-food-extraction/pytorch_model.bin and namespace chambliss
About to upload file /home/charlenechambliss/.cache/food-ner/models/chambliss/distilbert-for-food-extraction/config.json to S3 under filename chambliss/distilbert-for-food-extraction/config.json and namespace chambliss
About to upload file /home/charlenechambliss/.cache/food-ner/models/chambliss/distilbert-for-food-extraction/tokenizer_config.json to S3 under filename chambliss/distilbert-for-food-extraction/tokenizer_config.json and namespace chambliss
About to upload file /home/charlenechambliss/.cache/food-ner/models/chambliss/distilbert-for-food-extraction/tf_model.h5 to S3 under filename chambliss/distilbert-for-food-extraction/tf_model.h5 and namespace chambliss
Proceed? [Y/n] Y
Uploading... This might take a while if files are large
Your file now lives at:                                                                                       
https://s3.amazonaws.com/models.huggingface.co/bert/chambliss/chambliss/distilbert-for-food-extraction/special_tokens_map.json
Your file now lives at:                                                                                       
https://s3.amazonaws.com/models.huggingface.co/bert/chambliss/chambliss/distilbert-for-food-extraction/vocab.txt
Your file now lives at:                                                                                       
https://s3.amazonaws.com/models.huggingface.co/bert/chambliss/chambliss/distilbert-for-food-extraction/pytorch_model.bin
400 Client Error: Bad Request for url: https://huggingface.co/api/presign
Filename invalid, model must be at exactly one level of nesting, i.e. "user/model_name".

If there is not a fix available for this at the moment, would it be possible to have my model uploaded via Dropbox as well?

Thanks!
Charlene

Hey @chambliss - it looks like you are uploading the wrong folder. Instead of running

~/.cache/food-ner/models$ transformers-cli upload chambliss

you should run

~/.cache/food-ner/models/chambliss$ transformers-cli upload distilbert-for-food-extraction

I think

I'll second that. If ls distilbert-for-food-extraction works and shows the correct files, transformers-cli upload distilbert-for-food-extraction should work and would be able to find the correct directory.
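
Concretely, with the layout from the logs above, it should look something like this (the file listing is taken from the upload output earlier in the thread):

cd ~/.cache/food-ner/models/chambliss
ls distilbert-for-food-extraction
# config.json  pytorch_model.bin  special_tokens_map.json
# tf_model.h5  tokenizer_config.json  vocab.txt
transformers-cli upload distilbert-for-food-extraction
# files should end up under chambliss/distilbert-for-food-extraction/ in the chambliss namespace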

@patrickvonplaten @julien-c Thanks for the response guys! I'm not sure why the directory wasn't found the first time, but I tried it again just now (from inside the /chambliss directory, so ~/.cache/food-ner/models/chambliss$ transformers-cli upload distilbert-for-food-extraction, as suggested) and it worked.

As a user, it is a little confusing that a path to the correct directory doesn't work, and that you have to be exactly one level above the directory for the upload to succeed. The example in the docs (transformers-cli upload path/to/awesome-name-you-picked/) implies that you can run the upload from anywhere relative to the folder. If the current behavior is a constraint, it may be worth updating the docs to reflect it.

Thanks again for the help!

No, it is indeed supposed to work as you describe: specifying the directory from any point in your filesystem.

Let us know if that's not the case.

Will reopen this for clarity until the fix mentioned in https://github.com/huggingface/transformers/issues/8480#issuecomment-726731046 is deployed

