Transformers: Can't load pegasus models.

Created on 16 Aug 2020 · 10 comments · Source: huggingface/transformers

Hi,

I've tried loading Pegasus via PreTrainedModel and PreTrainedTokenizer but ran into a KeyError. I have transformers 3.0.2 - any idea why that might be happening?

All 10 comments

Not sure what you mean here. Can you please post the traceback and the code that resulted in the error?

This is what I got:

KeyError                                  Traceback (most recent call last)

<ipython-input-18-eb1fb8795ed4> in <module>()
      2 from transformers import AutoTokenizer, AutoModelWithLMHead
      3 
----> 4 tokenizer = AutoTokenizer.from_pretrained("google/pegasus-multi_news")
      5 
      6 cla = AutoModelWithLMHead.from_pretrained("google/pegasus-multi_news")

1 frames

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    204         config = kwargs.pop("config", None)
    205         if not isinstance(config, PretrainedConfig):
--> 206             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
    207 
    208         if "bert-base-japanese" in str(pretrained_model_name_or_path):

/usr/local/lib/python3.6/dist-packages/transformers/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    204 
    205         if "model_type" in config_dict:
--> 206             config_class = CONFIG_MAPPING[config_dict["model_type"]]
    207             return config_class.from_dict(config_dict, **kwargs)
    208         else:

KeyError: 'pegasus'
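For context, a minimal sketch of why this KeyError occurs (the mapping below is a simplified hypothetical stand-in, not the real CONFIG_MAPPING): in the 3.0.2 release the auto-config mapping simply has no "pegasus" entry, so the plain dict lookup in AutoConfig.from_pretrained raises.

```python
# Simplified stand-in for transformers' CONFIG_MAPPING as of the 3.0.2
# release (hypothetical entries; the real mapping maps model_type strings
# to config classes). "pegasus" is absent, so the lookup raises KeyError.
CONFIG_MAPPING = {"bert": "BertConfig", "bart": "BartConfig"}

config_dict = {"model_type": "pegasus"}
try:
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'pegasus'
```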

Yep, I've got this too.

@HenryDashwood , @Kejia I can load both of these models on master branch.

Which version of transformers are you using? Try doing this with master, as Pegasus is not available in the 3.0.2 release:

pip install -U git+https://github.com/huggingface/transformers.git
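After installing from master, one quick sanity check is to compare the installed version string (transformers.__version__) against the first release with Pegasus support. A hedged sketch; the 3.1.0 cutoff is an assumption inferred from this thread:

```python
def supports_pegasus(version: str) -> bool:
    """Return True if a transformers version string is >= 3.1.0, the first
    release assumed (per this thread) to ship Pegasus support."""
    major, minor, *_ = (int(part) for part in version.split("."))
    return (major, minor) >= (3, 1)

print(supports_pegasus("3.0.2"))  # False: the release the reporter was on
print(supports_pegasus("3.4.0"))  # True
```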

Ah of course. Cheers!

@patil-suraj I tried installing the master version using the shared URL, but it still wasn't updated to the master version.
Can you share more details on installing the master version of Transformers?

Sorry, typo: it should be -U, not -u.

pip install -U git+https://github.com/huggingface/transformers.git

I can't load the PegasusTokenizer for the checkpoint google/pegasus-pubmed:

tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-pubmed")

Error:

Traceback (most recent call last):
  File "src/download_model.py", line 17, in <module>
    tokenizer = PegasusTokenizer.from_pretrained(config['model_name'])
  File "/home/rafael/miniconda3/envs/torch/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1584, in from_pretrained
    raise EnvironmentError(
OSError: Model name 'google/pegasus-pubmed' was not found in tokenizers model name list (google/pegasus-xsum). We assumed 'google/pegasus-pubmed' was a path, a model identifier, or url to a directory containing vocabulary files named ['spiece.model'] but couldn't find such vocabulary files at this path or url.

When using the AutoTokenizer:

tokenizer = AutoTokenizer.from_pretrained("google/pegasus-pubmed")
Traceback (most recent call last):
  File "/home/rafael/miniconda3/envs/torch/lib/python3.8/site-packages/transformers/configuration_utils.py", line 368, in get_config_dict
    resolved_config_file = cached_path(
  File "/home/rafael/miniconda3/envs/torch/lib/python3.8/site-packages/transformers/file_utils.py", line 957, in cached_path
    raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file google/pegasus-pubmed/config.json not found

ping @sshleifer

I just installed it from master, still not working for me.

Environment Info:

  • transformers version: 3.4.0
  • Platform: Linux-5.4.0-51-generic-x86_64-with-glibc2.10
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.6.0 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Can't replicate on master. Please post transformers-cli env info when having download issues, and try solutions above in the thread before posting.

OK, I solved my issue.
FYI: the problem was that I had once saved the loaded model to the local directory google/pegasus-pubmed in an invalid way, so from then on the from_pretrained method tried to load it from that local path first, which did not work. Sorry for bothering you!
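The lookup order described above can be sketched as follows (a simplified assumption, not the actual Hugging Face code): from_pretrained treats an existing local directory as the model source before falling back to downloading from the hub, so a broken local save under the same name shadows the remote checkpoint.

```python
import os
import tempfile

def resolve_model_source(model_id: str) -> str:
    """Toy resolver: prefer a local directory matching the id, else the hub."""
    if os.path.isdir(model_id):
        return "local"
    return "hub"

with tempfile.TemporaryDirectory() as workdir:
    model_dir = os.path.join(workdir, "google", "pegasus-pubmed")
    print(resolve_model_source(model_dir))  # hub (no local directory yet)
    os.makedirs(model_dir)                  # simulate the accidental local save
    print(resolve_model_source(model_dir))  # local (shadows the hub checkpoint)
```

Deleting or renaming the stray local directory restores the normal hub download path.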

