Transformers: Can't load t5-11b with from_pretrained

Created on 16 Aug 2020 · 3 comments · Source: huggingface/transformers

Environment info

  • transformers version: 3.0.2
  • Platform:
  • Python version: 3.8.2
  • PyTorch version: 1.6

Who can help

T5: @patrickvonplaten

Information

The model I am using: T5

To reproduce

Steps to reproduce the behavior:

import transformers
transformers.T5ForConditionalGeneration.from_pretrained("t5-11b")
OSError: Can't load weights for 't5-11b'. Make sure that:

- 't5-11b' is a correct model identifier listed on 'https://huggingface.co/models'

- or 't5-11b' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.

Expected behavior

The model should load successfully.

All 3 comments

Hey @saareliad,
can you try:

t5 = transformers.T5ForConditionalGeneration.from_pretrained('t5-11b', use_cdn=False)

Also, see: https://github.com/huggingface/transformers/issues/5423

But the model cannot really be run until we take a closer look at https://github.com/huggingface/transformers/pull/3578.
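
For anyone hitting this before that PR lands, here is a minimal sketch of the two loading routes discussed here: the use_cdn flag from the comment above, and the local-directory option named in the error message. The directory path is illustrative and assumes pytorch_model.bin and config.json were downloaded manually:

import transformers

# Route 1: skip the CDN, whose file size limit the t5-11b checkpoint
# exceeds (see issue #5423 linked above). `use_cdn` is the
# transformers 3.x keyword used in the comment above.
t5 = transformers.T5ForConditionalGeneration.from_pretrained("t5-11b", use_cdn=False)

# Route 2: load from a local directory containing pytorch_model.bin
# and config.json (the path here is illustrative).
t5 = transformers.T5ForConditionalGeneration.from_pretrained("/path/to/t5-11b")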

@patrickvonplaten mind adding a big disclaimer to the model card for this particular checkpoint, about what you just said (the CDN limitation and the need for model parallelism)?

Thanks @patrickvonplaten,
Our work successfully adds (several types of) model parallelism, trains T5 and several other large transformers, and has been integrated with HF for quite a while.

We will open-source it soon :)
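
(For readers unfamiliar with the term: model parallelism here means splitting one model's weights across devices so a checkpoint too large for a single GPU can run. The commenter's implementation is not public yet; the snippet below is only a generic toy sketch of the layer-wise idea in plain PyTorch, assuming two CUDA devices, and is unrelated to their work or to T5 specifically.)

import torch
import torch.nn as nn

class TwoDeviceModel(nn.Module):
    # Toy layer-wise model parallelism: the first half of the network
    # lives on cuda:0, the second half on cuda:1, and forward() moves
    # the activations across the device boundary.
    def __init__(self):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(512, 512), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Linear(512, 512).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        return self.part2(x.to("cuda:1"))

model = TwoDeviceModel()
out = model(torch.randn(8, 512))  # output tensor lives on cuda:1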
