Model I am using (Bert, XLNet....): bert-base-cased-finetuned-conll03-english
Language I am using the model on (English, Chinese....): English
Steps to reproduce the behavior:
I'm following the instructions at https://huggingface.co/bert-large-cased-finetuned-conll03-english but failing at the first hurdle. This is the snippet from the docs that I've run:
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased-finetuned-conll03-english")
model = AutoModel.from_pretrained("bert-large-cased-finetuned-conll03-english")
It fails with this message:
OSError: Model name 'bert-base-cased-finetuned-conll03-english' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-japanese, bert-base-japanese-whole-word-masking, bert-base-japanese-char, bert-base-japanese-char-whole-word-masking, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english/config.json' was a path or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.
The message mentions looking at https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english/config.json and finding nothing.
I also tried the CLI (transformers-cli download bert-base-cased-finetuned-conll03-english), but that failed with a similar message. However, both methods work for namespaced models, e.g. dbmdz/bert-base-italian-cased.
The community model should download. :)
I browsed https://s3.amazonaws.com/models.huggingface.co/ and see that the model is there, but its files are stored under flat names ending in -config.json rather than under a directory path ending in /config.json.
If I download the files manually and rename, the model loads. So it looks like just a naming problem.
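For anyone who wants to script the manual rename described above, here is a minimal sketch. The helper name unflatten_model_files and the list of file names are assumptions made for illustration and are not part of transformers. It also assumes the downloaded files follow the flat "<model name>-<file>" pattern mentioned above.

```python
import shutil
from pathlib import Path

# File names expected inside a model directory. This set is an assumption:
# config.json is named in the error message; the other two are the usual
# weights and vocabulary files for a PyTorch BERT checkpoint.
STANDARD_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def unflatten_model_files(download_dir, model_name, target_root):
    """Hypothetical helper: copy '<model_name>-<file>' from download_dir
    to '<target_root>/<model_name>/<file>'.

    Returns the model directory and the list of file names that were copied.
    """
    model_dir = Path(target_root) / model_name
    model_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for filename in STANDARD_FILES:
        flat_path = Path(download_dir) / f"{model_name}-{filename}"
        if flat_path.exists():
            # Copy rather than move, so the original downloads stay intact.
            shutil.copyfile(flat_path, model_dir / filename)
            copied.append(filename)
    return model_dir, copied
```

The returned directory can then be passed as a string to AutoTokenizer.from_pretrained and AutoModel.from_pretrained, which accept a local directory path.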
I can confirm what you're seeing. In the current master code, bert-large-cased-finetuned-conll03-english has no mapping in the tokenizers or models, so it can't be found the same way as, for example, bert-base-uncased.
But it works if you target the files directly:
AutoTokenizer.from_pretrained("https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english-config.json")
AutoModel.from_pretrained("https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-conll03-english-pytorch_model.bin")
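To make the difference between the two URL layouts concrete, here is a small sketch that builds both forms. The function names flat_url and directory_url are hypothetical helpers, not library functions. The prefix is taken from the URLs quoted in this thread.

```python
# Prefix copied from the URLs quoted in this thread.
S3_BERT_PREFIX = "https://s3.amazonaws.com/models.huggingface.co/bert"


def flat_url(model_name, filename):
    """Hypothetical helper: layout where the files actually live,
    '<prefix>/<model_name>-<filename>'."""
    return f"{S3_BERT_PREFIX}/{model_name}-{filename}"


def directory_url(model_name, filename):
    """Hypothetical helper: layout the loader tried according to the
    error message, '<prefix>/<model_name>/<filename>'."""
    return f"{S3_BERT_PREFIX}/{model_name}/{filename}"


if __name__ == "__main__":
    name = "bert-base-cased-finetuned-conll03-english"
    print("tried:", directory_url(name, "config.json"))
    print("found:", flat_url(name, "config.json"))
```

Only the separator before the file name differs, which is why renaming the files is enough to make the model load.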
Hmm, I think I see the issue. @stefan-it @mfuntowicz we could either:
move bert-large-cased-finetuned-conll03-english to dbmdz/bert-large-cased-finetuned-conll03-english
What do you think?
(also kinda related to #2281)
@julien-c I think it would be better to move the model under the dbmdz namespace - as it is no "official" model!
@julien-c moving to dbmdz is fine. We need to update the default NER pipeline's model provider to reflect the new path.
Model now lives at https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english
Let me know if everything works correctly!
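For reference, a minimal sketch of loading the model under its new namespaced identifier, using the same AutoTokenizer and AutoModel classes as the original snippet. The wrapper function load_ner_model is a hypothetical name used only for illustration. Calling it downloads the model files, so it needs network access and a transformers version that supports namespaced models.

```python
# New namespaced identifier, as given in this thread.
MODEL_ID = "dbmdz/bert-large-cased-finetuned-conll03-english"


def load_ner_model(model_id=MODEL_ID):
    """Hypothetical wrapper: load tokenizer and model by identifier."""
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Usage would be tokenizer, model = load_ner_model().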
Works perfectly now, thanks!