Transformers: An Error report about pipeline

Created on 11 Mar 2020  ·  8 Comments  ·  Source: huggingface/transformers

🐛 Bug

Information

This may be an easy question, but it has been bothering me all day.

When I run the code:
nlp = pipeline("question-answering")

It always tells me:
Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-cased-distilled-squad-modelcard.json' to download model card file.
Creating an empty model card.

If I ignore it and continue to run the rest of the code:
nlp({
'question': 'What is the name of the repository ?',
'context': 'Pipeline have been included in the huggingface/transformers repository'
})

The error will appear:
KeyError: 'token_type_ids'
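For context, the KeyError comes from transformers' SQuAD feature conversion (`squad.py`) indexing `span["token_type_ids"]`, a key that DistilBERT's tokenizer does not emit in this version pairing. A minimal sketch of the failing pattern and a defensive fallback — the `span` dict below is a made-up stand-in for a tokenizer encoding, not real library output:

```python
import numpy as np

# Hypothetical encoding from a tokenizer that, like DistilBERT's here,
# returns no "token_type_ids" key.
span = {"input_ids": [101, 2054, 102], "attention_mask": [1, 1, 1]}

try:
    # This is the pattern that raises in squad_convert_example_to_features.
    p_mask = np.array(span["token_type_ids"])
except KeyError:
    # Fallback sketch: treat every token as belonging to segment 0.
    p_mask = np.zeros(len(span["input_ids"]), dtype=np.int64)

print(p_mask.tolist())
```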

Pipeline Version mismatch

Most helpful comment

Use:
pip install transformers==2.5.1
instead of:
pip install transformers

All 8 comments

I have this same issue, but have no problems running:

nlp = pipeline("question-answering")

Note: To install the library, I had to install tokenizers version 0.6.0 separately, git clone the transformers repo and edit the setup.py file before installing as per @dafraile's answer for issue: https://github.com/huggingface/transformers/issues/2831

Update: This error was fixed when I installed tokenizers==0.5.2

I sadly have this issue too with the newest transformers 2.6.0 version.

Tokenizers is at version 0.5.2, but the newest version of tokenizers doesn't work either.

Any solutions to fix this issue?
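Since the reports in this thread hinge on which transformers/tokenizers pair is installed, it may help to confirm the exact versions before comparing notes. A small stdlib-only check (assumes Python 3.8+ for `importlib.metadata`):

```python
from importlib import metadata

def report(pkg: str) -> str:
    """Return 'pkg==version', or a note if the package is absent."""
    try:
        return f"{pkg}=={metadata.version(pkg)}"
    except metadata.PackageNotFoundError:
        return f"{pkg} is not installed"

for pkg in ("transformers", "tokenizers"):
    print(report(pkg))
```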

I have the same issue here. I first ran with my own tokenizer and it failed, then I tried the QnA example from the 03-pipelines.ipynb notebook and got the following error.

Environment:
tensorflow==2.0.0
tensorflow-estimator==2.0.1
tensorflow-gpu==2.0.0
torch==1.4.0
transformers==2.5.1
tokenizers==0.6.0

Code that I ran:
nlp_qa = pipeline('question-answering')
nlp_qa(context='Hugging Face is a French company based in New-York.', question='Where is based Hugging Face ?')

Error output:

HBox(children=(FloatProgress(value=0.0, description='Downloading', max=230.0, style=ProgressStyle(description_…

convert squad examples to features: 0%| | 0/1 [00:00

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(args, *kwds))
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 198, in squad_convert_example_to_features
p_mask = np.array(span["token_type_ids"])
KeyError: 'token_type_ids'
"""

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in ()
1 nlp_qa = pipeline('question-answering')
----> 2 nlp_qa(context='Hugging Face is a French company based in New-York.', question='Where is based Hugging Face ?')

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/pipelines.py in __call__(self, texts, *kwargs)
968 False,
969 )
--> 970 for example in examples
971 ]
972 all_answers = []

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/pipelines.py in (.0)
968 False,
969 )
--> 970 for example in examples
971 ]
972 all_answers = []

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/data/processors/squad.py in squad_convert_examples_to_features(examples, tokenizer, max_seq_length, doc_stride, max_query_length, is_training, return_dataset, threads)
314 p.imap(annotate_, examples, chunksize=32),
315 total=len(examples),
--> 316 desc="convert squad examples to features",
317 )
318 )

~/anaconda3/envs/transformers/lib/python3.7/site-packages/tqdm/std.py in __iter__(self)
1106 fp_write=getattr(self.fp, 'write', sys.stderr.write))
1107
-> 1108 for obj in iterable:
1109 yield obj
1110 # Update and possibly print the progressbar.

~/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py in (.0)
323 result._set_length
324 ))
--> 325 return (item for chunk in result for item in chunk)
326
327 def imap_unordered(self, func, iterable, chunksize=1):

~/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py in next(self, timeout)
746 if success:
747 return value
--> 748 raise value
749
750 __next__ = next # XXX

KeyError: 'token_type_ids'

Any help would be greatly appreciated!

Use:
pip install transformers==2.5.1
instead of:
pip install transformers

Thank you @paras55, your solution worked for me!

Installing v2.7.0 should work as well.

2.7.0 fails with the same error (at least with tokenizers==0.5.2).
