Transformers: An Error report about pipeline

Created on 11 Mar 2020  ·  8 Comments  ·  Source: huggingface/transformers

🐛 Bug

Information

This may be an easy question, but it has been bothering me all day.

When I run the code:
nlp = pipeline("question-answering")

It always tells me:
Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-cased-distilled-squad-modelcard.json' to download model card file.
Creating an empty model card.

If I ignore it and continue to run the rest of the code:
nlp({
'question': 'What is the name of the repository ?',
'context': 'Pipeline have been included in the huggingface/transformers repository'
})

The error will appear:
KeyError: 'token_type_ids'
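For context, the KeyError comes from transformers' SQuAD feature conversion (`squad.py`) indexing `span["token_type_ids"]`, a key that DistilBERT's tokenizer does not emit in this version pairing. A minimal sketch of the failing pattern and a defensive fallback — the `span` dict below is a made-up stand-in for a tokenizer encoding, not real library output:

```python
import numpy as np

# Hypothetical encoding from a tokenizer that, like DistilBERT's here,
# returns no "token_type_ids" key.
span = {"input_ids": [101, 2054, 102], "attention_mask": [1, 1, 1]}

try:
    # This is the pattern that raises in squad_convert_example_to_features.
    p_mask = np.array(span["token_type_ids"])
except KeyError:
    # Fallback sketch: treat every token as belonging to segment 0.
    p_mask = np.zeros(len(span["input_ids"]), dtype=np.int64)

print(p_mask.tolist())
```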

Pipeline Version mismatch

Most helpful comment

Use:
pip install transformers==2.5.1
instead of:
pip install transformers

All 8 comments

I have this same issue, but have no problems running:

nlp = pipeline("question-answering")

Note: To install the library, I had to install tokenizers version 0.6.0 separately, git clone the transformers repo and edit the setup.py file before installing as per @dafraile's answer for issue: https://github.com/huggingface/transformers/issues/2831

Update: This error was fixed when I installed tokenizers==0.5.2

I sadly have this issue too with the newest transformers 2.6.0 version.

Tokenizers is at version 0.5.2, but the newest version of tokenizers doesn't work either.

Any solutions to fix this issue?
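Since the reports in this thread hinge on which transformers/tokenizers pair is installed, it may help to confirm the exact versions before comparing notes. A small stdlib-only check (assumes Python 3.8+ for `importlib.metadata`):

```python
from importlib import metadata

def report(pkg: str) -> str:
    """Return 'pkg==version', or a note if the package is absent."""
    try:
        return f"{pkg}=={metadata.version(pkg)}"
    except metadata.PackageNotFoundError:
        return f"{pkg} is not installed"

for pkg in ("transformers", "tokenizers"):
    print(report(pkg))
```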

I have the same issue here. I first ran with my own tokenizer and it failed, then I tried the QnA example from the 03-pipelines.ipynb notebook and got the following error.

Environment:
tensorflow==2.0.0
tensorflow-estimator==2.0.1
tensorflow-gpu==2.0.0
torch==1.4.0
transformers==2.5.1
tokenizers==0.6.0

Code that I ran:
nlp_qa = pipeline('question-answering')
nlp_qa(context='Hugging Face is a French company based in New-York.', question='Where is based Hugging Face ?')

Error output:

HBox(children=(FloatProgress(value=0.0, description='Downloading', max=230.0, style=ProgressStyle(description_…

convert squad examples to features: 0%| | 0/1 [00:00

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(args, *kwds))
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 198, in squad_convert_example_to_features
p_mask = np.array(span["token_type_ids"])
KeyError: 'token_type_ids'
"""

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in ()
1 nlp_qa = pipeline('question-answering')
----> 2 nlp_qa(context='Hugging Face is a French company based in New-York.', question='Where is based Hugging Face ?')

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/pipelines.py in __call__(self, texts, *kwargs)
968 False,
969 )
--> 970 for example in examples
971 ]
972 all_answers = []

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/pipelines.py in (.0)
968 False,
969 )
--> 970 for example in examples
971 ]
972 all_answers = []

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/data/processors/squad.py in squad_convert_examples_to_features(examples, tokenizer, max_seq_length, doc_stride, max_query_length, is_training, return_dataset, threads)
314 p.imap(annotate_, examples, chunksize=32),
315 total=len(examples),
--> 316 desc="convert squad examples to features",
317 )
318 )

~/anaconda3/envs/transformers/lib/python3.7/site-packages/tqdm/std.py in __iter__(self)
1106 fp_write=getattr(self.fp, 'write', sys.stderr.write))
1107
-> 1108 for obj in iterable:
1109 yield obj
1110 # Update and possibly print the progressbar.

~/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py in (.0)
323 result._set_length
324 ))
--> 325 return (item for chunk in result for item in chunk)
326
327 def imap_unordered(self, func, iterable, chunksize=1):

~/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py in next(self, timeout)
746 if success:
747 return value
--> 748 raise value
749
750 __next__ = next # XXX

KeyError: 'token_type_ids'

Any help would be greatly appreciated!

Use:
pip install transformers==2.5.1
instead of:
pip install transformers

Thank you @paras55, your solution worked for me!

Installing v2.7.0 should work as well.

2.7.0 fails with the same error (at least with tokenizers==0.5.2).
