Transformers: SummarizationPipeline crashes

Created on 21 May 2020 · 12 comments · Source: huggingface/transformers

summarize = pipeline("summarization")
summarize("Sam Shleifer writes the best docstring examples in the whole world.")

➡️

/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in _parse_and_tokenize(self, pad_to_max_length, *args, **kwargs)
    461         # Parse arguments
    462         inputs = self._args_parser(*args, **kwargs)
--> 463         inputs = self.tokenizer.batch_encode_plus(
    464             inputs, add_special_tokens=True, return_tensors=self.framework, pad_to_max_length=pad_to_max_length,
    465         )

AttributeError: 'dict' object has no attribute 'batch_encode_plus'
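For context, the traceback indicates the pipeline's default task configuration for `summarization` was never resolved into an actual tokenizer object, so a plain `dict` reached `batch_encode_plus`. A minimal stdlib sketch of that failure mode (the dict contents here are illustrative, not the library's real defaults):

```python
# Calling a tokenizer method on an unresolved config dict raises the
# same AttributeError reported above.
tokenizer = {"pt": "bart-large-cnn"}  # hypothetical unresolved default config

try:
    tokenizer.batch_encode_plus(["Sam Shleifer writes the best docstring examples."])
except AttributeError as err:
    print(err)  # 'dict' object has no attribute 'batch_encode_plus'
```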
Labels: Pipeline, Summarization

Most helpful comment

Yeah that sounds like this issue. It will be fixed in the next release or you can build from source with

git clone [this repo]
pip install -e .

All 12 comments

Is this issue fixed in version 2.10.0?

@julien-c I still get the same error when doing

summarizer = pipeline('summarization')

and using it to summarize.

However, the following explicitly works for me:

summarizer = pipeline('summarization', model='bart-large-cnn', tokenizer='bart-large-cnn')

Yeah that sounds like this issue. It will be fixed in the next release or you can build from source with

git clone [this repo]
pip install -e .


I have installed the package from GitHub repo but still have the same issue right now.

@khalilRhouma: It works for me at commit d976ef262e0b2c52363d201b2e14e5ecc42abbb3, so you may need to git pull or some such. If that doesn't work, I would love to see the output of

transformers-cli env

@sshleifer I get this error when I clone with that commit ID.
KeyError: "Unknown task summarization, available tasks are ['feature-extraction', 'sentiment-analysis', 'ner', 'question-answering', 'fill-mask']"
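That `KeyError` usually means an older `transformers` (one without the `summarization` task) is still being imported, for example a pip-installed copy shadowing the source checkout. A quick stdlib-only way to check which install Python actually resolves (a diagnostic sketch, not part of the library's CLI):

```python
import importlib.util

# Locate the transformers module Python would import; if this path points
# into site-packages rather than your clone, the editable install is shadowed.
spec = importlib.util.find_spec("transformers")
if spec is None:
    print("transformers is not importable in this environment")
else:
    print(spec.origin)
```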
@dipanjanS Would be great to know what configuration you used

current master should also work.

@sshleifer The kernel still crashes
Attaching the code.

!git clone https://github.com/huggingface/transformers.git
%cd transformers
!pip install -e ".[dev]"

from transformers import pipeline
import torch

#summarizer = pipeline("summarization")
summarizer = pipeline('summarization', model='facebook/bart-large-cnn', tokenizer='facebook/bart-large-cnn')  # Kernel dies after running this line

transformers version - 2.11.0
torch - 1.5.0

Can't replicate :(.
Can I see your transformers-cli env output?

How do I get that output? I'm running these on Jupyter without any virtual env

Got it.

  • transformers version: 2.11.0
  • Platform: Linux-4.14.181-108.257.amzn1.x86_64-x86_64-with-glibc2.9
  • Python version: 3.6.10
  • PyTorch version (GPU?): 1.5.0 (False)
  • Tensorflow version (GPU?): 2.2.0 (False)
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

@sshleifer It works finally. There was a problem with GPU allocation. Thanks for your response.
