Transformers: ValueError: You have to specify either input_ids or inputs_embeds!

Created on 4 Apr 2020 · 19 Comments · Source: huggingface/transformers

Details

I'm quite new to NLP tasks. I was trying to train the T5-large model and set things up as follows, but unfortunately I got an error.

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model(transformer, max_len=512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)
    model = Model(inputs=input_word_ids, outputs=out)
    return model

model = build_model(transformer_layer, max_len=MAX_LEN)

It throws

ValueError: in converted code:
ValueError                                Traceback (most recent call last)
<ipython-input-19-8ad6e68cd3f5> in <module>
----> 5     model = build_model(transformer_layer, max_len=MAX_LEN)
      6 
      7 model.summary()

<ipython-input-17-e001ed832ed6> in build_model(transformer, max_len)
     31     """
     32     input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
---> 33     sequence_output = transformer(input_word_ids)[0]
     34     cls_token = sequence_output[:, 0, :]
     35     out = Dense(1, activation='sigmoid')(cls_token)
ValueError: You have to specify either input_ids or inputs_embeds


All 19 comments

Hi @innat,

T5 is an encoder-decoder model so you will have to provide both input_ids and decoder_input_ids to the model. Maybe taking a look at the T5 docs (especially the "Examples") can help you :-)
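For example, a minimal forward pass can look roughly like the sketch below (the model name, example texts, and exact call signature are illustrative and vary a bit between versions):

```
# Rough sketch of a TF T5 forward pass with both encoder and decoder inputs.
# Model name and texts are just examples, not taken from the original issue.
from transformers import T5Tokenizer, TFT5Model

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = TFT5Model.from_pretrained("t5-small")

input_ids = tokenizer.encode(
    "translate English to German: How are you?", return_tensors="tf"
)
decoder_input_ids = tokenizer.encode("Wie geht es dir?", return_tensors="tf")

# Both tensors are needed for an encoder-decoder forward pass.
outputs = model(input_ids, decoder_input_ids=decoder_input_ids)
sequence_output = outputs[0]  # decoder's last hidden states
```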

Just noticed that the Examples docstring for TF T5 was wrong. It is fixed with #3636.

@patrickvonplaten
Hello, sorry to bother you. Would you please check the following piece of code:

Imports

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from transformers import TFAutoModel, AutoTokenizer

# First load the real tokenizer
tokenizer = AutoTokenizer.from_pretrained('t5-small')
transformer_layer = TFAutoModel.from_pretrained('t5-small')

Define Encoder

def encode(texts, tokenizer, maxlen=512):
    enc_di = tokenizer.batch_encode_plus(
        texts, 
        return_attention_masks=False, 
        return_token_type_ids=False,
        pad_to_max_length=True,
        max_length=maxlen
    )
    return np.array(enc_di['input_ids'])

# tokenized
x_train = encode('text', tokenizer, maxlen=200)
y_train

Define Model and Call

def build_model(transformer, max_len=512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)

    model = Model(inputs=input_word_ids, outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])

    return model

# calling
model = build_model(transformer_layer, max_len=200)

Now, according to the docstring, should I do,

outputs = model(input_ids=x_train, decoder_input_ids=x_train)[0]

?

I'm not 100% sure what you want to do here exactly. T5 is always trained in a text-to-text format. We have a section here on how to train T5: https://huggingface.co/transformers/model_doc/t5.html#training

Otherwise I'd recommend taking a look at the official paper.
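(As an aside, not from the original reply: the training section linked above boils down to something like the following sketch. The label keyword has changed between releases — lm_labels in older ones, labels later — so treat the exact argument names as illustrative.)

```
# Rough sketch of T5's text-to-text training setup (PyTorch, as in the linked docs).
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Input and target are both plain text; the task is encoded in the prefix.
input_ids = tokenizer.encode(
    "translate English to German: The house is wonderful.", return_tensors="pt"
)
labels = tokenizer.encode("Das Haus ist wunderbar.", return_tensors="pt")

# The model derives decoder_input_ids internally by shifting the labels right.
outputs = model(input_ids=input_ids, labels=labels)
loss = outputs[0]
```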

@patrickvonplaten Thanks for this. I encountered the same issue and this resolved it!

I'm wondering if it makes sense to make the error message capture the requirement of providing both input_ids and decoder_input_ids, since this is an encoder-decoder model? This may make the fix clearer for users of encoder-decoder models in the future.

I.e., for encoder-decoder models, switch the error message from:

ValueError: You have to specify either input_ids or inputs_embeds

to:

ValueError: You have to specify either (input_ids and decoder_input_ids) or inputs_embeds

I can send this as a PR as well if you think it makes sense!

Hi @enzoampil,

A PR for a cleaner error message would be nice if you feel like it :-). It would be good if the error message changed between ValueError: You have to specify either input_ids or inputs_embeds when self.is_decoder == False and ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds when self.is_decoder == True. So adding a simple if statement for the error message is definitely a good idea!
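A standalone sketch of what that check could look like (hypothetical helper for illustration only, not the code that was actually merged; the real check lives inside the model's forward/call method):

```
# Hypothetical helper illustrating the suggested is_decoder-dependent message.
def check_inputs(input_ids, inputs_embeds, is_decoder):
    if input_ids is None and inputs_embeds is None:
        prefix = "decoder_" if is_decoder else ""
        raise ValueError(
            f"You have to specify either {prefix}input_ids or {prefix}inputs_embeds"
        )
```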

Got it, will do. Thanks for the pointers! 😄

Hi, I also got the same error when training seq2seq with tf.keras, and I could not follow the example you provide at https://huggingface.co/transformers/model_doc/t5.html#training (this example is for PyTorch, I think).

I create x_encoder as input_ids and x_decoder_in as decoder_input_ids:

model = TFT5Model.from_pretrained('t5-base')
model.compile('adam',loss='sparse_binary_crossentropy')

So when I want to train the model I simply do
model.fit({'input_ids': x_encoder, 'decoder_input_ids': x_decoder_in})

where I clearly provide input_ids, but still get this error message:
ValueError: You have to specify either input_ids or inputs_embeds

Note that changing the input from a dict to a list gives the same error. Changing the model from TFT5Model to TFT5ForConditionalGeneration gives the same error. Changing the loss to BCE gives the same error.

Moreover, changing the input to only one array
model.fit({'input_ids': x_encoder})
also raises an error:

ValueError: No data provided for "decoder_input_ids". Need data for each key in: ['decoder_input_ids', 'input_ids']

In class TFT5Model(TFT5PreTrainedModel):

I found these lines (899-900):
```
    # retrieve arguments
    input_ids = kwargs.get("inputs", None)
```

Shouldn't it be kwargs.get("input_ids", None) ??

@ratthachat - thanks for your message!
We definitely need to provide more TF examples for the T5 model. I want to tackle this problem in ~2 weeks.

In TF we use the naming convention inputs, so you should change your call to model.fit({"inputs": x_encoder}). I very much agree that the error message is quite misleading and I correct it in this PR: #4401.
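In other words, keeping the rest of the Keras setup from the earlier comment, the call would look roughly like this sketch (x_encoder and x_decoder_in are the arrays from that comment; y_decoder_out is a hypothetical target array, since Keras still needs something to compute the loss against):

```
# Sketch only: same arrays as above, with the encoder input key renamed to "inputs".
model.fit(
    {"inputs": x_encoder, "decoder_input_ids": x_decoder_in},
    y_decoder_out,   # hypothetical targets, not defined in the original comment
    epochs=1,
)
```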

Thanks for your consideration, Patrick!

@patrickvonplaten Sorry to tag you in this old thread, but is there any official T5 TF example yet (as you mentioned earlier in this thread)?

@ratthachat - no worries, we should definitely add more TF T5 examples; we still don't have a good TF T5 notebook.
I am moving the discussion to the forum, and if no one answers I will spend some time porting a T5 PT notebook to TF.

Hi @patrickvonplaten, I wanted to fine-tune T5 using TF 2.0, but it's so confusing at every turn compared to PyTorch, which is really well documented; all current examples (community + official) are for PyTorch. Is the work on the TF T5 notebook underway?

Okay, it seems like no one has a complete TF T5 notebook. I will start working on one this week: https://discuss.huggingface.co/t/how-to-train-t5-with-tensorflow/641/6

Should be done by next week sometime :-)

Hi @patrickvonplaten
Please help me with this error.

I'm doing inference with a T5-base model which I finetuned on GLUE tasks.

It's giving an error like:
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

While doing inference, we just need to provide input_ids for the encoder, right?
Why do we need decoder_input_ids?

And as it's inference, my labels will also be None, so this part will not execute:
decoder_input_ids = self._shift_right(labels)

Waiting for your reply.
Thank you.
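(Side note on the shift-right step quoted above, not part of the original comment: simplified, it does roughly the following. This is a TF sketch for illustration; the real _shift_right also handles details such as replacing masked label positions with the pad token.)

```
import tensorflow as tf

# Simplified sketch of the shift-right step: prepend the decoder start token
# (T5 uses the pad token, id 0) and drop the last position. Illustrative only.
def shift_right(labels, decoder_start_token_id=0):
    start = tf.zeros_like(labels[:, :1]) + decoder_start_token_id
    return tf.concat([start, labels[:, :-1]], axis=-1)

labels = tf.constant([[8774, 6, 149, 33, 25, 58, 1]])  # made-up token ids ending in </s> (id 1)
decoder_input_ids = shift_right(labels)                # [[0, 8774, 6, 149, 33, 25, 58]]
```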

@prashant-kikani it is indeed strange behavior. Have you tried passing input_ids as decoder_input_ids, like:

input_ids = tokenizer(..., return_tensors='tf')   # use 'pt' instead of 'tf' for PyTorch
outputs = model(input_ids=input_ids, decoder_input_ids=input_ids)

assert len(outputs)==3, 'must return 3 tensors when inferencing'

Hi @HarrisDePerceptron
We can do that, and it gives some output, but it's not the right thing to do.

You see, T5, being a Transformer, is a text-to-text model.
So, when the label is available, it can do inference in one pass by matrix multiplication (teacher forcing).

But when the label is not available, we need to go sequentially, doing a forward pass through the decoder for each word until </s> is produced.
We need to concatenate the decoder's last output with its new input each time.

What do you think?

@prashant-kikani @HarrisDePerceptron

For decoder_input_ids, we just need to put a single BOS token so that the decoder knows this is the beginning of the output sentence. (Even in a GLUE task, T5 still treats every output label as a complete sentence.)

We can see a concrete example by looking at the function prepare_inputs_for_generation, which is called by model.generate.
(The generate function is here: https://github.com/huggingface/transformers/blob/master/src/transformers/generation_tf_utils.py)

See line 298 in the above link:

if self.config.is_encoder_decoder:
    if decoder_start_token_id is None:
        decoder_start_token_id = bos_token_id

and line 331:

# create empty decoder_input_ids
input_ids = (
    tf.ones(
        (effective_batch_size * num_beams, 1),
        dtype=tf.int32,
    )
    * decoder_start_token_id
)

and see T5's prepare_inputs_for_generation implementation, which turns the above input_ids into decoder_input_ids, at:
https://github.com/huggingface/transformers/blob/08f534d2da47875a4b7eb1c125cfa7f0f3b79642/src/transformers/modeling_tf_t5.py#L1367
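So in practice, for plain inference you can let generate() build the decoder inputs for you. A minimal sketch (the model name and the task prefix are just examples, not from the original finetuned checkpoint):

```
from transformers import T5Tokenizer, TFT5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = TFT5ForConditionalGeneration.from_pretrained("t5-base")

input_ids = tokenizer.encode(
    "cola sentence: The book was written by me.", return_tensors="tf"
)

# generate() seeds the decoder with decoder_start_token_id (the pad token for T5)
# and then decodes autoregressively until </s> or max_length.
output_ids = model.generate(input_ids, max_length=10)
print(tokenizer.decode(output_ids[0]))
```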

