Hello, I'm trying to use seq2seq models (such as Bart and EncoderDecoderModel (bert2bert)), and I'm a little confused about input_ids, decoder_input_ids, and tgt in the model inputs.
As I understand it, in a seq2seq model the decoder input should have a special token (<s> or similar) before the sentence, and the target should have a special token (</s> or similar) after the sentence. For example: decoder_input = <s> A B C D E, target = A B C D E </s>
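The shifting described above can be sketched in plain Python. Token strings stand in for real tokenizer ids here, and the helper name is just for illustration:

```python
# Sketch of the decoder-input / target shift used in seq2seq training
# (teacher forcing). Token strings stand in for tokenizer ids.
BOS, EOS = "<s>", "</s>"

def make_decoder_pair(tokens):
    """Build the decoder input and the target from one target sentence."""
    decoder_input = [BOS] + tokens   # starts with <s>, no trailing </s>
    target = tokens + [EOS]          # ends with </s>, no leading <s>
    return decoder_input, target

dec_in, tgt = make_decoder_pair(["A", "B", "C", "D", "E"])
print(dec_in)  # ['<s>', 'A', 'B', 'C', 'D', 'E']
print(tgt)     # ['A', 'B', 'C', 'D', 'E', '</s>']
```

At each position the decoder sees the tokens up to step t and is trained to predict the token at step t+1, which is why the two sequences are the same sentence shifted by one.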
So my question is: should I set add_special_tokens=True for the encoder input_ids, and use <s> / </s> like this: input = a b c d e, decoder_input = <s> A B C D E, target = A B C D E </s>?

Hi @jungwhank,
For Bert2Bert, the pad_token is used as the decoder_start_token_id, and the input_ids and labels begin with the cls_token_id ([CLS] for BERT) and end with the sep_token_id ([SEP] for BERT).
For training, all you need to do is:
input_text = "some input text"
target_text = "some target text"
input_ids = tokenizer(input_text, add_special_tokens=True, return_tensors="pt")["input_ids"]
target_ids = tokenizer(target_text, add_special_tokens=True, return_tensors="pt")["input_ids"]
model(input_ids=input_ids, decoder_input_ids=target_ids, labels=target_ids)
The EncoderDecoderModel class takes care of adding the pad_token to the decoder_input_ids.
For inference:
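The shifting the model does internally can be sketched like this. This is a simplified re-implementation for illustration, not the actual library code, and the example ids are made up:

```python
# Simplified sketch of how labels are shifted right to produce
# decoder_input_ids inside the model (not the actual library code).
def shift_tokens_right(label_ids, decoder_start_token_id):
    """Prepend the decoder start token and drop the last label token."""
    return [decoder_start_token_id] + label_ids[:-1]

# For bert2bert the decoder start token is the pad token.
# The ids below are placeholders standing in for [CLS] ... [SEP].
pad_token_id = 0
labels = [101, 2023, 2003, 1037, 7953, 102]
print(shift_tokens_right(labels, pad_token_id))
# [0, 101, 2023, 2003, 1037, 7953]
```

This is why passing decoder_input_ids=target_ids in the training call above works: the model derives the actual shifted decoder inputs itself.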
model.generate(input_ids, decoder_start_token_id=model.config.decoder.pad_token_id)
Hope this clarifies your question. Also pinging @patrickvonplaten for more info.
Hi, @patil-suraj
Thanks for answering.
Is it the same for BartForConditionalGeneration?
Actually, I want to do a kind of translation task. Are the decoder_input_ids and labels handled the same way there?
@patil-suraj's answer is correct! For the EncoderDecoder framework, one should set model.config.decoder_start_token_id to the BOS token (which does not exist in BERT's case, so we simply use the CLS token).
Bart is a bit different:
For generation, you can just call model.generate(input_ids). input_ids always refers to the encoder input tokens for Seq2Seq models, and it is up to you whether to add special tokens or not; this is not done automatically in the generate function. For training, you pass both input_ids and decoder_input_ids, and in this case the decoder_input_ids should start with Bart's decoder start token, model.config.decoder_start_token_id: model(input_ids, decoder_input_ids=decoder_input_ids)
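Concretely, building Bart's decoder inputs by hand might look like the sketch below. The ids are made-up placeholders, and the start-token value is assumed for illustration; in practice you would read it from model.config.decoder_start_token_id:

```python
# Sketch: building Bart decoder inputs for training from plain ids.
# The id values below are assumed for illustration only.
decoder_start_token_id = 2                 # placeholder for the config value
target_ids = [713, 16, 10, 1296, 2]        # example target ids ending in </s>

# decoder_input_ids must start with decoder_start_token_id and is the
# target sequence shifted right by one position.
decoder_input_ids = [decoder_start_token_id] + target_ids[:-1]
print(decoder_input_ids)  # [2, 713, 16, 10, 1296]
```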
@patrickvonplaten
thanks for answering!
But I have a question: is there a decoder_start_token_id in BartConfig?
Should I just make my decoder_input_ids start with Bart's model.config.bos_token_id, or set model.config.decoder_start_token_id = token_id?
I think I solved the problem. Thanks
@jungwhank Great! Consider joining the awesome HF forum, if you haven't already :) It's the best place to ask such questions. The whole community is there to help you, and your questions will also help the community.