Is there a way to generate with pre-trained BART, like the one in
https://huggingface.co/blog/how-to-generate ?
I am currently using BART for a generation task, but I am fine-tuning it.
I was wondering if it's possible to see generation results from pre-trained BART.
Bart is an encoder-decoder model, so it should rather be used for translating one sequence into another. This means that the generation method expects input_ids and creates decoder_input_ids.
Maybe you can take a look at this: https://sshleifer.github.io/blog_v2/jupyter/2020/03/12/bart.html
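For illustration, a minimal sketch of that sequence-to-sequence workflow, using the bart-large-cnn summarization checkpoint that comes up below (the input text is just a placeholder):

from transformers import AutoTokenizer, BartForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained('bart-large-cnn')
model = BartForConditionalGeneration.from_pretrained('bart-large-cnn')

# Encode the source sequence; generate() creates the decoder_input_ids internally.
input_ids = tokenizer.batch_encode_plus(["Some article text to summarize."], return_tensors='pt')['input_ids']
output = model.generate(input_ids=input_ids, max_length=50, num_beams=4)
print(tokenizer.decode(output[0]))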
I think I might have found a potential issue with BartForConditionalGeneration. In a zero-shot setup, the vanilla bart-large model produces gibberish, while bart-large-cnn can generate fluent language. I think the problem is the default value of the output_past attribute of BartConfig.
Example:
from transformers import AutoTokenizer, BartForConditionalGeneration

model_name_or_path = 'bart-large'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = BartForConditionalGeneration.from_pretrained(model_name_or_path)

text = "Trump falsely denied that he claimed governors from certain states"
# Encode the source text; generate() builds the decoder inputs itself.
input_ids = tokenizer.batch_encode_plus([text], return_tensors='pt')['input_ids']
output = model.generate(input_ids=input_ids, max_length=50, num_beams=1)
print(tokenizer.decode(output[0]))
If model_name_or_path="bart-large", the result will be <s>Mr\'<s>Mr\'Mr"Mr""<s>Mr"Mr"\'Mr"<s>Mr"<s>Mr"<s>Mr"<s>Mr"Mr"<s>Mr"<s>Mr\'Mr"\'Mr"Mr"\'Mr"Mr.
If it is set to bart-large-cnn, the result will be </s><s><s><s>Trump falsely denied that he claimed governors from certain states. Trump falsely denied he claimed that he had been in contact with governors from some states. He also falsely denied saying he had met with governors of certain states in the past. Trump
But once I override the output_past flag in the config, the result of bart-large will be normal:
from transformers import BartConfig

config = BartConfig.from_pretrained('bart-large')
config.output_past = True  # cache decoder states during autoregressive generation
model = BartForConditionalGeneration.from_pretrained(model_name_or_path, config=config)
...
Result would be: <s>MrThreatening to deport immigrants from certain states</s>
This seems to be related to autoregressive decoding, where the decoder states need to be cached. I'm not sure if this is intended so that bart-large is always used as a masked language model; correct me if I'm wrong.
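To make the caching point concrete, here is a rough sketch of a manual greedy decoding loop that reuses cached decoder states. It assumes a later transformers release, where the output_past flag was replaced by use_cache, the checkpoint lives under facebook/bart-large, and the model returns past_key_values; it illustrates the mechanism rather than the exact internals of generate():

import torch
from transformers import AutoTokenizer, BartForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')

input_ids = tokenizer("Some source text", return_tensors='pt').input_ids
# Run the encoder once; only the decoder steps autoregressively.
encoder_outputs = model.get_encoder()(input_ids, return_dict=True)

decoder_ids = torch.tensor([[model.config.decoder_start_token_id]])
past = None
for _ in range(20):
    # With caching enabled, each step feeds only the newest decoder token.
    step_input = decoder_ids if past is None else decoder_ids[:, -1:]
    out = model(encoder_outputs=encoder_outputs, decoder_input_ids=step_input,
                past_key_values=past, use_cache=True, return_dict=True)
    past = out.past_key_values
    next_token = out.logits[:, -1].argmax(dim=-1, keepdim=True)
    decoder_ids = torch.cat([decoder_ids, next_token], dim=-1)
print(tokenizer.decode(decoder_ids[0]))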
Thanks Xinyu. I owe you a drink :)
@sshleifer - maybe you can answer this better than I can
@patrickvonplaten
>>> model = BartForConditionalGeneration(model_name_or_path, config=c)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() got multiple values for argument 'config'
Getting this error. Also, is there a way to force a generation to contain prefix tokens? I know fairseq has this feature.
@tuhinjubcse
You need to use from_pretrained. You can pass in configuration options as keyword arguments: BartForConditionalGeneration.from_pretrained(model_name, **c.__dict__)
You can pass the decoder_start_input_ids kwarg to generate.
@XinyuHua you are correct!
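On the prefix question, a hedged sketch: in later transformers releases, generate() for encoder-decoder models also accepts decoder_input_ids, which lets you seed the output with a multi-token prefix that the model then continues; the checkpoint and texts below are just illustrations:

from transformers import AutoTokenizer, BartForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-cnn')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')

input_ids = tokenizer("Some source text to condition on", return_tensors='pt').input_ids
# Tokenize the desired prefix without special tokens so generation continues from it.
prefix = "The output should start with"
decoder_input_ids = tokenizer(prefix, return_tensors='pt', add_special_tokens=False).input_ids
output = model.generate(input_ids=input_ids, decoder_input_ids=decoder_input_ids, max_length=50)
print(tokenizer.decode(output[0]))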
Idk the results look pretty bad to me @sshleifer
from transformers import AutoTokenizer, BartForConditionalGeneration, BartConfig

c = BartConfig.from_pretrained('bart-large')
c.output_past = True

model_name_or_path = 'bart-large'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = BartForConditionalGeneration.from_pretrained(model_name_or_path, config=c)

text = "Milton scrunched his eyes and moodily turned back to his computer like a"
input_ids = tokenizer.batch_encode_plus([text], return_tensors='pt')['input_ids']
# Sample a continuation with top-k sampling.
output = model.generate(input_ids=input_ids, do_sample=True, max_length=50, top_k=5, temperature=0.7)
print(tokenizer.decode(output[0]))
The output I got is MrMilton
I'm not super surprised, since 'bart-large' is not finetuned on a generative task.
@sshleifer do you suggest using a different checkpoint or model?
The reason I am asking is that I am fine-tuning on a novel dataset created for a task,
but I need a baseline showing how pretrained BART does, because based on GPT-2 it seems to do decently on generative tasks.
I think it depends on the task, but I haven't tried using Bart for the "text continuation" type of workflow. CTRL, GPT-2, or T5 could work better.
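For comparison, a minimal text-continuation sketch with GPT-2 in the style of the how-to-generate post linked above, reusing the same sampling settings as the BART attempt (the prompt is just a placeholder):

from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

prompt = "Milton scrunched his eyes and moodily turned back to his computer like a"
input_ids = tokenizer.encode(prompt, return_tensors='pt')
# Decoder-only models continue the prompt directly, so the prefix is preserved.
output = model.generate(input_ids, do_sample=True, max_length=50, top_k=5, temperature=0.7)
print(tokenizer.decode(output[0]))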
@sshleifer Let me be a bit clearer.
I wanted to do something like
text_input = “Milton scrunched his eyes and moodily turned back to his computer helpless”
text_output = “Milton scrunched his eyes and moodily turned back to his computer like a”
I want my output to contain text_output as a prefix.
Normally, when I was fine-tuning BART, I had paired data:
Milton scrunched his eyes and moodily turned back to his computer helpless----->Milton scrunched his eyes and moodily turned back to his computer like a despondent child
The generation result was:
Milton scrunched his eyes and moodily turned back to his computer like a child caught in the headlights
I want to be able to get some results without fine-tuning and just using pretrained BART to compare. How do I do that?
The short answer is that I don't know; we don't have that use case supported with Bart.
For now I am going to close this, but feel free to open a discussion issue about your task.