Transformers: Can I train a BART model from scratch with transformers?

Created on 18 Jun 2020 · 7 comments · Source: huggingface/transformers

Can I train a BART model from scratch with transformers?

Most helpful comment

So from the paper (https://arxiv.org/pdf/1910.13461.pdf) you can see that BART is trained by denoising input sequences that have been corrupted in almost any possible way.

One way, using BartForConditionalGeneration, could be:

from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig

tok = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration(BartConfig())  # randomly initialized, i.e. from scratch

# encoder sees the corrupted sequence; decoder input is the target shifted right
input_string = "My dog is <mask> </s>"
decoder_input_string = "<s> My dog is cute"
labels_string = "My dog is cute </s>"

input_ids = tok(input_string, add_special_tokens=False, return_tensors="pt").input_ids
decoder_input_ids = tok(decoder_input_string, add_special_tokens=False, return_tensors="pt").input_ids
labels = tok(labels_string, add_special_tokens=False, return_tensors="pt").input_ids

loss = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids, labels=labels)[0]

All 7 comments

Yes

Yes

That's awesome! Can you give a code example to show how? I'd be grateful!


Pinging @sshleifer to make sure I did not forget anything


Actually, I was going to ask how to train a model from scratch. For example, I want to train a Chinese BART model.
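For a new language like Chinese, one prerequisite not covered by the snippets in this thread is a tokenizer trained on your own corpus, since facebook/bart-large's vocabulary is English. A minimal sketch with the separate tokenizers library, where the tiny in-memory corpus is only a stand-in for your real text files:

```python
from tokenizers import ByteLevelBPETokenizer

# stand-in corpus; in practice pass your real files to train() instead
corpus = ["我的狗很可爱", "它喜欢在公园里玩"]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    corpus,
    vocab_size=1000,
    min_frequency=1,
    # BART's special tokens must be in the vocabulary so <s>, </s>, <mask> work later
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

# the trained vocabulary now contains BART's special tokens
print(tokenizer.token_to_id("<mask>"))
```

A tokenizer trained this way can then be loaded as a BartTokenizer (after saving its vocab and merges files) and used with a freshly initialized BartForConditionalGeneration, as in the snippets above.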

Here's a working example for this, including batching:

from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig

tok = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration(BartConfig())

input_batch = ["My dog is <mask></s>", "It loves to play in the <mask></s>"]
decoder_input_batch = ["<s>My dog is cute", "<s>It loves to play in the park"]
labels_batch = ["My dog is cute</s>", "It loves to play in the park</s>"]

input_ids = tok.batch_encode_plus(input_batch, add_special_tokens=False, return_tensors="pt", padding=True).input_ids
decoder_input_ids = tok.batch_encode_plus(decoder_input_batch, add_special_tokens=False, return_tensors="pt", padding=True).input_ids
labels = tok.batch_encode_plus(labels_batch, add_special_tokens=False, return_tensors="pt", padding=True).input_ids

loss = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids, labels=labels)[0]

>>> tensor(10.9981, device='cuda:0', grad_fn=<NllLossBackward>)
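To actually update the randomly initialized weights, the loss above has to be fed to an optimizer. A minimal sketch of such a loop; the tiny BartConfig sizes and the random token ids are placeholders so the sketch runs in seconds, not recommended values:

```python
import torch
from transformers import BartConfig, BartForConditionalGeneration

# deliberately tiny config; real pretraining uses far larger values
config = BartConfig(
    vocab_size=128, d_model=32, encoder_layers=1, decoder_layers=1,
    encoder_attention_heads=2, decoder_attention_heads=2,
    encoder_ffn_dim=64, decoder_ffn_dim=64, max_position_embeddings=64,
)
model = BartForConditionalGeneration(config)  # randomly initialized, i.e. from scratch
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# random stand-ins for the tokenized (corrupted input, target) pairs built above
input_ids = torch.randint(4, config.vocab_size, (2, 8))
labels = torch.randint(4, config.vocab_size, (2, 8))

model.train()
for step in range(3):
    # when decoder_input_ids is omitted, the model shifts `labels` right itself
    loss = model(input_ids=input_ids, labels=labels).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a real run you would iterate over batches from your corpus instead of a fixed random batch, and add the usual learning-rate scheduling and checkpointing.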


input_batch = ["My dog is <mask></s>", "It loves to play in the <mask></s>"]
decoder_input_batch = ["<s>My dog is cute", "<s>It loves to play in the park"]
labels_batch = ["My dog is cute</s>", "It loves to play in the park</s>"]

If I have a text document with one paragraph per line, how should I rewrite the data input for it? Thanks!
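One simple way (a sketch, not the paper's exact noising scheme) is to read the file line by line and build the same (input, decoder input, labels) string triples used above, replacing a random word span with <mask> as a rough stand-in for BART's text-infilling corruption. The file name below is hypothetical:

```python
import random

def make_denoising_example(line, rng=random):
    """Build the (input, decoder_input, labels) string triple used above
    from one clean line, masking a random span of 1-3 words."""
    words = line.split()
    start = rng.randrange(len(words))
    end = min(len(words), start + rng.randint(1, 3))
    corrupted = words[:start] + ["<mask>"] + words[end:]
    input_string = " ".join(corrupted) + "</s>"
    decoder_input_string = "<s>" + line
    labels_string = line + "</s>"
    return input_string, decoder_input_string, labels_string

# in practice: lines = [l.strip() for l in open("corpus.txt") if l.strip()]
lines = ["My dog is cute", "It loves to play in the park"]
batch = [make_denoising_example(l) for l in lines]
input_batch, decoder_input_batch, labels_batch = map(list, zip(*batch))
```

The three resulting batches can then be fed to tok.batch_encode_plus exactly as in the batching example above.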

