Fairseq: Base-size pre-trained models

Created on 27 Jan 2020 · 5 comments · Source: pytorch/fairseq

โ“ Questions and Help

What is your question?

1) Does BART offer base-size (6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models? Since the baseline BERTSUMABS in the summarization task is trained on bert-base (12-layer encoder, 6-layer decoder, both with hidden size 768), have you ever compared a base-size BART with it?

2) Could you please provide a README file for XSum (similar to the CNN one)?

3) How long does XSum fine-tuning take on smaller GPUs (e.g. 4 11GB GPUs)?

@myleott @yinhanliu @ngoyal2707

question

All 5 comments

  1. Our base model is trained on Wikipedia + BookCorpus only.
  2. Will do.
  3. We use 16 32GB GPUs for 1 hour (30K steps), so in your case it should take around 8 hours (see the back-of-the-envelope sketch below).
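
A rough way to adapt the reference setup to a smaller machine is to scale fairseq's --update-freq (gradient accumulation) so the effective number of tokens per optimizer step stays roughly constant. The sketch below is only a back-of-the-envelope calculation; the per-GPU token budgets are illustrative assumptions, not official values.

```python
# Hypothetical helper: estimate fairseq's --update-freq (gradient accumulation)
# so the effective batch size (tokens per optimizer step) roughly matches the
# reference setup. The per-GPU token budgets are illustrative assumptions.
import math


def update_freq(ref_gpus, ref_tokens_per_gpu, my_gpus, my_tokens_per_gpu):
    """Gradient-accumulation factor that matches the reference effective batch."""
    ref_effective = ref_gpus * ref_tokens_per_gpu
    my_per_step = my_gpus * my_tokens_per_gpu
    return math.ceil(ref_effective / my_per_step)


# Reference: 16 32GB GPUs, assuming ~2048 max tokens per GPU.
# Target: 4 11GB GPUs, assuming only ~1024 max tokens per GPU fit in memory.
print(update_freq(16, 2048, 4, 1024))  # -> 8, i.e. pass --update-freq 8
```

With 4x fewer GPUs and roughly half the per-GPU batch, each optimizer step needs about 8x as many forward/backward passes, which lines up with the 8-hour estimate above.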

@XinnuoXu Hi, have you evaluated the bart.large.cnn model? Did you get the same R-2 score on the CNN/DM dataset as published? I used the pre-trained model to fine-tune on CNN/DM, but the ROUGE-2 is 19.19 (R-2 in the published paper is 21.28).
Thank you very much!

@YizhuLiu you need to use the right max-len, min-len, len-penalty, and beam size values.
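
For concreteness, here is a minimal decoding sketch using fairseq's torch.hub interface with the hyperparameters from the CNN/DailyMail evaluation example; `source_docs` is a hypothetical placeholder for the actual test articles, and a GPU is assumed.

```python
# Minimal sketch: load the released bart.large.cnn checkpoint via torch.hub and
# decode with the CNN/DailyMail evaluation hyperparameters. `source_docs` is a
# hypothetical placeholder for the actual test articles.
import torch

bart = torch.hub.load('pytorch/fairseq', 'bart.large.cnn')
bart.cuda()
bart.eval()
bart.half()

source_docs = [
    "Replace this with a full CNN/DailyMail test article ...",
]

with torch.no_grad():
    hypotheses = bart.sample(
        source_docs,
        beam=4,                  # beam size
        lenpen=2.0,              # length penalty
        max_len_b=140,           # maximum output length
        min_len=55,              # minimum output length
        no_repeat_ngram_size=3,  # also used in the published CNN/DM example
    )

for summary in hypotheses:
    print(summary)
```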

@yinhanliu Thank you for your reply. We set these values as shown in "Evaluating the bart.large.cnn model": beam=4, lenpen=2.0, max_len_b=140, min_len=55. With this setting, the R-2 score is 20.03. Are they right? If not, how can I get the same R-2 score on CNN/DM as published?

Will the BART base-size (6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models be released? I would like to play with them, and it is hard for me to fine-tune the large model.
