Transformers: Distil-BART?

Created on 29 Mar 2020 · 18 comments · Source: huggingface/transformers

Distillation seq2seq

Most helpful comment

Yes, it would be great! Any updates?

All 18 comments

Interesting idea! What do you think @thomwolf ?

Hi, any update on this, even partial code?

I'm gonna take a crack at it this weekend, hopefully, starting from the distilbert example and modifying it. I'll post a branch if I make meaningful progress.

Hi, just checking in to see if there's a branch already (couldn't find it). Thanks!

Yes, it would be great! Any updates?

I'm gonna wait until the code is stable/reusable to release it, sorry for the change of plans.

As per https://twitter.com/sam_shleifer/status/1276160367853547522, it looks like distilBART has been released :)
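
For anyone who wants to try it, here is a minimal usage sketch (assuming a recent `transformers` install; the generation lengths are illustrative, not the checkpoint's defaults):

```python
from transformers import pipeline

# Load the distilled BART summarization checkpoint from the model hub.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

article = "Long article text goes here ..."
print(summarizer(article, max_length=142, min_length=56)[0]["summary_text"])
```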

Loading the tokenizer by name sshleifer/distilbart-cnn-12-6 (https://huggingface.co/sshleifer/distilbart-cnn-12-6) leads to an error; it works with the facebook/bart-large-cnn tokenizer.
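
A sketch of that workaround, assuming the distilbart checkpoint reuses the facebook/bart-large-cnn vocabulary (as stated later in this thread):

```python
from transformers import BartTokenizer, BartForConditionalGeneration

# Workaround: load the tokenizer from the BART checkpoint the distilled
# model was derived from, and pair it with the distilled weights.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("sshleifer/distilbart-cnn-12-6")

inputs = tokenizer("Long article text ...", return_tensors="pt", truncation=True)
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=142)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```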

I've faced the same issue with sshleifer/distilbart-cnn-12-6



I can't reproduce the error on master. If somebody can, it would be great if they could make a separate issue and I will try to resolve.

All the distilbart-* tokenizers are identical to the facebook/bart-large-cnn tokenizer, which is identical to the facebook/bart-large-xsum tokenizer. @julien-c is there a fancy AWS way to synchronize/symlink them?
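
If you want to check that claim yourself, a quick sketch comparing the vocabularies (this assumes both tokenizers load, which the report above says may fail on older versions):

```python
from transformers import AutoTokenizer

a = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
b = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")

# Identical tokenizers should agree on the vocab and on any encoding.
assert a.get_vocab() == b.get_vocab()
assert a.encode("DistilBART test sentence.") == b.encode("DistilBART test sentence.")
print("tokenizers match")
```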

I've tried several models and I'm getting the same error each time it creates a tokenizer. (screenshot of the error attached as image.png)



@vladislavkoz Please make a new issue with instructions to reproduce, following the issue template. Feel free to assign me.

Here is an issue https://github.com/huggingface/transformers/issues/5286



I didn't assign you. I just read the message too late.



@sshleifer ATM you need to duplicate the tokenizer files in each model if you want them to be loadable by the model hub, the inference API, etc.
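
A minimal sketch of that duplication, assuming the checkpoints live in local directories before upload (the directory names here are hypothetical):

```python
from transformers import BartTokenizer

# Fetch the canonical tokenizer once, then write its files (vocab.json,
# merges.txt, tokenizer config) into each local checkpoint directory
# before uploading, so every model repo is self-contained.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
for model_dir in ("distilbart-cnn-12-6", "distilbart-xsum-12-1"):  # hypothetical local paths
    tokenizer.save_pretrained(model_dir)
```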

I was able to create a tokenizer only with 'distilbart-xsum-12-1' and 'distilbart-xsum-9-6'. Then on the summarization step I'm getting another error. I've added a comment here: https://github.com/huggingface/transformers/issues/5286



Hey @sshleifer, thanks for the distilled BART version. I was able to fine-tune it on the BillSum dataset with the same script as T5, but the numbers are way different between the two. I just wanted to understand if I might be doing something wrong with regard to fine-tuning distilBART: does it require student training every time?
Reference numbers on the BillSum dataset:

T5-base:
avg_train_loss = tensor(1.5333, device='cuda:0')
avg_val_loss = tensor(1.4528, device='cuda:0')
epoch = 1
loss = tensor(1.6734, device='cuda:0')
rouge1 = 0.49188267841912325
rouge2 = 0.26436589848185027
rougeL = 0.3591894400892483
train_loss = tensor(1.6734, device='cuda:0')
val_loss = tensor(1.4528, device='cuda:0')

dBART-cnn-12-6:
avg_train_loss = tensor(1.3013, device='cuda:0')
avg_val_loss = tensor(1.4013, device='cuda:0')
epoch = 1
loss = tensor(1.4901, device='cuda:0')
rouge1 = 0.3681518923769047
rouge2 = 0.15683286277623087
rougeL = 0.23453727441540043
train_loss = tensor(1.4901, device='cuda:0')
val_loss = tensor(1.4013, device='cuda:0')

P.S. I am using a modified version of the older finetune.py, so it doesn't compute ROUGE during validation epochs.

Thanks
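
For reference, a minimal sketch of how ROUGE numbers like those above can be computed offline from generated summaries, assuming the `rouge_score` package (the summary pairs here are hypothetical):

```python
from rouge_score import rouge_scorer

# Hypothetical generated/reference summary pairs; in practice these come
# from running model.generate() over the validation set.
predictions = ["the bill increases funding for public schools"]
references = ["the bill raises school funding"]

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = [scorer.score(ref, pred) for ref, pred in zip(references, predictions)]
for key in ("rouge1", "rouge2", "rougeL"):
    print(key, sum(s[key].fmeasure for s in scores) / len(scores))
```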

@amanpreet692 I moved your issue here and will reply there.
Others, I am closing this since the model is released and I don't want to spam everyone. This shouldn't discourage people from making new issues!
