Transformers: Seq2Seq model with HuggingFace

Created on 12 Oct 2019 · 25 Comments · Source: huggingface/transformers

Hi,
I am looking for a Seq2Seq model based on the HuggingFace BERT model. I know fairseq has some implementations, but to me they are generally not very clean or easy to use. I am looking for a good implementation based on HuggingFace's work. Thanks a lot for your help.

All 25 comments

Hey @juliahane, glad you’re asking: I am currently working on this (see PR #1455). Stay tuned! Closing this as it is not an issue per se.

Hi Rémi,
Thanks a lot for the great work. Since I need it for a deadline approaching very soon, I would really appreciate knowing approximately when it will be possible to use. Thanks a lot again for your efforts.
Best regards,
Julia


Hi, probably not in time for your deadline. We are expecting a first working version in a few weeks.

Hi Thomas,
I really need to get this code working for a deadline. I would really appreciate it if you could point me to any existing implementations you may be aware of that I could use for now. Thank you so much for your help.

@thomwolf, I see you have a run_lm_finetuning.py script; can I use this script for a seq2seq generation task? Does it work for this purpose? Thanks.

Hi @juliahane, no, you cannot use run_lm_finetuning for seq2seq generation.

If you cannot wait, I think this repo is a good place to start. It's based on our library and specifically targets seq2seq for summarization: https://github.com/nlpyang/PreSumm

Let's keep this issue open to gather all threads asking about seq2seq in the repo.


You can have a look at PR #1455. What you're looking for is in the modeling_seq2seq.py and run_seq2seq_finetuning.py scripts. This only works for Bert at the moment.

Hi,
Thanks a lot for the response. I cannot see the files; I would really appreciate it if you could share them with me. Thanks.


BERT is sufficient for me. I would really appreciate it if you could share the files and tell me the commands to run them. Thanks.


Hi Rémi,
I would really appreciate it if you could provide me with the command to get this pull request into my installed huggingface library. Thanks.
Best,
Julia


git checkout --track origin/conditional-generation

This should work if you cloned the original repository.

However, I am afraid we cannot provide support for work that has not made its way into the library yet, as the interface is very likely to change.

Hi Rémi,
I was trying to run the BERT seq2seq code and it gives a lot of errors. I would really appreciate it if you could run it and make sure the BERT one works. Thanks a lot.


Hi Rémi,
Sure, I understand you cannot provide support for ongoing work. I have a deadline anyway and will need to use it. Could you please just tell me how well this code is tested? Does it work for BERT? From what I saw, the code has several bugs in the optimizer part and does not run. I would really appreciate it if you could just tell me how well this code is tested.
Thanks


Hi Rémi,
I made this work. Could you please tell me how I can get the generated sequence from the decoder?
Thanks

Hi Thomas,
Rémi was saying in PR #1455 that the BERT seq2seq is ready. Could you please move in a gradual way? That is, merge the code for BERT already so people can use the BERT one, which would already be great, and then add the other encoders later once they are ready. I would really appreciate having the BERT one added. Thanks.

#1455 was merged and it is now possible to define and train encoder-decoder models. Only Bert is supported at the moment.
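To make this concrete, here is a minimal sketch of defining and training such a model. It uses the EncoderDecoderModel class and call style from later transformers releases rather than the classes introduced in PR #1455, so the exact names and the automatic label shifting are assumptions about a reasonably recent installation, not the PR's own interface:

```python
# Minimal sketch (assumes a recent transformers release with EncoderDecoderModel;
# PR #1455 exposed the same idea under different class names).
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Tie a BERT encoder to a BERT decoder (causal self-attention + cross-attention).
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
# The decoder needs to know how to start and pad shifted/generated sequences.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

source = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
target = tokenizer("A fox jumps over a dog.", return_tensors="pt")

# Passing labels makes the model compute the decoder cross-entropy loss;
# padded label positions should be set to -100 so they are ignored (no padding here).
outputs = model(
    input_ids=source["input_ids"],
    attention_mask=source["attention_mask"],
    labels=target["input_ids"],
)
outputs.loss.backward()  # plug in your optimizer of choice from here
```

Both encoder and decoder here start from the same pretrained BERT checkpoint; the decoder's cross-attention weights are randomly initialized, which is why fine-tuning is needed before generation is useful.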

Hi Rémi and Thomas,
Thank you so much for the great help, this is awesome. I really appreciate your hard work.
Best regards,
Julia


Hi,
I was wondering if you could explain how the decoder part works. I see that a BERT with a masked language model head is used as the decoder. As I understand it, the masked language model head masks some tokens and predicts those specific masked tokens, so I am not sure how this works as a generation module. Thanks for clarifying.
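As a rough illustration of the mechanism (not the PR's actual code): when BERT is configured as a decoder, its self-attention becomes causal, so each position only attends to earlier positions and the language-model head predicts the next token rather than filling in a masked one; in the full encoder-decoder model the decoder additionally cross-attends to the encoder's output. A minimal sketch using the BertLMHeadModel class from later transformers releases (the class name is an assumption about your installed version):

```python
# Sketch of greedy, token-by-token decoding with BERT as a causal decoder.
# Without fine-tuning, the output will be poor: pretrained BERT was not
# trained left-to-right. This only illustrates the mechanism.
import torch
from transformers import BertConfig, BertLMHeadModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
config = BertConfig.from_pretrained("bert-base-uncased", is_decoder=True)
model = BertLMHeadModel.from_pretrained("bert-base-uncased", config=config)
model.eval()

# Feed everything generated so far, take the logits at the last position,
# append the argmax, repeat.
generated = tokenizer.encode("the movie was", return_tensors="pt", add_special_tokens=False)
for _ in range(5):
    logits = model(input_ids=generated).logits          # (1, seq_len, vocab_size)
    next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
    generated = torch.cat([generated, next_id], dim=-1)

print(tokenizer.decode(generated[0]))
```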



Hi Rémi, I posted some bugs/suggestions about this code at #1674, thanks.

Hi,
When I run this code I get this error. Thanks for your help.

File "/user/julia/dev/temp/transformers/examples/utils_summarization.py", line 143, in encode_for_summarization
    for line in story_lines
File "/user/julia/dev/temp/transformers/examples/utils_summarization.py", line 143, in
    for line in story_lines
AttributeError: 'BertTokenizer' object has no attribute 'add_special_tokens_single_sequence'


Hi,
Can you please also add a way to see the generated sequences? Thanks.
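In later versions of the library, the decoded sequences come from model.generate on the encoder-decoder model. A minimal sketch, assuming the EncoderDecoderModel API as in the earlier example (and, in practice, a fine-tuned checkpoint rather than raw BERT weights):

```python
# Sketch: decoding with a BERT encoder-decoder via generate() from later
# transformers releases; not the interface of the PR under discussion.
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"  # replace with your fine-tuned checkpoint
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer("Text to summarize goes here.", return_tensors="pt")
summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=32,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```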


If both your source and target belong to the same language (summarization etc.):

Well, with a next-word-prediction language model like GPT2, you can just create a dataset like "source [SEP] target" and then run the LM (run_lm_finetuning.py) on it. At test time, you provide "source [SEP]" as your prompt and you will get "target" as your prediction.

One small thing you can do is mask your source tokens in the loss computation, because you don't want to predict the source tokens as well! This will give you better performance and results.

This is not much different from Seq2Seq, I believe. You are just sharing the same parameters for source and target.
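A minimal sketch of that recipe, with the source tokens masked out of the loss. The [SEP] handling here is an illustrative choice (GPT2 has no [SEP] token by default), not something run_lm_finetuning.py does for you:

```python
# Sketch of the "source [SEP] target" LM fine-tuning idea described above.
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.add_special_tokens({"sep_token": "[SEP]"})  # GPT2 has no [SEP] by default
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.resize_token_embeddings(len(tokenizer))

source = "long article text goes here"
target = "short summary goes here"

src_ids = tokenizer.encode(source + " [SEP]")
tgt_ids = tokenizer.encode(" " + target)
input_ids = torch.tensor([src_ids + tgt_ids])

# Mask the source portion in the labels so the loss only covers the target
# tokens: -100 is the ignore_index of the cross-entropy loss.
labels = torch.tensor([[-100] * len(src_ids) + tgt_ids])

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()

# At test time: feed "source [SEP]" as the prompt and let the model continue.
prompt = torch.tensor([src_ids])
generated = model.generate(
    prompt,
    max_length=prompt.shape[1] + 30,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(generated[0][prompt.shape[1]:]))
```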


Could you tell me how to get the two files modeling_seq2seq.py and run_seq2seq_finetuning.py, so I could fine-tune a seq2seq model with a pretrained encoder model like BERT?

Any news about a seq2seq training script using transformers?
