Addons: Request for examples: Seq2seq

Created on 6 Jul 2019 · 16 Comments · Source: tensorflow/addons

I believe many seq2seq examples have already been published, but for the sake of user experience it makes sense to have at least one or two in our examples section.

Related:
https://google.github.io/seq2seq/
https://github.com/tensorflow/addons/blob/master/examples/README.md

Labels: good first issue, help wanted, seq2seq, tutorials

All 16 comments

I can pick this up if it is ok.

Some ideas to start the example:

  • For machine translation:

    • Use the toy English-to-German dataset from http://opennmt.net/OpenNMT-tf/quickstart.html, which contains 10k tokenized sentences (I think German-to-English gives a better user experience, since the outputs are easier to demonstrate).

    • A trivial seq2seq LSTM architecture with an attention mechanism

    • Use beam decoding to generate outputs

    • Maybe BLEU calculation
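
To make the beam decoding idea in the list above concrete, here is a minimal, library-independent sketch in plain Python. It is not the tfa.seq2seq API and not the code of the proposed example; the names `TOY_MODEL`, `next_token_probs` and `beam_search` are hypothetical, and the toy probability table stands in for a trained decoder.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens so far, sum of log-probabilities)

# Hypothetical toy "decoder": next-token probabilities given only the last token.
# A real seq2seq model would condition on the source sentence and the full prefix.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.4, "</s>": 0.1},
    "a": {"cat": 0.4, "dog": 0.5, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def next_token_probs(prefix: List[str]) -> Dict[str, float]:
    """Return the (non-zero) next-token probabilities for a prefix."""
    return TOY_MODEL[prefix[-1]]


def beam_search(
    step_fn: Callable[[List[str]], Dict[str, float]],
    beam_width: int = 2,
    max_len: int = 5,
    start: str = "<s>",
    end: str = "</s>",
) -> List[Hypothesis]:
    """Keep the `beam_width` best partial hypotheses at every decoding step."""
    beams: List[Hypothesis] = [([start], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for tok, prob in step_fn(tokens).items():
                candidates.append((tokens + [tok], score + math.log(prob)))
        # Prune to the best `beam_width` candidates by log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Hypotheses still unfinished at max_len are returned as well.
    finished.extend(beams)
    return sorted(finished, key=lambda c: c[1], reverse=True)


best_tokens, best_logp = beam_search(next_token_probs, beam_width=2)[0]
print(best_tokens, math.exp(best_logp))
```

With `beam_width=1` this reduces to greedy decoding. The sketch applies no length normalization, which practical decoders usually add so that longer outputs are not penalized.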

That'd be great! Thanks and welcome to Addons!

Hi @seanpmorgan @kazemnejad, is this still an _Open Issue_?
Asking because I am interested in working on it.

Hi @PyExtreme, thank you for your interest.
Yes, actually I'm currently working on it. I was waiting for #375 to be fixed, and it was fixed 3 days ago in #503. I think I can submit a PR in the next few days, so you can work on that PR if you want.

+1, any progress on this?
I'd appreciate an example of how to create and train a Keras decoder with attention. I can't figure out how I am supposed to set up AttentionWrapper in a model without yet knowing the memory tensor.
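
As background for the memory question above, the following plain-Python sketch shows what a dot-product (Luong-style) attention step computes. It is a conceptual illustration only, not the AttentionWrapper API; `softmax` and `dot_attention` are hypothetical helper names. The point it illustrates is that the memory (the encoder outputs) is just an input to the computation, so it only needs to exist when the attention step is evaluated, not when the layer objects are defined.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(
    query: Sequence[float],
    memory: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Dot-product attention for a single decoder step.

    query:  decoder state, shape [dim]
    memory: encoder outputs, shape [time, dim]
    Returns (context vector of shape [dim], attention weights of shape [time]).
    """
    # Score each encoder time step against the decoder state.
    scores = [sum(q * m for q, m in zip(query, mem_t)) for mem_t in memory]
    weights = softmax(scores)
    # The context vector is the attention-weighted average of the memory.
    dim = len(memory[0])
    context = [
        sum(w * mem_t[i] for w, mem_t in zip(weights, memory))
        for i in range(dim)
    ]
    return context, weights


# Two encoder steps; the query is aligned with the first one.
context, weights = dot_attention([10.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(context, weights)
```

In this toy call, almost all of the attention weight falls on the first encoder step because the query points in the same direction as it.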

Hi @Mainak431, sorry for the inconvenience.

Actually, a draft of this example is available in my fork. However, when the default Keras training mode (graph mode) is combined with tf.data.dataset (where the default mode is eager), there is a bug related to caching of a tensor dimension, which I guess comes from the AttentionWrapper. Unfortunately, I'm busy with university work at the moment and cannot work on that bug, so I would be happy if someone could find its source and help move this forward.

Thanks for sharing, @kazemnejad! Where do you get the bug? At some point I was getting an error from the RNN cell wrapped by AttentionWrapper, saying it was being passed a rank 1 tensor when it expected a rank 2 tensor.

Thanks @kazemnejad for submitting the fix. Based on his contribution, I have successfully gotten the seq2seq code running for a semantic parsing task.

For anyone who is interested, the notebook is here

+1 Any progress on this?

I'd really appreciate it if someone could provide an example of how to build a seq2seq NMT model with attention and beam search wrappers. I couldn't find any examples.

cc @Om-Pandey to see if this is an issue you would like to be assigned.

Yeah sure.. I would love to solve this.

Great, let us know if you have any questions about adding a tutorial!

@Om-Pandey I am glad that you accepted this. I will be more than happy to give more information about the bugs I was referring to. Please feel free to contact me if needed.

@kazemnejad thank you so much... help is much needed 😅. @seanpmorgan, based on earlier conversations in this thread, I just wanted to clarify something: wouldn't it be better to include a trivial seq2seq NMT tutorial and provide separate methods, with the necessary explanation, for attention modeling and beam/lexicon search that the reader can include if needed, rather than hard-coding them into a single workflow?

Yes, we're more than willing to have several different seq2seq tutorials.

Hey @seanpmorgan, please check #806 and merge. Thanks!
