I want to get a word-alignment list from an NMT system, but I don't know how to do this. Can this framework do that?
See the hello_t2t IPython notebook on Google Colab. In the "Display Attention" section, choose "Input - Output" instead of "All" from the menu.
This will visualize the encoder-decoder attention, which is the closest thing to word alignment you can get out of the box (as far as I know).
As you can see, this is very different from the word alignment you expected:
Of course, you could train a new T2T model the "old NMT way": restrict the number of enc-dec attention heads to 1, use words (with UNK) instead of subwords, and so on. You would get much worse MT quality, but more interpretable enc-dec attention.
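If you do manage to dump the enc-dec attention weights, a common heuristic is to average over heads and link each target token to the source position with the highest weight. The following is only a rough sketch of that heuristic in plain Python: `average_heads`, `hard_alignment` and the toy `heads` matrices are hypothetical names and made-up numbers for illustration, not part of the T2T API, and extracting the weights from a trained model is not shown.

```python
def average_heads(heads):
    """Average per-head attention matrices, each indexed as [target][source]."""
    n_heads = len(heads)
    n_target = len(heads[0])
    n_source = len(heads[0][0])
    return [
        [sum(h[t][s] for h in heads) / n_heads for s in range(n_source)]
        for t in range(n_target)
    ]


def hard_alignment(attn):
    """Link each target position to the source position with the largest weight.

    Returns a list of (source_index, target_index) pairs.
    """
    return [
        (max(range(len(row)), key=row.__getitem__), t)
        for t, row in enumerate(attn)
    ]


# Toy weights (assumed, not real model output): 2 heads, 3 target x 3 source tokens.
heads = [
    [[0.2, 0.7, 0.1], [0.8, 0.1, 0.1], [0.2, 0.2, 0.6]],
    [[0.4, 0.5, 0.1], [0.4, 0.3, 0.3], [0.1, 0.1, 0.8]],
]

alignment = hard_alignment(average_heads(heads))
print(alignment)
```

With subword models and several heads, expect the resulting links to be noisy. You would also need to merge subword positions back into words before comparing against a conventional word alignment.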
Using the same visualizer and the IPython notebook, I get really bizarre attention patterns.
What could be the issue?
