To run the pretrained models here, would we need to use the same dictionary and BPE codes that were used to train them? Or does it not matter? If it matters, can you provide the dictionaries?
The German model has a prepare script, so that may generate the same dictionary, but others (such as the Chinese one) don't have one, which makes it hard to reproduce the same dictionary.
That paper used the standard preprocessed datasets provided by fairseq. You can follow the instructions to generate them: https://github.com/pytorch/fairseq/blob/master/examples/translation/README.md
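To spell out the answer to the original question: yes, the pretrained models only work with the exact dictionary and BPE codes they were trained with, so evaluation data has to be segmented with those codes before binarization. Here is a minimal sketch of applying released codes with subword-nmt in Python; the file name `bpecodes` is an assumption about what ships in the model archive.

```python
# Minimal sketch: apply released BPE codes to tokenized input with subword-nmt.
# 'bpecodes' is an assumed file name; adjust to whatever the archive contains.
from subword_nmt.apply_bpe import BPE

with open('bpecodes', encoding='utf-8') as codes_file:
    bpe = BPE(codes_file)

# Input must already be tokenized the same way as the training data
# (e.g. with the Moses tokenizer for the WMT models).
tokenized = 'das ist ein Test .'
print(bpe.process_line(tokenized))  # e.g. 'das ist ein Te@@ st .'
```

The binarized dataset also has to be built against the released dictionary files (fairseq-preprocess accepts them via `--srcdict`/`--tgtdict`); otherwise the token indices won't line up with the model's embeddings.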
You're right that the Zh-En scripts are missing.
@myleott We need the (missing) BPE codes **together with** the Zh-En scripts to make use of the provided pre-trained model.
Yep, I can add them later today, thanks for pointing this out.
The BPE codes are now available in a new set of archives with a .tar.gz extension. I've also updated the README with a bunch of additional usage instructions via torch.hub: https://github.com/pytorch/fairseq/tree/master/examples/pay_less_attention_paper#example-usage-torchhub
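For anyone landing here later, here is a hedged sketch of what the torch.hub path looks like. The checkpoint name `lightconv.glu.wmt17.zh-en` is my assumption of one of the pay-less-attention models, so check `torch.hub.list('pytorch/fairseq')` for the exact names.

```python
# Sketch of loading one of the pretrained models via torch.hub.
# The checkpoint name below is an assumption; list the real names first.
import torch

print(torch.hub.list('pytorch/fairseq'))  # shows the available pretrained models

# torch.hub downloads the checkpoint together with its dictionary and BPE codes,
# so no manual preprocessing is needed on this path.
zh2en = torch.hub.load(
    'pytorch/fairseq',
    'lightconv.glu.wmt17.zh-en',   # hypothetical name, pick one from the list above
    tokenizer='moses',
    bpe='subword_nmt',
)
print(zh2en.translate('你好 世界'))
```

The `translate()` call applies tokenization and BPE with the bundled codes internally, which is why the torch.hub route sidesteps the dictionary question from earlier in the thread.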