Fairseq: Implementation of Self-Attention with Relative Position Representations

Created on 5 Mar 2019 · 4 comments · Source: pytorch/fairseq

Could you please implement Self-Attention with Relative Position Representations?

It was done in tensor2tensor.

Relative position representations outperform the original Transformer by about 1 BLEU.

Thanks

All 4 comments

This is currently not on our roadmap. We welcome any contributions by pull request! Note that we have some models trained in fairseq that have quite strong BLEU results. For example, this paper: https://openreview.net/pdf?id=SkVhlh09tX

models linked here: https://github.com/pytorch/fairseq/tree/master/examples/pay_less_attention_paper

FWIW, we actually experimented internally with relative positional embeddings in fairseq and found them to be only marginally better and quite a bit slower, so we never pushed it public. cc @alexeib: any interest in pushing your branch public?

Thanks for your reply :)
I implemented relative positional embeddings roughly mimicking what T2T does, but got no improvement. In T2T itself they performed better by about 0.5 BLEU on my own dataset. Training was slower too, but I think that's expected given the additional computation.
I think relative positional embeddings are an add-on to the Transformer rather than a new architecture: they only change the dot-product function to add an extra embedding term, and drop the absolute positional embeddings. Maybe we could add a flag to control whether to use them?
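To make the "changes the dot product to add an extra embedding" point concrete, here is a minimal single-head NumPy sketch of the mechanism from Shaw et al. (2018) as used in T2T. It is illustrative only, not the fairseq or T2T code; all names (`relative_self_attention`, `rel_k`, `rel_v`, `max_dist`) are mine, and a real implementation would reshape the relative logits for efficiency instead of materializing the `(n, n, d)` tensors.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relative_self_attention(x, wq, wk, wv, rel_k, rel_v, max_dist):
    """Single-head self-attention with relative position representations.

    x: (n, d) inputs; wq/wk/wv: (d, d) projections.
    rel_k / rel_v: (2*max_dist + 1, d) learned relative embeddings
    shared across positions, indexed by clipped distance j - i.
    """
    n, d = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv
    # Clipped relative distance j - i, shifted to indices 0 .. 2*max_dist.
    idx = np.clip(np.arange(n)[None, :] - np.arange(n)[:, None],
                  -max_dist, max_dist) + max_dist
    ak = rel_k[idx]  # (n, n, d): a^K_{ij}
    av = rel_v[idx]  # (n, n, d): a^V_{ij}
    # e_ij = q_i . (k_j + a^K_ij) / sqrt(d): the usual dot product
    # plus one extra relative-embedding term.
    scores = (q @ k.T + np.einsum('id,ijd->ij', q, ak)) / np.sqrt(d)
    alpha = softmax(scores, axis=-1)
    # z_i = sum_j alpha_ij (v_j + a^V_ij)
    return alpha @ v + np.einsum('ij,ijd->id', alpha, av)
```

With `rel_k` and `rel_v` set to zero this reduces exactly to standard scaled dot-product self-attention, which is why it can sit behind a flag rather than being a separate architecture.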

@myleott @alexeib @gxzks could you share the implementations of the relative positional embeddings? They may be useful for other scenarios, even if they have not been thoroughly tested against the latest fairseq version.
