System information
Describe the feature and the current behavior/state.
This paper introduces a structured memory which can be easily integrated into a neural network. The memory is very large by design and therefore significantly increases the capacity of the architecture, by up to a billion parameters with a negligible computational overhead. Its design and access pattern is based on product keys, which enable fast and exact nearest neighbor search. The ability to increase the number of parameters while keeping the same computational budget lets the overall system strike a better trade-off between prediction accuracy and computation efficiency both at training and test time. This memory layer allows us to tackle very large scale language modeling tasks. In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture. In particular, we found that a memory augmented model with only 12 layers outperforms a baseline transformer model with 24 layers, while being twice faster at inference time. We release our code for reproducibility purposes.
https://arxiv.org/pdf/1907.05242v1.pdf
https://github.com/facebookresearch/XLM/blob/master/src/model/memory/memory.py
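For anyone picking this up, here is a rough sketch of the product-key lookup the abstract refers to, as I understand it. It is plain Python rather than TensorFlow, it is not a port of the linked `memory.py`, and it leaves out everything else in the layer (the value table and the learned query network). All function names (`product_key_search`, `brute_force_search`, `random_vectors`) and the toy sizes are my own assumptions for illustration. The idea: split the query in two halves, score each half against its own small set of sub-keys, and combine only the best `k` from each side. Because a product key's score is the sum of its two half scores, the `k * k` candidates always contain the exact top-`k`.

```python
import heapq
import itertools
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(scores, k):
    """Return the k largest (score, index) pairs, best first."""
    return heapq.nlargest(k, ((s, i) for i, s in enumerate(scores)))


def product_key_search(query, sub_keys_1, sub_keys_2, k):
    """Top-k over all n1 * n2 product keys using only n1 + n2 + k * k scores.

    Product key (i, j) is the concatenation of sub_keys_1[i] and
    sub_keys_2[j], so its score splits into one term per query half.
    """
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]
    best_1 = top_k([dot(q1, c) for c in sub_keys_1], k)
    best_2 = top_k([dot(q2, c) for c in sub_keys_2], k)
    # Any product key in the overall top-k must use a top-k sub-key in each
    # half, so these k * k combinations are guaranteed to contain the answer.
    candidates = [
        (s1 + s2, (i, j))
        for (s1, i), (s2, j) in itertools.product(best_1, best_2)
    ]
    return heapq.nlargest(k, candidates)


def brute_force_search(query, sub_keys_1, sub_keys_2, k):
    """Reference: score every one of the n1 * n2 product keys explicitly."""
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]
    scores = [
        (dot(q1, c1) + dot(q2, c2), (i, j))
        for i, c1 in enumerate(sub_keys_1)
        for j, c2 in enumerate(sub_keys_2)
    ]
    return heapq.nlargest(k, scores)


def random_vectors(rng, count, dim):
    """Hypothetical helper: `count` Gaussian vectors of length `dim`."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n, k = 8, 32, 4  # toy sizes, chosen only for illustration
    keys_1 = random_vectors(rng, n, dim // 2)
    keys_2 = random_vectors(rng, n, dim // 2)
    query = random_vectors(rng, 1, dim)[0]
    fast = product_key_search(query, keys_1, keys_2, k)
    slow = brute_force_search(query, keys_1, keys_2, k)
    # Compare the selected (i, j) index pairs from both methods.
    print([idx for _, idx in fast] == [idx for _, idx in slow])
```

With `n` sub-keys per half, the fast path scores `2 * n + k * k` items instead of `n * n`, which is where the "very large memory, small overhead" claim comes from. A real TFA layer would presumably do the same thing with batched tensor ops instead of Python lists.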
Will this change the current api? How?
Yes, it would add a new layer that gives the model a lot of memory.
Who will benefit with this feature?
People who use TFA with the Keras API.
Any Other info.
i like pie
Hi @bionicles, this looks like a really interesting paper/concept. It seems like it could be a fit as a TFA Layer.
You mentioned you may be interested in contributing. What would that depend on?
Just time, and my ability to understand the paper/code...
This one also looks good: https://arxiv.org/pdf/1907.09720v1.pdf
I'm definitely interested in contributing to TFA! Maybe some simpler stuff we already have working would be better in the short term.
I would like to try understanding the paper and implementing this layer, if no one else is working on this issue.
Any update on this?
Sorry, I was busy with some projects and could not finish the work on this. If you are looking to contribute to this, go ahead, it would be really helpful :D. If not, I might try to find some time and get back to implementing it.
Likewise, I can't do this now, but it would be cool!
Hi, @bionicles @sayoojbk @Squadrick I want to take up this issue if it's okay. Thanks
Sure @gaurav-singh1998, you can move forward with this. If you need any help, ping any one of us on Gitter.
Hello @sayoojbk, as I am new to this repository, it may take me some time to get acquainted with the code base and come up with a PR. Is that okay?
Yeah, take your time! If you need any help, ping us on the official Gitter channel for SIG-addons :P