Hey guys!
Inspired by the attention LSTM: at each step I have 3 input vectors, say x_1, x_2, and x_3.
I want to first form the linear combination layer = a_1*x_1 + a_2*x_2 + a_3*x_3,
then merge this layer with some other sequential layers.
The coefficients a_1, a_2, and a_3 should be learned.
How can I do this in Keras?
Thanks!
with love
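To make it concrete, here's a rough sketch of the kind of layer I imagine (just my guess, written against tf.keras, with made-up names and shapes):

```python
from tensorflow import keras

class WeightedSum(keras.layers.Layer):
    """Sketch: outputs a_1*x_1 + a_2*x_2 + a_3*x_3 with trainable a's."""

    def build(self, input_shape):
        # one trainable scalar coefficient per input vector
        self.a = self.add_weight(name="a", shape=(3,),
                                 initializer="ones", trainable=True)

    def call(self, inputs):
        x1, x2, x3 = inputs
        return self.a[0] * x1 + self.a[1] * x2 + self.a[2] * x3

# hypothetical 64-dim input vectors
x1 = keras.Input(shape=(64,))
x2 = keras.Input(shape=(64,))
x3 = keras.Input(shape=(64,))
combined = WeightedSum()([x1, x2, x3])
# ...then merge `combined` with the other sequential layers
```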
Typically attention is done with a single MLP that maps vectors to scores, then uses those scores in a softmax to get a probability distribution over the vectors. Finally, you take the Hadamard (element-wise) product of the probability distribution and the initial vectors, and sum over the sequence dimension (dimension 1).
Here's some code I wrote to do this:
https://gist.github.com/braingineer/27c6f26755794f6544d83dec2dd27bbb
Though, you should definitely read up on attention further. Bahdanau et al. have a great paper on it ("Neural Machine Translation by Jointly Learning to Align and Translate").
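In case it helps, the steps above might look roughly like this as a minimal tf.keras sketch (shapes are made up and this is simplified, not the gist code):

```python
import tensorflow as tf
from tensorflow import keras

inputs = keras.Input(shape=(10, 64))           # (batch, timesteps, features)
scores = keras.layers.Dense(1)(inputs)         # MLP maps each vector to a score
probs = keras.layers.Softmax(axis=1)(scores)   # softmax over the sequence -> distribution
weighted = inputs * probs                      # Hadamard product, broadcast over features
context = tf.reduce_sum(weighted, axis=1)      # sum over the sequence dimension (dim 1)
model = keras.Model(inputs, context)
```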
Have a look at this:
https://github.com/philipperemy/keras-simple-attention-mechanism
It's a very simple "hello world" attention mechanism, but it might address your needs!
@braingineer Thanks for your code. Do you have a toy example of how to use it?
@braingineer does your code include Bahdanau's attention?