Hey guys!
Inspired by the attention LSTM: at each step I have 3 input vectors, say x_1, x_2, and x_3.
I want to first form the linear combination layer = a_1*x_1 + a_2*x_2 + a_3*x_3,
then merge this layer with some other sequential layers.
The coefficients a_1, a_2, and a_3 should be learned.
How can I do this in Keras?
Thanks!
with love
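To make it concrete, here's a rough sketch of the kind of layer I imagine (just my guess, written against tf.keras, with made-up names and shapes):

```python
from tensorflow import keras

class WeightedSum(keras.layers.Layer):
    """Sketch: outputs a_1*x_1 + a_2*x_2 + a_3*x_3 with trainable a's."""

    def build(self, input_shape):
        # one trainable scalar coefficient per input vector
        self.a = self.add_weight(name="a", shape=(3,),
                                 initializer="ones", trainable=True)

    def call(self, inputs):
        x1, x2, x3 = inputs
        return self.a[0] * x1 + self.a[1] * x2 + self.a[2] * x3

# hypothetical 64-dim input vectors
x1 = keras.Input(shape=(64,))
x2 = keras.Input(shape=(64,))
x3 = keras.Input(shape=(64,))
combined = WeightedSum()([x1, x2, x3])
# ...then merge `combined` with the other sequential layers
```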
Typically attention is done with a single MLP that maps vectors to scores, then uses those scores in a softmax to get a probability distribution over the vectors. Finally, you take the Hadamard (element-wise) product of the probability distribution and the initial vectors, and sum over the sequence dimension (dimension 1).
Here's some code I wrote to do this:
https://gist.github.com/braingineer/27c6f26755794f6544d83dec2dd27bbb
Though, you should definitely read up on attention further. Bahdanau et al. have a great paper on it ("Neural Machine Translation by Jointly Learning to Align and Translate").
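In case it helps, the steps above might look roughly like this as a minimal tf.keras sketch (shapes are made up and this is simplified, not the gist code):

```python
import tensorflow as tf
from tensorflow import keras

inputs = keras.Input(shape=(10, 64))           # (batch, timesteps, features)
scores = keras.layers.Dense(1)(inputs)         # MLP maps each vector to a score
probs = keras.layers.Softmax(axis=1)(scores)   # softmax over the sequence -> distribution
weighted = inputs * probs                      # Hadamard product, broadcast over features
context = tf.reduce_sum(weighted, axis=1)      # sum over the sequence dimension (dim 1)
model = keras.Model(inputs, context)
```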
Have a look at this:
https://github.com/philipperemy/keras-simple-attention-mechanism
It's a very simple "hello world" attention mechanism, but it might address your needs!
@braingineer Thanks for your code. Do you have a toy example of how to use it?
@braingineer does your code include Bahdanau's attention?