Keras: Add Fixed Features With Word Embeddings

Created on 4 Apr 2016 · 14 comments · Source: keras-team/keras

I am wondering if it is possible to add fixed features to an RNN with word embeddings in Keras. I have two inputs that share the same word embedding (and this model works), and I would like to see whether I can mix in fixed features that further describe the context of each example. The use case is a query and a returned page title as the word-embedded inputs, with fixed features such as the length of each, various similarity measures between them, etc.

model_query = Sequential()

# creates a matrix that is 30,046 x 400 (number of distinct words x 400)
model_query.add(Embedding(output_dim=dimsize,
                          input_dim=n_symbols,
                          mask_zero=True,
                          weights=[embedding_weights],
                          input_length=input_length))

model_title = Sequential()  # or Graph or whatever
model_title.add(Embedding(output_dim=dimsize,
                          input_dim=n_symbols,
                          mask_zero=True,
                          weights=[embedding_weights],
                          input_length=input_length))

model_features = Sequential()
model_features.add(????)  # How do I add the other features here??????

model = Sequential()
model.add(Merge([model_query, model_title], mode='concat'))  # Add them here????

model.add(LSTM(400))
model.add(Dropout(0.5))

model.add(Dense(1))
print("Compiling Model....")
model.compile(loss='mean_squared_error', optimizer='rmsprop')


All 14 comments

You probably want to embed your fixed features in some way, via a Dense layer for instance. If not, you could insert them into your model via an Activation('linear') layer, which is a layer that does nothing.

Yes, Merge with concat would be the right way to do this.
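For concreteness, a minimal sketch of those two options (untested; n_features is a placeholder for the number of fixed features, and the layer sizes are arbitrary):

from keras.models import Sequential
from keras.layers import Dense, Activation

# Option 1: learn a small "embedding" of the fixed features with a Dense layer
model_features = Sequential()
model_features.add(Dense(64, activation='relu', input_dim=n_features))

# Option 2: pass the raw features through unchanged via a no-op layer
model_features = Sequential()
model_features.add(Activation('linear', input_shape=(n_features,)))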

@fchollet would I add a Dense layer (somehow, I am not sure yet how) and then include it later in the merge (along with the other two that are there now)? Any tips? It seems like I need the Dense layer to have the same dimensions as the embeddings - and that is beyond my meager understanding.

You should do the merge after the LSTM layer. Since your auxiliary features are not sequences, they should not be processed as sequences. So there will not be any dimension problem.

If you want to do the merge before the LSTM layer, you will need to use a RepeatVector layer to turn your features into (constant) sequences.
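A rough sketch of that merge-before-the-LSTM variant (untested, assuming Keras 1.x, with n_features as a placeholder for the width of the fixed-feature vector and model_query/model_title defined as above):

from keras.models import Sequential
from keras.layers import Dense, RepeatVector, Merge, LSTM

model_features = Sequential()
model_features.add(Dense(400, input_dim=n_features))
# turn the (samples, 400) feature vector into a constant
# (samples, input_length, 400) sequence so it lines up with the embeddings
model_features.add(RepeatVector(input_length))

model = Sequential()
# note: mask_zero=True on the embeddings may not combine cleanly
# with the unmasked feature branch
model.add(Merge([model_query, model_title, model_features], mode='concat'))
model.add(LSTM(400))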

Something like this? I have so much to learn. Thanks for this great library!

model_query = Sequential()

# creates a matrix that is 30,046 x 400 (number of distinct words x 400)
model_query.add(Embedding(output_dim=dimsize,
                          input_dim=n_symbols,
                          mask_zero=True,
                          weights=[embedding_weights],
                          input_length=input_length))  # Adding Input Length

model_title = Sequential()  # or Graph or whatever
model_title.add(Embedding(output_dim=dimsize,
                          input_dim=n_symbols,
                          mask_zero=True,
                          weights=[embedding_weights],
                          input_length=input_length))  # Adding Input Length

model_features = Sequential()
model_features.add(Dense(output_dim=400, input_dim=100))

model_embed = Sequential()
model_embed.add(Merge([model_query, model_title], mode='concat'))
model_embed.add(LSTM(400))

model_final = Sequential()
model_final.add(Merge([model_embed, model_features], mode='concat'))
model_final.add(Dropout(0.5))
model_final.add(Dense(1))

print("Compiling Model....")
model_final.compile(loss='mean_squared_error', optimizer='rmsprop')
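If that layout is right, training would presumably take one input array per merged branch, in merge order (an untested sketch with hypothetical array names):

# X_query, X_title: integer-encoded sequences, shape (n_samples, input_length)
# X_features:       fixed feature matrix, shape (n_samples, 100)
model_final.fit([X_query, X_title, X_features], y,
                batch_size=128, nb_epoch=10, validation_split=0.1)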

What about using the Graph container? I use something along these lines:

# recurrent_cell is presumably LSTM or GRU; maxlen, max_features,
# feature_dimensions and lstm_dim are user-defined hyperparameters
model = Graph()

model.add_input(name='input', input_shape=(maxlen,), dtype=int)
model.add_node(Embedding(max_features, feature_dimensions, input_length=maxlen, dropout=0.4),
               name='embedding', input='input')
model.add_node(recurrent_cell(lstm_dim, dropout_W=0.5, dropout_U=0.1),
               name='forward', input='embedding')

model.add_input(name='input_titles', input_shape=(20,), dtype=int)
model.add_node(Embedding(max_features, feature_dimensions, input_length=20, dropout=0.4),
               name='embedding_titles', input='input_titles')
model.add_node(recurrent_cell(lstm_dim, dropout_W=0.5, dropout_U=0.1),
               name='title_lstm', input='embedding_titles')

model.add_node(Dropout(0.5), name='dropout', inputs=['forward', 'title_lstm'])
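One possible way to finish that Graph and mix in the fixed features (an untested sketch; n_features, the node names, and the arrays in the fit call are placeholders):

# fixed (non-sequence) features enter through their own input
model.add_input(name='input_features', input_shape=(n_features,))
model.add_node(Dense(64, activation='relu'), name='features_dense', input='input_features')

# concatenate the recurrent summary with the fixed-feature vector and score it
model.add_node(Dense(1), name='score', inputs=['dropout', 'features_dense'], merge_mode='concat')
model.add_output(name='output', input='score')

model.compile(optimizer='rmsprop', loss={'output': 'mean_squared_error'})
# Graph models train on a dict keyed by input/output names:
# model.fit({'input': X_query, 'input_titles': X_titles,
#            'input_features': X_features, 'output': y}, nb_epoch=10)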

Hi @BrianMiner, have you solved your problem? I have a similar question, but I am not sure. Do you plan to append the second word vector to the first one? For example, if one sentence has two word vectors [[1,2],[3,4]] and you concatenate a second sentence [[5,6],[7,8]], is the result [[1,2,5,6],[3,4,7,8]]? Or is the output [[1,2],[3,4],[5,6],[7,8]]?

@Imorton-zd my question was not about using two word vectors - that part I had down with the merge layer in my example above. My question was how to then also incorporate a non-sequence vector (which holds context and demographic information). So far I did that using the "model_final" snippet above. Results were not any better with this extra information, so I am not 100% sure it is correct.

@BrianMiner If I want to implement the idea of appending the second word vector to the first one in the first layer, as in my example above ([[1,2],[3,4]] and [[5,6],[7,8]] --> [[1,2,5,6],[3,4,7,8]]), how can I use the merge method? Any opinions would be appreciated!

@Imorton-zd are you simply looking to concatenate two word embedding layers? If so, the code above does that. At that point, you can connect a recurrent layer, or flatten and add a dense layer (see the sketch after the code below).

model_query = Sequential()

# creates a matrix that is 30,046 x 400 (number of distinct words x 400)
model_query.add(Embedding(output_dim=dimsize,
                          input_dim=n_symbols,
                          mask_zero=True,
                          weights=[embedding_weights],
                          input_length=input_length))  # Adding Input Length

model_title = Sequential()  # or Graph or whatever
model_title.add(Embedding(output_dim=dimsize,
                          input_dim=n_symbols,
                          mask_zero=True,
                          weights=[embedding_weights],
                          input_length=input_length))  # Adding Input Length

model_features = Sequential()
model_features.add(Dense(output_dim=400, input_dim=100))

model_embed = Sequential()
model_embed.add(Merge([model_query, model_title], mode='concat'))
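From there, one way to continue (an untested sketch of the two options mentioned above; note that Flatten likely cannot consume the mask produced by mask_zero=True, so the flatten variant assumes masking is off):

from keras.layers import LSTM, Flatten, Dense

# Option A: feed the concatenated embedding sequences to a recurrent layer
model_embed.add(LSTM(400))

# Option B: flatten and use a dense layer instead
# (would require mask_zero=False on the embeddings)
# model_embed.add(Flatten())
# model_embed.add(Dense(400, activation='relu'))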

@BrianMiner You said the model with the non-sequence predictors was not better. Have you played with those features, e.g. transforming them or adding more dense layers on top? This seems like a very important extension to a standard RNN model. I want to do exactly the same thing for a non-NLP problem, but I wonder if there is any trick involved.

Hi,
I know this is an old post, but I hope someone can answer. I am trying to do something similar to what @BrianMiner had already done. I also believe that I need to merge two embedding layers, but I am a little bit confused. The first embeddings are pre-trained word embeddings (300d) (e = Embedding(vocab_size, 300, weights=[embedding_matrix], input_length=23, trainable=False)), so the input data is integerized and the embedding matrix contains an embedding for each unique word. The second embedding layer I want is for the POS tags (I can maybe use a binary vector representation for each tag, let's suppose 50d), but I am not sure how to do it. How would that embedding layer look, and would I need to pass something extra to model.fit(padded_docs, labels, epochs=50, verbose=0), where for now I am passing the integerized data and labels? Many thanks.
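No complete answer appears in the thread, but here is a rough sketch of one way it could look, assuming the older Sequential/Merge API used earlier in this thread (n_pos_tags and padded_pos are placeholder names; under Keras 1.x the epochs argument is called nb_epoch):

from keras.models import Sequential
from keras.layers import Embedding, Merge, LSTM, Dense

# branch 1: frozen pre-trained word embeddings, as in the question
model_words = Sequential()
model_words.add(Embedding(vocab_size, 300, weights=[embedding_matrix],
                          input_length=23, trainable=False))

# branch 2: a small trainable embedding for the POS tags,
# one integer tag id per token position
model_pos = Sequential()
model_pos.add(Embedding(n_pos_tags, 50, input_length=23))

# concatenate along the feature axis: each timestep becomes 300 + 50 dims
model = Sequential()
model.add(Merge([model_words, model_pos], mode='concat'))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

# fit then takes one array per branch:
# model.fit([padded_docs, padded_pos], labels, nb_epoch=50, verbose=0)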

@Ayedah did you figure out a solution to this problem?

@Ayedah or @BrianMiner, did you find any solution?
I'm also trying to use embeddings with non-sequence features, but the results are bad.

Thank you

Hi,
I wasn't using non-sequence features. I was using a sequence of POS tags for each word along with the embeddings. Hope that clarifies it.

