Hey,
I'm currently updating my code from Keras 1.0.8 to the latest version 2.0.6. I replaced a Merge layer with the new Concatenate layer, but I'm getting an error:
The first layer in a Sequential model must get an input_shape or batch_input_shape argument.
Simplified, my code looks like this:
LSTM_1 = Sequential()
LSTM_1.add(Embedding(2000, 100, weights=[emb_1], input_length=100, mask_zero=True))
LSTM_1.add(LSTM(100, input_shape=(1000, 100)))
LSTM_2 = Sequential()
LSTM_2.add(Embedding(5000, 100, weights=[emb_2], input_length=2000, mask_zero=True))
LSTM_2.add(LSTM(100, input_shape=(2000, 100)))
LSTM_3 = Sequential()
LSTM_3.add(Embedding(3000, 100, weights=[emb_3], input_length=500, mask_zero=True))
LSTM_3.add(LSTM(100, input_shape=(500, 100)))
merged_model = Sequential()
merged_model.add(Concatenate([LSTM_1, LSTM_2, LSTM_3]))
merged_model.add(Dense(2, activation='softmax'))
merged_model.compile('adam', 'categorical_crossentropy')
merged_model.fit([X_1, X_2, X_3], y, batch_size=200, epochs=10, verbose=1)
Instead of the Concatenate layer I had the following line:
merged_model.add(Merge([LSTM_1, LSTM_2, LSTM_3], mode='concat'))
The problem is that merged_model.summary() gives me the following with the old Merge layer and the latest Keras version:
Layer (type)                 Output Shape              Param #
=================================================================
merge_1 (Merge)              (None, 300)               0
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 602
=================================================================
Total params: 10,943,302
Trainable params: 241,802
Non-trainable params: 10,701,500
Before I updated to the latest version, it was building the model correctly with the LSTM layers inside.
Can someone explain to me what's going wrong here?
Thanks!
I used the functional API instead, which works fine. Since there is another way to do it, I guess it doesn't have to work with Sequential.
Still, I do believe there should be a proper, easy Sequential solution. Many available packages are written with the Sequential API, and if you want to do something tricky with them you are screwed.
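As far as I understand, the reason it breaks is that the Keras 2 Concatenate layer merges tensors, not models, so it has no way of knowing its input shape when it is added as the first layer of a Sequential model. A tiny illustrative sketch of how it is meant to be used (shapes made up):
from keras.layers import Input, Dense, Concatenate
from keras.models import Model
# Concatenate in Keras 2 joins tensors, not whole models
a = Input(shape=(100,))
b = Input(shape=(100,))
merged = Concatenate()([a, b])                 # -> shape (None, 200)
out = Dense(2, activation='softmax')(merged)
model = Model(inputs=[a, b], outputs=out)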
I had the same problem.
I've had the same problem. Is there any update on fixing this?
@v1nc3nt27 Do you mind sharing the code you used to fix the problem?
Well, not much to share actually. I just used the functional API instead like this:
from keras.layers.merge import concatenate
from keras.layers import Embedding, Input, LSTM
from keras.models import Model
from keras.layers.core import Dense
from keras import optimizers
# First LSTM
input_1 = Input(shape=(SEQ_LENGTH,), dtype='int32')
embedding_1 = Embedding(input_dim=len(EMBEDDING_FILE), output_dim=EMBEDDING_DIM, weights=[EMBEDDING_FILE], input_length=SEQ_LENGTH, mask_zero=True, trainable=True)(input_1)
LSTM_1 = LSTM(EMBEDDING_DIM, batch_input_shape=(batch_size, SEQ_LENGTH, EMBEDDING_DIM), input_shape=(SEQ_LENGTH, EMBEDDING_DIM))(embedding_1)
# Second LSTM
input_2 = Input(shape=(SEQ_LENGTH,), dtype='int32')
embedding_2 = Embedding(input_dim=len(EMBEDDING_FILE), output_dim=EMBEDDING_DIM, weights=[EMBEDDING_FILE], input_length=SEQ_LENGTH, mask_zero=True, trainable=True)(input_2)
LSTM_2 = LSTM(EMBEDDING_DIM, batch_input_shape=(batch_size, SEQ_LENGTH, EMBEDDING_DIM), input_shape=(SEQ_LENGTH, EMBEDDING_DIM))(embedding_2)
# Merge
merged = concatenate([LSTM_1, LSTM_2])
# Dense
dense_out = Dense(no_of_classes, activation='softmax')(merged)
# build and compile model
model = Model(inputs=[input_1, input_2], outputs=[dense_out])
model.compile(optimizers.Adam(), 'kullback_leibler_divergence', metrics=['accuracy'])
# train
model.fit([X_data_1, X_data_2], y_true)
This should work.
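If rewriting everything in the functional style is too much, I think you can also keep the Sequential branch models from the first post and only switch the merge point, since in Keras 2 a model can be called on a tensor just like a layer. An untested sketch using the names from the original snippet (the Input shapes have to match each branch's Embedding input_length):
from keras.layers import Input
from keras.layers.core import Dense
from keras.layers.merge import concatenate
from keras.models import Model
# LSTM_1, LSTM_2, LSTM_3 are the existing Sequential branch models
input_1 = Input(shape=(100,), dtype='int32')
input_2 = Input(shape=(2000,), dtype='int32')
input_3 = Input(shape=(500,), dtype='int32')
# calling a Sequential model on a tensor reuses it like a layer
merged = concatenate([LSTM_1(input_1), LSTM_2(input_2), LSTM_3(input_3)])
out = Dense(2, activation='softmax')(merged)
merged_model = Model(inputs=[input_1, input_2, input_3], outputs=out)
merged_model.compile('adam', 'categorical_crossentropy')
merged_model.fit([X_1, X_2, X_3], y, batch_size=200, epochs=10)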
@v1nc3nt27 Thank you very much. I'll have a go at changing "my code" (code I've borrowed from some kind developer online) to get it working.
I have the same problem. Unfortunately, my code base is much bigger, so switching to the functional API would mean changing most of my code and at least several days of work.
I guess I will stick with the deprecation warning for now, hoping there will be a fix for this by the time legacy support is removed.