Keras: How to implement a conv layer with different filter sizes (Zhang & Wallace 2015)?

Created on 9 May 2017 · 12 comments · Source: keras-team/keras

Hello,

I'm trying to reproduce the CNN architecture proposed in this paper: a single convolutional layer with two filters of each of several filter sizes, followed by global max-pooling and dropout:
[Screenshot: architecture diagram from the paper]

Is there a way to implement this architecture in Keras?

Best,
ben0it8



All 12 comments

Apply different convolutional layers on the same input and merge their outputs?

I'm in the middle of figuring this out myself. Here's what I think is necessary.

1) You'll need to replicate your inputs across each of the input "channels" (i.e. for each filter width).
2) You do a "concatenate" merge after the GlobalMaxPooling1D on the Conv1D layer outputs (the diagram appears to show two "merges", but I don't believe that's necessary).

Have a look at the following for inspiration:
https://gist.github.com/ameasure/944439a04546f4c02cb9
https://statcompute.wordpress.com/2017/01/08/an-example-of-merge-layer-in-keras/

Let me know if you've made any progress, and I'll do the same.

Here's what I ended up doing, which appears to be doing the right thing, but I'm still new enough to Keras that I haven't figured out how to introspect this properly to make sure...

submodels = []
for kw in (3, 4, 5):    # kernel sizes
    submodel = Sequential()
    submodel.add(Embedding(len(word_index) + 1,
                           EMBEDDING_DIM,
                           weights=[embedding_matrix],
                           input_length=MAX_SEQUENCE_LENGTH,
                           trainable=False))
    submodel.add(Conv1D(FILTERS,
                        kw,
                        padding='valid',
                        activation='relu',
                        strides=1))
    submodel.add(GlobalMaxPooling1D())
    submodels.append(submodel)
big_model = Sequential()
big_model.add(Merge(submodels, mode="concat"))
big_model.add(Dense(HIDDEN_DIMS))
big_model.add(Dropout(P_DROPOUT))
big_model.add(Activation('relu'))
big_model.add(Dense(1))
big_model.add(Activation('sigmoid'))
print('Compiling model')
big_model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
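On the introspection question: two quick ways to verify what a merged model is actually doing are `model.summary()` and pushing a dummy batch through `predict()`. The sketch below builds a tiny stand-in model (placeholder layer sizes, not the hyperparameters from the snippet above) so the checks are self-contained; it assumes the Keras 2 `tensorflow.keras` import path and `Concatenate` layer rather than the older `Merge`.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

# Tiny two-branch merged model, standing in for the bigger net above.
a = Input(shape=(4,))
b = Input(shape=(4,))
merged = Concatenate()([Dense(3)(a), Dense(3)(b)])
model = Model([a, b], Dense(1, activation='sigmoid')(merged))

# 1) Layer-by-layer output shapes and parameter counts:
model.summary()

# 2) Push a dummy batch through and check the output shape. Note the
#    list of input arrays, one per input branch -- the same reason the
#    big model's fit() needs [x_train, x_train, x_train].
dummy = np.zeros((2, 4))
print(model.predict([dummy, dummy]).shape)  # (2, 1)
```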

I was trying to fit your implementation but got:
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (48943, 300)

Any idea?

Yes, this is what I meant about "replicating the inputs"...sorry, I should have included the fit() call to clarify.

hist = big_model.fit([x_train, x_train, x_train],
                     y_train,
                     batch_size=BATCH_SIZE,
                     epochs=EPOCHS,
                     validation_data=([x_val, x_val, x_val], y_val),
                     callbacks=callbacks)

You can see I have x_train and x_val as my training/validation inputs. Because I'm using three different filter sizes, the net expects three separate input streams. Turning my inputs into a list of NUM_KERNEL_SIZES copies of the same array handles that.

Thank you for sharing that!

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

The same problem seems to be addressed and solved in this issue using the Graph model.


Closing as this is resolved


When I try to compile the code @fmailhot posted above, I get an error saying there is no layer named Merge().

I had the same problem with the Merge() function
I solved it by downgrading keras:
pip uninstall keras
pip install keras==2.1.2
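Rather than downgrading, the same architecture can be expressed without the removed Merge layer by using the Keras 2 functional API: give all three convolutional branches a single shared Input and join their pooled outputs with Concatenate. The sketch below is a port of the snippet above under that assumption; the sizes (VOCAB, EMBEDDING_DIM, etc.) are illustrative placeholders, not the thread's constants, and the pretrained embedding weights are omitted.

```python
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Concatenate,
                                     Dense, Dropout)
from tensorflow.keras.models import Model

# Placeholder hyperparameters (stand-ins for the thread's constants).
MAX_SEQUENCE_LENGTH, VOCAB, EMBEDDING_DIM, FILTERS, HIDDEN_DIMS = 100, 5000, 64, 32, 64

inp = Input(shape=(MAX_SEQUENCE_LENGTH,))
emb = Embedding(VOCAB, EMBEDDING_DIM, trainable=False)(inp)

# One Conv1D + GlobalMaxPooling1D branch per kernel size, all reading
# the same embedded input.
branches = []
for kw in (3, 4, 5):
    x = Conv1D(FILTERS, kw, padding='valid', activation='relu')(emb)
    branches.append(GlobalMaxPooling1D()(x))

x = Concatenate()(branches)          # replaces Merge(..., mode="concat")
x = Dense(HIDDEN_DIMS, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(1, activation='sigmoid')(x)

big_model = Model(inp, out)
big_model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
```

A side benefit of the shared Input: fit() now takes a single x_train array instead of [x_train, x_train, x_train], since the branching happens inside the graph.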
