Keras: How can I fine-tune the last layer using Keras?

Created on 6 May 2016 · 12 Comments · Source: keras-team/keras

Hi everyone, I looked online for how to fine-tune the last layer in Keras, and it seems that calling model.layers.pop() first and then adding my desired output layer is a simple way to do it.

However, I tried this:
model.add(Dense(1000, activation='softmax', name='fc8'))
model.layers.pop()
model.add(Dense(200, activation='softmax', name='fc8'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

But this error came up: Exception: The name "fc8" is used 2 times in the model. All layer names should be unique.

But when I print model.summary() after popping the last layer, the fc8 layer has indeed been removed. Why is the name still used 2 times in the model?

Thanks a lot for any suggestions.

BTW, is there any documentation for the .pop() function? Thanks.

bryanyzhu

All 12 comments

You can refer to #2418 and #2371.

joelthchao on 6 May 2016

@joelthchao Thanks for your quick reply. I did look into these two issues. The situation is that I can pop the last layer and I can add the new dense layer; both steps run as expected.

But the error is raised when I compile the model. I don't understand why the name 'fc8' is still in use. Thanks.

bryanyzhu on 6 May 2016

The PR was made to solve this kind of problem.
model.layers.pop() does not keep model.output up to date; the following works for me.

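from keras.models import Sequential
from keras.layers import Dense

def pop_layer(model):
    # Remove the last layer and reset the model's cached output state.
    if not model.outputs:
        raise Exception('Sequential model cannot be popped: model is empty.')

    model.layers.pop()
    if not model.layers:
        # No layers left: clear the outputs and the node bookkeeping.
        model.outputs = []
        model.inbound_nodes = []
        model.outbound_nodes = []
    else:
        # Detach the popped layer and point the output at the new last layer.
        model.layers[-1].outbound_nodes = []
        model.outputs = [model.layers[-1].output]
    model.built = False  # mark the model as needing a rebuild

model = Sequential()
model.add(Dense(100, activation='softmax', name='fc1', input_shape=(128,)))
model.add(Dense(100, activation='softmax', name='fc2'))
pop_layer(model)
# The name 'fc2' can be reused now that the old layer is fully detached.
model.add(Dense(20, activation='softmax', name='fc2'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.summary()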

joelthchao on 6 May 2016
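
Editor's note: the reason a bare model.layers.pop() is not enough can be sketched with a toy model. The classes below (ToyLayer, ToySequential, pop_properly) are hypothetical and are not Keras internals; they only assume, as the comment above describes, that the model caches its output separately from the plain layers list, so popping the list leaves the cached output pointing at the old layer.

```python
class ToyLayer:
    """Hypothetical stand-in for a layer: a name plus a link to its input layer."""
    def __init__(self, name):
        self.name = name
        self.inbound = None


class ToySequential:
    """Hypothetical stand-in for a Sequential model (not the Keras class)."""
    def __init__(self):
        self.layers = []          # plain Python list, like model.layers
        self.output_layer = None  # cached graph endpoint, like model.outputs

    def add(self, layer):
        # A new layer is wired to the cached endpoint, not to self.layers[-1].
        layer.inbound = self.output_layer
        self.output_layer = layer
        self.layers.append(layer)

    def graph_names(self):
        # Walk backwards from the cached endpoint to collect layer names.
        names, node = [], self.output_layer
        while node is not None:
            names.append(node.name)
            node = node.inbound
        return names[::-1]

    def pop_properly(self):
        # Remove the last layer AND rewind the cached endpoint.
        removed = self.layers.pop()
        self.output_layer = removed.inbound
        return removed


broken = ToySequential()
broken.add(ToyLayer("fc7"))
broken.add(ToyLayer("fc8"))
broken.layers.pop()          # only the list changes; the endpoint is stale
broken.add(ToyLayer("fc8"))
print([l.name for l in broken.layers])  # ['fc7', 'fc8']
print(broken.graph_names())             # ['fc7', 'fc8', 'fc8']

fixed = ToySequential()
fixed.add(ToyLayer("fc7"))
fixed.add(ToyLayer("fc8"))
fixed.pop_properly()
fixed.add(ToyLayer("fc8"))
print(fixed.graph_names())              # ['fc7', 'fc8']
```

In the broken case the layer list looks correct, which matches what model.summary() showed in the question, while the graph reachable from the output still contains the old "fc8".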

@joelthchao Thanks, I tried it and it works like a charm.

bryanyzhu on 6 May 2016

@joelthchao Thanks! You really saved my life.

benwu232 on 20 Jul 2016

@bryanyzhu I want to use this pretrained VGG19. I have 8 labels to classify.

x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
model = Model(img_input, x)
return model

I have only 8 labels to classify.
But this model is built with the Keras functional API, not the Sequential model.
How can I use model.layers.pop() here?
Could you give me some advice? Thanks.

alyato on 7 Mar 2017

@alyato Instead of using model.layers.pop(), you can use joelthchao's function pop_layer(model) to finish the job. It should work.

bryanyzhu on 7 Mar 2017

@bryanyzhu Thanks.
I see your code uses the Sequential model, but I want to use the Keras functional API.

"you can use joelthchao's function pop_layer(model) to finish the job. It should work."

So I can use pop_layer(model) with the Keras functional API.

But model.add() belongs to the Sequential model, so I rewrote model.add(Dense(20, activation='softmax', name='fc2')) as x = Dense(8, activation='softmax', name='predictions')(x),
like this:

x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
model = Model(img_input, x)
pop_layer(model)
x = Dense(8, activation='softmax', name='predictions')(x)
return model

Is this right? Thanks.

alyato on 8 Mar 2017

@alyato Maybe it is right. Or you can switch back to the Sequential model. Sorry, I haven't used Keras since then and am no longer familiar with it, so I can't help more.

bryanyzhu on 8 Mar 2017
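
Editor's note: with the functional API, one common alternative to popping is to branch off from the output of the layer before the old classifier and build a new Model. The following is a minimal sketch, assuming TensorFlow 2.x with tensorflow.keras is installed. The tiny network is a hypothetical stand-in for the pretrained VGG19 (the layer sizes and the name predictions_8 are made up for illustration); only the layer names fc1, fc2 and predictions come from the snippet above.

```python
from tensorflow.keras import layers, Model

# Small stand-in for a pretrained network (hypothetical sizes, not VGG19).
img_input = layers.Input(shape=(32,))
x = layers.Dense(16, activation='relu', name='fc1')(img_input)
x = layers.Dense(16, activation='relu', name='fc2')(x)
x = layers.Dense(1000, activation='softmax', name='predictions')(x)
base = Model(img_input, x)

# Take the output of the layer just before the old 1000-way classifier.
features = base.get_layer('fc2').output

# Attach a new 8-way classifier and wrap everything in a new Model.
new_out = layers.Dense(8, activation='softmax', name='predictions_8')(features)
new_model = Model(inputs=base.input, outputs=new_out)

# Optionally freeze everything except the new head before fine-tuning.
for layer in new_model.layers[:-1]:
    layer.trainable = False

new_model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
                  metrics=['accuracy'])
print(new_model.output_shape)
```

The old predictions layer is simply left out of the new graph, so nothing has to be popped. The shared layers keep their weights, because new_model reuses the same layer objects as base.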

For some reason, I need to rebuild the model with Model after popping the layer, and before adding new layers, to make things work.

conda list keras
# Name                    Version                   Build  Channel
keras                     2.1.5                    py36_0    conda-forge

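Here is the code snippet:

from keras.applications.vgg16 import VGG16
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.optimizers import Adadelta

def pop_layer(model):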
    if not model.outputs:
        raise Exception('Sequential model cannot be popped: model is empty.')

    model.layers.pop()
    if not model.layers:
        model.outputs = []
        model.inbound_nodes = []
        model.outbound_nodes = []
    else:
        model.layers[-1].outbound_nodes = []
        model.outputs = [model.layers[-1].output]
    model.built = False

def get_model():
    # Fully convolutional part of VGG16
    model = VGG16(include_top=False, weights='imagenet')

    # Remove the last max pooling layer
    pop_layer(model)

    # Freeze the pretrained layers
    for layer in model.layers:
        layer.trainable = False

    # Rebuild with Model before adding new layers (needed to make things work)
    model = Model(inputs=model.inputs, outputs=model.outputs)

    print('len(model.layers)', len(model.layers))
    model.summary()  # summary() prints the table itself and returns None

    # N_CLASS (the number of target classes) must be defined elsewhere
    x = GlobalAveragePooling2D()(model.output)
    head = Dense(N_CLASS, activation='softmax')(x)

    model = Model(inputs=model.inputs, outputs=head)

    model.compile(optimizer=Adadelta(), loss='categorical_crossentropy', metrics=['accuracy'])

    print('len(model.layers)', len(model.layers))
    model.summary()

    return model

Also, can someone comment on the difference between model.outputs and model.output?
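
Editor's note: as far as I know, model.outputs is always a list of output tensors, while model.output is a convenience attribute that gives the tensor itself when the model has a single output. The sketch below assumes TensorFlow 2.x with tensorflow.keras; the layer names head_a and head_b are made up for illustration, and older standalone Keras versions may differ in detail.

```python
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(4,))
a = layers.Dense(2, name='head_a')(inp)
b = layers.Dense(3, name='head_b')(inp)

single = Model(inputs=inp, outputs=a)       # one output tensor
multi = Model(inputs=inp, outputs=[a, b])   # two output tensors

# outputs is a list even for a single-output model.
print(type(single.outputs).__name__, len(single.outputs))

# output is the bare tensor for a single-output model, not a list.
print(isinstance(single.output, list))

# For a multi-output model, both attributes hold two tensors.
print(len(multi.outputs), len(multi.output))
```

This is why pop_layer above assigns a one-element list to model.outputs, while get_model passes the single tensor model.output to the next layer.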