I am trying to freeze the pretrained VGG16's layers ('conv_base' below) and add new layers on top of them for feature extraction.
I expect to get the same prediction results from 'conv_base' before (ret1) and after (ret2) fitting the model, but they are not the same.
Is this the wrong way to check weight freezing?
import numpy as np
from keras import applications, models, layers

# load VGG16 and set it to untrainable
conv_base = applications.VGG16(weights='imagenet', include_top=False, input_shape=[150, 150, 3])
conv_base.trainable = False
# result before fitting the model
ret1 = conv_base.predict(np.ones([1, 150, 150, 3]))
# add layers on top of VGG16 and compile the model
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(10, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile('rmsprop', 'binary_crossentropy', ['accuracy'])
# fit the model
model.fit_generator(train_generator, steps_per_epoch=100, validation_data=validation_generator, validation_steps=50)
# result after fitting the model
ret2 = conv_base.predict(np.ones([1, 150, 150, 3]))
# hope this is True, but it is not
np.equal(ret1, ret2).all()
Some of the weights can't be frozen, most notably the running mean and variance of the BatchNormalization layers. This is a known behaviour that has been causing some confusion for a while.
Have a look on this discussion:
https://github.com/fchollet/keras/issues/4762#issuecomment-299606870
@datumbox Thanks for the reply.
Yes, I have checked that BatchNormalization layers are not frozen regardless of the trainable flag.
But VGG16 with include_top=False contains only Conv2D and MaxPooling2D layers (plus the Input), which as far as I know don't have BatchNorm-like internal updates.
+) I also checked with np.allclose() instead of np.equal(); the difference doesn't disappear.
Oops, I'm sorry, I didn't notice you said VGG16. Indeed, that specific architecture doesn't have any BatchNorm layers.
Can you write a loop over the layers, compare the models before and after fitting, and pinpoint which frozen layers have different weights?
Also, how big of a difference are we talking about here on the output? Could it be just rounding errors?
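For reference, such a before/after comparison might look like the sketch below; it reuses the variables from the question (conv_base, model, train_generator), and weights_before is a name introduced here purely for illustration.
import numpy as np

# snapshot the frozen base's weights before training
weights_before = [layer.get_weights() for layer in conv_base.layers]

model.fit_generator(train_generator, steps_per_epoch=100, validation_data=validation_generator, validation_steps=50)

# compare each layer's weights after training and report any that moved
for layer, before in zip(conv_base.layers, weights_before):
    for b, a in zip(before, layer.get_weights()):
        if not np.allclose(b, a):
            print(layer.name, 'changed, max abs diff:', np.abs(b - a).max())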
I've identified the issue. In short: when adding a Model or Sequential as the first layer in a Sequential model, the Sequential model will use the pre-existing input and output of that model/sequential without calling it on a new Input.
What that means is that conv_base.trainable = False is ineffective, because model doesn't see conv_base itself; it sees all of its inner layers instead.
The workaround is to set all inner layers of conv_base to non-trainable:
for layer in conv_base.layers:
    layer.trainable = False
This is kind of a strange behavior, so we will fix it. Presumably, adding a model/sequential as the first layer should still result in the model/sequential being called anew.
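As a quick sanity check (a sketch; the expected counts assume the exact model from the question, with two Dense layers on top of conv_base), the trainable weight lists should shrink once the workaround is applied:
for layer in conv_base.layers:
    layer.trainable = False
# recompile so the freeze takes effect for training
model.compile('rmsprop', 'binary_crossentropy', ['accuracy'])
print(len(conv_base.trainable_weights))  # expect 0 once the inner layers are frozen
print(len(model.trainable_weights))      # expect 4: kernel + bias for each of the two Dense layers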
If you set model.trainable = False, should it not make layer.trainable False for all of its layers?
conv_base_model = VGG16(weights='imagenet', input_shape=(150, 150, 3), include_top=False)
conv_base_model.trainable = False
for layer in conv_base_model.layers:
    print(layer.name, layer.trainable)
I am still getting True for all layers.
Am I missing something?
Yes, you need to set trainable on each inner layer explicitly:
for layer in conv_base.layers:
    layer.trainable = False
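With that loop in place, rerunning the check from the earlier comment should print False for every layer (a sketch using the same conv_base_model as above):
for layer in conv_base_model.layers:
    layer.trainable = False
for layer in conv_base_model.layers:
    print(layer.name, layer.trainable)  # now prints False for each layer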
I fixed it: https://github.com/fchollet/keras/commit/c25fa38deb4efc5445f64af3ec17eae0eb660d2f