Using keras 2.2.4 on Windows 7 with Python 3.5
Augment the interface to `get_layer` so that one can add kernel regularizers to a layer after forming the model. This allows a much cleaner way to add regularizers to every layer in a model without having to type them in explicitly.
An example use of the new interface I propose:

```python
def globalWeightDecay(model, decay, weightBias=False):
    # Go through every layer in the model and apply l2 regularization
    # with the weight specified.
    for layerIdx, layer in enumerate(model.layers):
        if hasattr(layer, 'kernel_regularizer'):
            model.get_layer(index=layerIdx).kernel_regularizer = keras.regularizers.l2(l=decay)
        if weightBias and hasattr(layer, 'bias_regularizer'):
            layer.bias_regularizer = keras.regularizers.l2(l=decay)
    return model
```
To utilize this interface you would do the following:

```python
model = buildModel()  # builds the initial model
model = globalWeightDecay(model, 1e-2, False)
model.compile(optimizer=keras.optimizers.Adadelta(lr=learningRate), loss='logcosh')
```

`model.losses` would then show all the regularizer loss tensors (in addition to the logcosh loss specified here).
Feel free to do an API design document and submit it as described in the CONTRIBUTING.md.
I think what I proposed can be implemented much more simply:

```python
model.layers[1].kernel_regularizer = keras.regularizers.l2(l=decay)
```

Then do a compile and the loss should be included.
I suspect the current issue is that when compile is called, whatever builds the model hasn't gone through to see that the layer attributes have changed. I would also suggest that for all such attributes, Keras either makes them read-only or allows you to modify them and automatically reparses the model when compile is called.
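The failure mode described here is generic Python, not specific to Keras: if the loss terms are collected when a layer is built, assigning a new regularizer attribute afterwards is simply invisible. A minimal toy sketch (hypothetical `ToyLayer` class for illustration only, not Keras code):

```python
# Toy illustration: a hypothetical layer that collects its regularization
# loss once, at build time. Assigning a regularizer afterwards has no
# effect, because the losses list was already populated.

class ToyLayer:
    def __init__(self, weight, regularizer=None):
        self.weight = weight
        self.regularizer = regularizer
        self.losses = []
        self._build()  # losses are collected here, at construction time

    def _build(self):
        if self.regularizer is not None:
            self.losses.append(self.regularizer(self.weight))

layer = ToyLayer(weight=3.0)         # built with no regularizer
layer.regularizer = lambda w: w * w  # assigned too late: _build already ran
print(layer.losses)                  # [] -- the new regularizer is invisible
```

This is why the thread converges on either rebuilding/recompiling the model after the assignment or registering the loss explicitly via `add_loss`.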
Hey @isaacgerg, thanks for opening this issue; we really need a neat way to do this. Before this feature gets implemented, I tried the workaround suggested by @lars76, and from my test the loss tensor for each regularizer is added to the total loss during `model.compile()`.
@nicolefinnie Correct, it does work for the example listed. However, the paradigm you mentioned uses `model.save`, which has its own set of issues (and unfortunately won't work for me).
The problem seems to be here: https://github.com/keras-team/keras/blob/1336cdb14ff03de754aec6899794742ca91057b2/keras/engine/training.py#L359 When compile is called, Keras only reads `model.losses`. When a Conv2D is created, Keras calls `self.add_weight` (https://github.com/keras-team/keras/blob/7b9c8727760b2a8d02e409efaa6ff9e0333b02e1/keras/layers/convolutional.py#L137), but we miss the build step, because the Conv2D was already created.
This code seems to work as intended, but a proper interface would still be better:

```python
regularizer = l2(WEIGHT_DECAY / 2)
for weight in model.trainable_weights:
    with tf.keras.backend.name_scope("weight_regularizer"):
        model.add_loss(regularizer(weight))
```
Maybe instead of an interface, one could also just add a new `weight_decay` parameter to the optimizers (SGDW, AdamW), because I think this is the only case where one needs to add a regularizer after the layer has been created.
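For context, decoupled weight decay (as in SGDW/AdamW) applies the decay directly in the update step rather than through the loss. A toy pure-Python sketch of a single SGD step, comparing L2-in-the-loss with decoupled decay (function names are illustrative, not a Keras API); with plain SGD the two coincide when the L2 factor equals `weight_decay / 2`, the factor of 2 mentioned later in this thread:

```python
# Toy single-parameter SGD step: L2-in-the-loss vs. decoupled weight decay.

def sgd_step_l2(w, grad, lr, l2_factor):
    # An L2 term l2_factor * w**2 in the loss adds 2 * l2_factor * w
    # to the gradient.
    return w - lr * (grad + 2.0 * l2_factor * w)

def sgd_step_decoupled(w, grad, lr, weight_decay):
    # Decoupled: shrink the weight directly, independent of the loss gradient.
    return w - lr * grad - lr * weight_decay * w

w, grad, lr = 1.0, 0.5, 0.1
a = sgd_step_l2(w, grad, lr, l2_factor=0.01)
b = sgd_step_decoupled(w, grad, lr, weight_decay=0.02)
print(abs(a - b) < 1e-12)  # True: identical for plain SGD
```

For adaptive optimizers like Adam the two are no longer equivalent, which is the motivation for AdamW.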
Any updates on this?
No.
Try this:
```python
# a utility function to add weight decay after the model is defined.
def add_weight_decay(model, weight_decay):
    if (weight_decay is None) or (weight_decay == 0.0):
        return

    # recursion inside the model
    def add_decay_loss(m, factor):
        if isinstance(m, tf.keras.Model):
            for layer in m.layers:
                add_decay_loss(layer, factor)
        else:
            for param in m.trainable_weights:
                with tf.keras.backend.name_scope('weight_regularizer'):
                    # bind param as a default argument so each lambda keeps
                    # its own weight instead of closing over the loop variable
                    regularizer = lambda param=param: tf.keras.regularizers.l2(factor)(param)
                    m.add_loss(regularizer)

    # weight decay and l2 regularization differ by a factor of 2
    add_decay_loss(model, weight_decay / 2.0)
```
It really boggles the mind that something which is literally a single optimizer parameter in PyTorch, and an essential regularization technique that pretty much everyone there uses all the time, is not really doable in Keras. I've spent hours trying to track down an "official" way of doing this, and incredibly there isn't one. I ended up cloning my backbone network (which comes from Keras Applications) and adding L2 regularizers manually there.