I have trained a CIFAR-100 model using https://github.com/dmlc/mxnet/blob/master/example/notebooks/cifar-100.ipynb.
I need to fine-tune the model, freezing all the layers except the last one. How can we use symbol.BlockGrad to freeze the layers? Is there any other method?
Three ways to do it:
By the way, strictly speaking, using BlockGrad cannot freeze the weights, since weight decay may still change them.
The updater is currently unaware of any layer-wise configuration.
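One way to get a true freeze, assuming the Module API and a checkpoint saved with the prefix 'cifar100' (the prefix, epoch, and the 'fc1' name filter below are placeholders, adjust them to your setup), is to pass the frozen parameter names to the module; those parameters are then kept out of the optimizer entirely, so neither gradients nor weight decay touch them:

    import mxnet as mx

    # Load the pretrained symbol and parameters (prefix/epoch are placeholders).
    sym, arg_params, aux_params = mx.model.load_checkpoint('cifar100', 100)

    # Freeze every parameter that does not belong to the last layer.
    # The 'fc1' prefix is an assumption -- use the real name from your symbol.
    frozen = [name for name in arg_params if not name.startswith('fc1')]

    mod = mx.mod.Module(symbol=sym, context=mx.gpu(),
                        fixed_param_names=frozen)

Alternatively, the optimizer's set_lr_mult and set_wd_mult methods can zero out both the learning rate and the weight decay for selected parameter names, which achieves the same effect without changing the module.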
def get_updater(optimizer):
    """Return a closure of the updater needed for kvstore.

    Parameters
    ----------
    optimizer : Optimizer
        The optimizer.

    Returns
    -------
    updater : function
        The closure of the updater.
    """
    states = dict()

    def updater(index, grad, weight):
        """Updater for kvstore: lazily creates per-parameter optimizer
        state and applies one update step to `weight` in place."""
        if index not in states:
            states[index] = optimizer.create_state(index, weight)
        optimizer.update(index, weight, grad, states[index])

    return updater
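For context, here is a minimal usage sketch of the closure above (the shapes and the SGD hyper-parameters are arbitrary):

    import mxnet as mx

    opt = mx.optimizer.SGD(learning_rate=0.1, wd=0.0001)
    update = get_updater(opt)

    weight = mx.nd.ones((3, 3))
    grad = mx.nd.ones((3, 3)) * 0.5

    # The integer index identifies the parameter; the optimizer keeps the
    # corresponding state in `states` and updates `weight` in place.
    update(0, grad, weight)

Because the updater only ever sees an integer index, any per-layer learning-rate or weight-decay multipliers have to be configured on the optimizer itself (for example via param_idx2name together with set_lr_mult and set_wd_mult), which is what is meant above by the updater being unaware of layer-wise configuration.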
This is one of the most common use cases. Has no one done this with MXNet before?
Would you please restart the effort in https://github.com/dmlc/mxnet/issues/830 to make common tasks like this easy to accomplish?
A hacky way to block gradients is described in https://github.com/dmlc/mxnet/issues/1538#issuecomment-196691988; a sketch of the idea follows.
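The gist of that hack, sketched here under assumptions (the internal layer name 'flatten_output', the checkpoint prefix, and the size of the new head are placeholders), is to cut the pretrained symbol at the feature layer, wrap it in BlockGrad so no gradients flow into the lower layers, and attach a fresh classification head:

    import mxnet as mx

    sym, arg_params, aux_params = mx.model.load_checkpoint('cifar100', 100)

    # Take the output of the pretrained feature layer.
    internals = sym.get_internals()
    features = internals['flatten_output']      # assumed internal layer name

    # Stop gradients from propagating into the pretrained layers.
    features = mx.sym.BlockGrad(features)

    # New head, trained from scratch.
    fc_new = mx.sym.FullyConnected(data=features, num_hidden=10, name='fc_new')
    net = mx.sym.SoftmaxOutput(data=fc_new, name='softmax')

As noted above, this only blocks gradients: the pretrained weights can still drift if weight decay is applied to them, so also set their wd_mult to zero or keep them out of the optimizer.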