Incubator-mxnet: Fine-tuning a CIFAR-100 model, freezing all layers except the last one.

Created on 23 Jan 2016 · 7 comments · Source: apache/incubator-mxnet

I have trained a cifar-100 model using https://github.com/dmlc/mxnet/blob/master/example/notebooks/cifar-100.ipynb .

I need to fine-tune the model, freezing all the layers except the last one. How can I use symbol.BlockGrad to freeze the layers? Is there any other method?

Most helpful comment

BTW, strictly speaking, using BlockGrad cannot freeze the weights, since weight decay may still change them.

All 7 comments

Three ways to do it (a minimal sketch of options 1 and 3 follows the list):

  1. Insert BlockGrad before the last layer; it sets the gradient to zero for all lower layers.
  2. Only bind the gradient of the last layer in symbol.bind.
  3. Specify grad_req='null' for all layers except the last in symbol.simple_bind.
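
A minimal sketch of options 1 and 3, assuming a small toy network (the layer names and shapes below are illustrative, not taken from the CIFAR-100 notebook):

import mxnet as mx

data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data=data, num_hidden=128, name='fc1')
act1 = mx.symbol.Activation(data=fc1, act_type='relu', name='relu1')

# Option 1: BlockGrad stops gradients from flowing back into fc1.
stop = mx.symbol.BlockGrad(data=act1, name='stop')
fc2_blocked = mx.symbol.FullyConnected(data=stop, num_hidden=100, name='fc2')
net_blocked = mx.symbol.SoftmaxOutput(data=fc2_blocked, name='softmax')

# Option 3: leave the graph unchanged and ask simple_bind not to
# allocate gradient storage for anything except the last layer.
fc2_plain = mx.symbol.FullyConnected(data=act1, num_hidden=100, name='fc2')
net_plain = mx.symbol.SoftmaxOutput(data=fc2_plain, name='softmax')

grad_req = {name: 'null' for name in net_plain.list_arguments()}
grad_req['fc2_weight'] = 'write'
grad_req['fc2_bias'] = 'write'
executor = net_plain.simple_bind(ctx=mx.cpu(), grad_req=grad_req,
                                 data=(32, 784))

Option 2 is the same idea expressed through symbol.bind: supply gradient arrays (args_grad) only for the last layer's parameters.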

BTW, strictly speaking, using BlockGrad cannot freeze the weights, since weight decay may still change them.
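
A quick numeric illustration of that point (plain Python arithmetic, not MXNet API): with standard SGD plus weight decay, the weight moves even when the incoming gradient is zero.

lr, wd = 0.1, 1e-4
weight, grad = 1.0, 0.0               # gradient blocked, so it arrives as zero
weight -= lr * (grad + wd * weight)   # standard SGD step with weight decay
print(weight)                         # 0.99999 -- the weight still changed

So a true freeze also needs weight decay disabled (or a zero weight-decay multiplier) for the frozen parameters.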

The updater is currently unaware of any layer-wise configuration:

def get_updater(optimizer):
    """Return a closure of the updater needed for kvstore
    Parameters
    ----------
    optimizer: Optimizer
         The optimizer
    Returns
    -------
    updater: function
         The closure of the updater
    """
    states = dict()
    def updater(index, grad, weight):
        """updater for kvstore"""
        # Lazily create per-parameter optimizer state, keyed by parameter index.
        if index not in states:
            states[index] = optimizer.create_state(index, weight)
        optimizer.update(index, weight, grad, states[index])
    return updater
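
A hypothetical workaround (not part of MXNet; get_freezing_updater and frozen_indices are made-up names) is to wrap that updater and skip the parameter indices you want frozen:

def get_freezing_updater(optimizer, frozen_indices):
    """Hypothetical wrapper: skip updates for frozen parameter indices."""
    base = get_updater(optimizer)
    frozen = set(frozen_indices)
    def updater(index, grad, weight):
        if index in frozen:
            return                    # leave frozen weights completely untouched
        base(index, grad, weight)
    return updater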

This is one of the most common use cases. Has no one done this before with MXNet?

Would you please restart the effort in https://github.com/dmlc/mxnet/issues/830 to make common tasks like this easy to accomplish?

A hacky way to block gradients is described in https://github.com/dmlc/mxnet/issues/1538#issuecomment-196691988.
