I am trying to freeze the conv layers of VGG16 and train only the last few fc layers.
I followed http://mxnet.io/how_to/finetune.html and https://github.com/dmlc/mxnet/issues/4616.
According to http://mxnet.io/api/python/module.html, when we provide arg_params to the Module fit function,
"the value here will be used to initialize the module parameters, unless they are already initialized by the user via a call to init_params or fit. arg_params has higher priority to initializer."
So it seems the parameters from the pre-trained model are only used as initial values rather than being frozen.
Am I correct?
How do I freeze the parameters of the layers that I don't intend to train?
Thanks a lot for your help!
In the Module constructor there is a fixed_param_names argument; weights whose names are in that list will be frozen.
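For example, a minimal sketch of freezing the VGG16 conv layers this way; the checkpoint prefix/epoch and the dummy iterator are placeholders for your own setup:

```python
import mxnet as mx
import numpy as np

# Dummy data/labels so the sketch runs end to end; swap in your real
# training iterator. Shapes and class count here are placeholders.
data = np.random.rand(8, 3, 224, 224).astype('float32')
label = np.random.randint(0, 1000, (8,))
train_iter = mx.io.NDArrayIter(data, label, batch_size=4)

# Load the pre-trained checkpoint; the 'vgg16' prefix and epoch 0 are
# assumptions -- point these at your downloaded model files.
sym, arg_params, aux_params = mx.model.load_checkpoint('vgg16', 0)

# Freeze every parameter whose name starts with 'conv'; only the fc
# layers stay trainable.
fixed = [name for name in arg_params if name.startswith('conv')]

mod = mx.mod.Module(symbol=sym, context=mx.cpu(),
                    fixed_param_names=fixed)

# The checkpoint's arg_params/aux_params serve as initial values; the
# names listed in fixed_param_names never receive gradient updates.
mod.fit(train_iter,
        arg_params=arg_params,
        aux_params=aux_params,
        allow_missing=True,
        num_epoch=1)
```

This matches the doc passage quoted above: arg_params only seeds the initial values, while fixed_param_names is what actually stops the optimizer from updating them.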
@piiswrong
http://mxnet.incubator.apache.org/api/python/module/module.html#mxnet.module.Module
In the Module constructor I provide the list of parameters to be frozen (via fixed_param_names) in the new network.
But what about the auxiliary variables copied from the old network to the new one? Will they also be frozen while the new network trains, e.g. the moving mean and moving variance of batch norm?
Thank you.
Aux variables are never updated by the optimizer; the operators decide what to store in them.
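So the optimizer never touches moving_mean/moving_var, but the BatchNorm operator itself refreshes them during training-mode forward passes. If you want the copied statistics to stay fixed too, BatchNorm's use_global_stats flag makes the operator use (and stop refreshing) the stored values. A minimal sketch; the layer name is made up:

```python
import mxnet as mx

# use_global_stats=True makes BatchNorm normalize with the stored
# moving_mean/moving_var instead of per-batch statistics, so these aux
# states are no longer updated during training-mode forward passes.
data = mx.sym.Variable('data')
net = mx.sym.BatchNorm(data=data, name='bn1', use_global_stats=True)
```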