Say I have a pretrained model in MXNet format. Now I want to change the architecture of some specific layers and fine-tune the new net. I want to ask:
How does MXNet load the old symbol and weights, and how does it decide which layers reuse the old structure and weights and which layers get a new structure with weights initialized from scratch? Is it decided by the layers' names, as in Caffe?
Take AlexNet as an example: say I want to change the first fully connected layer to a convolution and fine-tune it.
I think it goes something like this:

```python
loaded = mx.model.FeedForward.load(prefix, epoch)
# modify loaded.symbol (how?)
finetune_model = mx.model.FeedForward(ctx=ctx, symbol=new_symbol,
                                      arg_params=new_arg_params,
                                      aux_params=new_aux_params)
# then fix (freeze) the reused layers (how?)
```

I would also like to know this. An example as simple as reading in a pretrained Inception network, changing the number of classes from 1000 to 200, and fine-tuning that final layer would be extremely helpful. I was hoping that using train_imagenet.py with my own data.rec and setting num_classes would work, but this leads to:
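For the Inception example, a common recipe is to cut the loaded symbol just before the old classifier with `get_internals()`, attach a new `FullyConnected` layer with `num_hidden=200` under a *new* name, and pass the old parameters through; anything that doesn't match the new symbol by name is simply re-initialized. A sketch against the old `FeedForward` API (the internal output name `'flatten_output'`, the layer name `'fc_finetune'`, and the `prefix`/`epoch` values are assumptions about your checkpoint, not guaranteed names):

```python
def replace_head(prefix, epoch, num_classes=200):
    """Load a checkpoint and swap in a fresh classifier (sketch).

    'flatten_output' is an assumed internal name; inspect
    loaded.symbol.get_internals().list_outputs() to find the real one.
    """
    import mxnet as mx  # lazy import so the sketch is readable without mxnet

    loaded = mx.model.FeedForward.load(prefix, epoch)
    internals = loaded.symbol.get_internals()
    feat = internals['flatten_output']          # everything below the old fc
    # New name ('fc_finetune') is absent from arg_params -> default init.
    fc = mx.symbol.FullyConnected(data=feat, num_hidden=num_classes,
                                  name='fc_finetune')
    new_sym = mx.symbol.SoftmaxOutput(data=fc, name='softmax')
    return mx.model.FeedForward(symbol=new_sym, ctx=mx.cpu(),
                                arg_params=loaded.arg_params,
                                aux_params=loaded.aux_params,
                                # old 1000-way fc weights are simply dropped
                                allow_extra_params=True,
                                num_epoch=10, learning_rate=0.001)
```

You would then call `.fit(...)` on the returned model with your own data iterator as usual.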
```
mxnet.base.MXNetError: [13:51:13] src/ndarray/ndarray.cc:159: Check failed: from.shape() == to->shape() operands shape mismatch
```
I also tried setting num_hidden in the symbol to 200, but still got the same error.
Changing the loaded symbol is doable, but it's a hassle.
Instead, I would create a new symbol and load the parameters with the mx.init.Load initializer.
Doesn't this also cause a mismatch, since the params file has 1008 outputs in the final layer and my desired symbol has 200? Or am I supposed to load them separately and use get_internals() to somehow attach a new layer and remove the 1008-way layer?
If a layer has different dimensions, give it a different name; it will then be initialized by the default initializer instead of loaded.
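That name-matching rule can be illustrated without MXNet at all. This is a pure-Python sketch of the decision (not the real mx.init.Load code), using plain shape tuples in place of NDArrays: a name miss means default init, while a name hit with a different shape is exactly the "operands shape mismatch" error from the traceback above, which is why renaming the resized layer fixes it.

```python
def split_params(pretrained, new_arg_shapes):
    """Decide which parameters get loaded and which get default init.

    pretrained:     dict of name -> shape tuple (stands in for saved arrays)
    new_arg_shapes: dict of name -> shape required by the new symbol
    """
    loaded, fresh = {}, []
    for name, shape in new_arg_shapes.items():
        if name not in pretrained:
            fresh.append(name)            # unknown name -> default initializer
        elif pretrained[name] != shape:
            # same name, different shape -> the shape-mismatch error above
            raise ValueError("operands shape mismatch for %s" % name)
        else:
            loaded[name] = pretrained[name]   # reuse the old weights
    return loaded, fresh

# Old net: shared conv plus a 1008-way classifier.
old = {'conv1_weight': (64, 3, 7, 7), 'fc1_weight': (1008, 1024)}
# New net: same conv, classifier renamed and resized to 200 classes.
new = {'conv1_weight': (64, 3, 7, 7), 'fc_finetune_weight': (200, 1024)}

loaded, fresh = split_params(old, new)
# conv1_weight is reused; fc_finetune_weight is initialized from scratch.
```

Keeping the old name 'fc1_weight' with the new (200, 1024) shape would raise instead, mirroring the error in the thread.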
That's what I was missing. Thanks a lot for clearing that up for me!
If the new model has fewer layers than the old one, does it matter?