Incubator-mxnet: How do I make a siamese network with pretrained models (esp. keeping the weights the same?)

Created on 8 Nov 2017 · 10 comments · Source: apache/incubator-mxnet

Description

How do I ensure the weights are kept the same? Can I unpack the internal layers somehow and set the weights of each to the same variable? My apologies, I'm new to MXNet. Would really appreciate the help, thanks!

sym1, arg_params1, aux_params1 = mx.model.load_checkpoint('resnet-152', 0)
sym2, arg_params2, aux_params2 = mx.model.load_checkpoint('resnet-152', 0)
layer1 = sym1.get_internals()
layer2 = sym2.get_internals()
for i in range(len(layer1)): # will something like this work?
    arg_params1[i] = arg_params2[i]
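
(I realize arg_params is a dict keyed by parameter name rather than a list, so a by-key copy would be more like the snippet below, but I suspect that only makes the starting values identical and doesn't keep the weights tied during training:)

import mxnet as mx

sym1, arg_params1, aux_params1 = mx.model.load_checkpoint('resnet-152', 0)
sym2, arg_params2, aux_params2 = mx.model.load_checkpoint('resnet-152', 0)
for name in arg_params2:  # arg_params maps parameter name -> NDArray
    arg_params1[name] = arg_params2[name]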

Relevant answers, but not specific enough to my particular problem:
https://github.com/apache/incubator-mxnet/issues/772 siamese networks
https://github.com/apache/incubator-mxnet/issues/6791 extract layers as variables
https://github.com/apache/incubator-mxnet/issues/557 set weights to be same

needs triage

All 10 comments

I had a similar problem and managed to find several solutions: https://github.com/apache/incubator-mxnet/issues/7530
With the Gluon API it's easy and straightforward; with the Module API it's something else :(
I put my tests here: https://github.com/edmBernard/mxnet_example_shared_weight
The readme describes whether each approach works or not.
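
For example, with Gluon you can build one block from another block's parameters, so the two branches are tied by construction. A minimal sketch (the layer sizes are just placeholders):

    from mxnet import nd
    from mxnet.gluon import nn

    # branch2 is built from branch1's ParameterDict, so both layers
    # always use (and update) exactly the same weight and bias
    branch1 = nn.Dense(64, in_units=128)
    branch2 = nn.Dense(64, in_units=128, params=branch1.params)
    branch1.initialize()   # initializes the shared parameters once

    x1 = nd.random.uniform(shape=(1, 128))
    x2 = nd.random.uniform(shape=(1, 128))
    out = nd.concat(branch1(x1), branch2(x2), dim=1)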

Wow, I didn't know that API existed. I had a lot of trouble trying to make it work with the module API but the Gluon API looks super promising, thanks for sharing :)

However, though I'll definitely test Gluon out, do you know how I would do this with the Module API?

Can I extract each layer's functionality somehow and set its weights to the same variable as the identical layer in the other network? If it's too big a hassle, I guess I would use Gluon, though all the other code I have uses the Module API.

If you have exactly the same network twice, it might be possible to use shared_module in the bind function; it's used in RNNs to duplicate a network. I was not able to use it myself, as my two networks were not exactly the same. here
In my opinion, it will be easier to switch to Gluon, and you can be sure it will work.
Moreover, in Gluon you can use a network defined with the symbol API. here (I haven't tested it)
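
A rough sketch of the shared_module idea with the Module API (untested, as I said), using the flatten0_output sub-graph from your first post so the classification head and its label are out of the way:

    import mxnet as mx

    sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
    feat = sym.get_internals()['flatten0_output']   # drop the classification head

    # keep only the parameters the sub-graph actually uses
    feat_args = {k: v for k, v in arg_params.items() if k in feat.list_arguments()}

    mod1 = mx.mod.Module(symbol=feat, context=mx.cpu(), label_names=None)
    mod1.bind(for_training=True, data_shapes=[('data', (1, 3, 224, 224))])
    mod1.set_params(feat_args, aux_params)

    mod2 = mx.mod.Module(symbol=feat, context=mx.cpu(), label_names=None)
    # shared_module should make mod2 reuse mod1's parameter arrays,
    # so no separate set_params call is needed here
    mod2.bind(for_training=True, data_shapes=[('data', (1, 3, 224, 224))],
              shared_module=mod1)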

Hey again,

I tried something like this, but I still have a lot of questions:

    sym1, arg_params, aux_params = get_model()
    sym2, arg_params, aux_params = get_model()

    mod1 = mx.mod.Module(symbol=sym1, context=mx.cpu(), label_names=None)
    mod2 = mx.mod.Module(symbol=sym2, context=mx.cpu(), label_names=None)
    mod1.bind(for_training=True, shared_module=mod2, data_shapes=[('data', (1,3,224,224))], # true to train
             label_shapes=mod1._label_shapes)
    mod2.bind(for_training=True, shared_module=mod1, data_shapes=[('data', (1,3,224,224))], # true to train
             label_shapes=mod2._label_shapes)
    mod1.set_params(arg_params, aux_params, allow_missing=True)
    mod2.set_params(arg_params, aux_params, allow_missing=True)

    out1 = sym1.get_internals()['flatten0_output']
    out2 = sym2.get_internals()['flatten0_output']
    siamese_out = mx.sym.Concat(out1, out2, dim=0)

    # Example stacked network after it
    fc1  = mx.symbol.FullyConnected(data = siamese_out, name='fc1', num_hidden=128)
    act1 = mx.symbol.Activation(data = fc1, name='relu1', act_type="relu")
    fc2  = mx.symbol.FullyConnected(data = act1, name = 'fc2', num_hidden = 64)
    act2 = mx.symbol.Activation(data = fc2, name='relu2', act_type="relu")
    fc3  = mx.symbol.FullyConnected(data = act2, name='fc3', num_hidden=num_classes)
    mlp  = mx.symbol.SoftmaxOutput(data = fc3, name = 'softmax')
    # new_args = dict()

    mod3 = mx.mod.Module(fc1, context=mx.cpu(), label_names=None)
    mod3.bind(for_training=False, data_shapes=[('data', (1,3,224,224))])
    mod3.set_params(arg_params, aux_params)

I only want the first part of this network (the layers attached to mod1 and mod2) to be shared. Would something like this work and still backpropagate errors appropriately when fitted?

Having to run mod.fit on each part of the network could be inconvenient. Is there a way around this?

I haven't tested shared_module in something similar to your application. (Are you sure you don't want to use Gluon?) :)

I haven't tested your code, but here are some corrections:

# you don't need `shared_module=mod2` here
mod1.bind(for_training=True, data_shapes=[('data', (1,3,224,224))], label_shapes=mod1._label_shapes)

If you want to train everything as one network, you need to define a new data iterator that can pass two different images into your network.
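
Something like this, assuming your merged symbol has two data inputs (the names 'data1', 'data2' and 'softmax_label' below are placeholders and must match the variable names in your symbol):

    import mxnet as mx
    import numpy as np

    imgs1 = np.zeros((100, 3, 224, 224), dtype='float32')   # placeholder data
    imgs2 = np.zeros((100, 3, 224, 224), dtype='float32')
    labels = np.zeros((100,), dtype='float32')

    train_iter = mx.io.NDArrayIter(
        data={'data1': imgs1, 'data2': imgs2},   # one image per branch
        label={'softmax_label': labels},
        batch_size=8, shuffle=True)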

Maybe it's easier to try this example of a triplet-loss network (I haven't tested whether it works)

Here is an example using Gluon

Wow. Thank you so much. Alright, this gives me a lot to think about.
I'm really grateful for your help, thanks a ton.

If you want to share weights across the network, why not just use one copy of the network and run it twice with the inputs?

final_net(nd.concat(shared_net(x1), shared_net(x2)))

Also, I definitely recommend using Gluon instead of pure MXNet
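
Something along these lines in Gluon (a rough sketch; the layers are placeholders, not your actual networks):

    from mxnet import nd
    from mxnet.gluon import nn

    shared_net = nn.HybridSequential()
    shared_net.add(nn.Conv2D(16, kernel_size=3, activation='relu'),
                   nn.GlobalAvgPool2D())

    final_net = nn.HybridSequential()
    final_net.add(nn.Dense(64, activation='relu'),
                  nn.Dense(2))

    shared_net.initialize()
    final_net.initialize()

    x1 = nd.random.uniform(shape=(1, 3, 224, 224))
    x2 = nd.random.uniform(shape=(1, 3, 224, 224))

    # the same shared_net instance handles both inputs, so both branches
    # use exactly the same weights and gradients from both flow into them
    out = final_net(nd.concat(shared_net(x1), shared_net(x2), dim=1))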

@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage.

For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.

@tz-hmc, Hope your question has been answered.
For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.
