Incubator-mxnet: Is there a simple way to make two similar networks share the same weights?

Created on 23 Aug 2016 · 3 comments · Source: apache/incubator-mxnet

Hi, I have some questions about sharing weights.

Here is my naive way to define a siamese network:

import mxnet as mx

data1 = mx.sym.Variable("data1")
data2 = mx.sym.Variable("data2")
# Create the weight/bias Variables once and pass them to both branches.
conv_weight = mx.sym.Variable("conv_weight")
conv_bias = mx.sym.Variable("conv_bias")
conv_1 = mx.symbol.Convolution(data=data1, weight=conv_weight, bias=conv_bias, kernel=(3, 3), pad=(1, 1), num_filter=64)
conv_2 = mx.symbol.Convolution(data=data2, weight=conv_weight, bias=conv_bias, kernel=(3, 3), pad=(1, 1), num_filter=64)

Another attempt, which failed

I also tried giving the layers explicit names instead, but this raises ValueError: Duplicate names detected, ['data1', 'conv_weight', 'conv_bias', 'data2', 'conv_weight', 'conv_bias']

import mxnet as mx

data1 = mx.sym.Variable("data1")
data2 = mx.sym.Variable("data2")
conv_1 = mx.symbol.Convolution(data=data1, kernel=(3, 3), pad=(1, 1), num_filter=64, name='conv')
conv_2 = mx.symbol.Convolution(data=data2, kernel=(3, 3), pad=(1, 1), num_filter=64, name='conv')
net = mx.sym.Concat(conv_1, conv_2)
exe = net.simple_bind(data1=(3, 3, 224, 224), data2=(3, 3, 224, 224), ctx=mx.gpu(0))

Simple Way to Define a Siamese Network?

To define a siamese network this way, I have to write out two identical networks and wire up the shared weight params by hand, which is tedious and error-prone, especially when the network is deep. Is there any simple way to make two similar networks share the same weights, something like:

def network():
    ...

net_1 = network()
net_2 = network()
make_two_networks_share_weights(net_1, net_2)
net = mx.sym.square(data=net_1 - net_2)
net = mx.sym.sum(data=net)

Any example code or hints would be appreciated, thanks.

All 3 comments

The first approach is the right way to do it. It can be made easy if you use factory functions to define conv layers. See example/image-classification/inceptionv3 for an example.
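A minimal sketch of what that factory-function pattern could look like when adapted for weight sharing; the helper names conv_factory and make_branch are illustrative only, not taken from the inceptionv3 example. The key idea is that each layer's weight/bias Variables are created once and reused by every branch:

import mxnet as mx

def conv_factory(data, name, num_filter, shared_params):
    # Create the weight/bias Variables once per layer name and reuse
    # them for every branch that asks for the same name.
    if name not in shared_params:
        shared_params[name] = (mx.sym.Variable(name + "_weight"),
                               mx.sym.Variable(name + "_bias"))
    weight, bias = shared_params[name]
    conv = mx.symbol.Convolution(data=data, weight=weight, bias=bias,
                                 kernel=(3, 3), pad=(1, 1), num_filter=num_filter)
    return mx.symbol.Activation(data=conv, act_type="relu")

def make_branch(data, shared_params):
    # Both branches go through the same code path with the same
    # shared_params dict, so they refer to the same weight symbols.
    net = conv_factory(data, "conv1", 64, shared_params)
    net = conv_factory(net, "conv2", 64, shared_params)
    return net

shared_params = {}
out1 = make_branch(mx.sym.Variable("data1"), shared_params)
out2 = make_branch(mx.sym.Variable("data2"), shared_params)
loss = mx.sym.sum(mx.sym.square(out1 - out2))

Because the Variables are shared symbols, binding this graph allocates a single copy of each weight, and gradients from both branches accumulate into it.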

@piiswrong
Thanks for your reply.
The first approach is fine when defining a network from scratch, but it's annoying when fine-tuning from existing models. For example, if I want to define a siamese network by loading vgg16, what I want is to load the predefined symbol and model params without changing the layer params, rather than redefine the whole network.
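For the fine-tuning case, one rough sketch (not an official recipe) is to build the shared-weight symbol as in the factory example, give the shared Variables the same names as the pretrained layers, and then copy the loaded params into the bound executor. The checkpoint prefix "vgg16" and epoch 0 below are placeholders, and siamese_sym stands for the shared-weight symbol built earlier:

import mxnet as mx

# Load the pretrained symbol and parameters (prefix/epoch are assumptions).
_, arg_params, aux_params = mx.model.load_checkpoint("vgg16", 0)

# siamese_sym is the shared-weight symbol built as in the factory sketch,
# with its Variables named to match the vgg16 argument names.
exe = siamese_sym.simple_bind(ctx=mx.gpu(0),
                              data1=(3, 3, 224, 224),
                              data2=(3, 3, 224, 224))

# Copy pretrained values into the matching arguments of the executor;
# only arguments with matching names and shapes are initialized.
for name, arr in exe.arg_dict.items():
    if name in arg_params and arg_params[name].shape == arr.shape:
        arg_params[name].copyto(arr)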

By the way, the API reference page in the documentation is broken:
http://mxnet.readthedocs.io/en/latest/packages/python/symbol.html#execution-api-reference
