I'm trying to output all outputs and gradients for this small computation graph. If I use a Group symbol I can get all the outputs (in c_exec.outputs) but then the backward pass wants the gradient for each of those nodes which defeats the point of having it calculate them. It seems to only calculate the gradient for arguments and not any of the intermediate outputs.
Can you tell me how to achieve this? If I call net.get_internals() it shows the outputs that I want but then I cannot bind to it to calculate them.
import mxnet as mx
import math

# Logistic unit: net = sigmoid(w0*x0 + w1*x1 + w2), built from scalar symbols.
w0 = mx.symbol.Variable('w0')
x0 = mx.symbol.Variable('x0')
w1 = mx.symbol.Variable('w1')
x1 = mx.symbol.Variable('x1')
w2 = mx.symbol.Variable('w2')
# math.e, not cmath.e: the exponent base is a plain real constant,
# so the complex-math module is not needed.
net = 1/(1 + (mx.symbol.pow(math.e, -1*(w0*x0 + w1*x1 + w2))))
args = {'w0': mx.nd.ones(1) * 2.0,
        'x0': mx.nd.ones(1) * -1.0,
        'w1': mx.nd.ones(1) * -3.0,
        'x1': mx.nd.ones(1) * -2.0,
        'w2': mx.nd.ones(1) * -3.0}
data_shape = (1,)
focus = net
c_exec = focus.simple_bind(ctx=mx.cpu(),
                           w0=data_shape,
                           x0=data_shape,
                           w1=data_shape,
                           x1=data_shape,
                           w2=data_shape,
                           grad_req='write')
c_exec.copy_params_from(arg_params=args)
c_exec.forward()
# Seed the backward pass with d(net)/d(net) = 1.
c_exec.backward(out_grads=mx.nd.ones(1)*1)
# BUG FIX: c_exec.arg_arrays and c_exec.grad_arrays are ordered by
# focus.list_arguments(), NOT by the (arbitrary) iteration order of the
# `args` dict -- zipping with `args` scrambled the name->value pairing
# (the original printout showed w0=-3.00 even though w0 was set to 2.0).
arg_names = focus.list_arguments()
args_dict = dict(zip(arg_names, [o.asnumpy()[0] for o in c_exec.arg_arrays]))
outputs_dict = dict(zip(focus.list_outputs(), [o.asnumpy()[0] for o in c_exec.outputs]))
grads_dict = dict(zip(arg_names, [o.asnumpy()[0] for o in c_exec.grad_arrays]))
# print(...) with a single argument and dict.items() work in both Py2 and Py3.
print("Args : " + str(["%s=%.2f" % (n, o) for n, o in args_dict.items()]))
print("Outputs: " + str(["%s=%.2f" % (n, o) for n, o in outputs_dict.items()]))
print("Grads : " + str(["%s=%.2f" % (n, o) for n, o in grads_dict.items()]))
Args : ['x0=2.00', 'x1=-1.00', 'w2=-3.00', 'w1=-2.00', 'w0=-3.00']
Outputs: ['_divscalar0_output=0.73']
Grads : ['x0=-0.20', 'x1=0.39', 'w2=-0.39', 'w1=-0.59', 'w0=0.20']
Add a BlockGrad layer on top of the internal nodes.
So I added a BlockGrad but now c_exec.grad_arrays just contains all 0 values. Do you have an example of how to use it?
import mxnet as mx
import math
import numpy as np

# Same logistic unit, but with BlockGrad wrapped around the pre-activation.
w0 = mx.symbol.Variable('w0')
x0 = mx.symbol.Variable('x0')
w1 = mx.symbol.Variable('w1')
x1 = mx.symbol.Variable('x1')
w2 = mx.symbol.Variable('w2')
# NOTE: BlockGrad on the MAIN path stops gradient flow at `a`, so backward()
# propagates nothing to w0/x0/w1/x1/w2 -- that is exactly why grad_arrays
# comes back all zeros here. BlockGrad is meant for the EXTRA outputs you
# add via get_internals()/Group (so they need no out_grad), not for a node
# the loss gradient must flow through.
a = mx.symbol.BlockGrad(-1*(w0*x0 + w1*x1 + w2))
# math.e, not cmath.e: the base is a plain real constant.
net = 1/(1 + (mx.symbol.pow(math.e, a)))
args = {'w0': mx.nd.ones(1) * 2.0,
        'x0': mx.nd.ones(1) * -1.0,
        'w1': mx.nd.ones(1) * -3.0,
        'x1': mx.nd.ones(1) * -2.0,
        'w2': mx.nd.ones(1) * -3.0}
data_shape = (1,)
focus = net
c_exec = focus.simple_bind(ctx=mx.cpu(),
                           w0=data_shape,
                           x0=data_shape,
                           w1=data_shape,
                           x1=data_shape,
                           w2=data_shape,
                           grad_req='write')
c_exec.copy_params_from(arg_params=args)
c_exec.forward()
# Seed the backward pass with d(net)/d(net) = 1.
c_exec.backward(out_grads=mx.nd.ones(1)*1)
# BUG FIX: arg_arrays / grad_arrays follow focus.list_arguments() order,
# not the `args` dict's iteration order; zipping with `args` mislabels
# every printed value.
arg_names = focus.list_arguments()
args_dict = dict(zip(arg_names, [o.asnumpy()[0] for o in c_exec.arg_arrays]))
outputs_dict = dict(zip(focus.list_outputs(), [o.asnumpy()[0] for o in c_exec.outputs]))
grads_dict = dict(zip(arg_names, [o.asnumpy()[0] for o in c_exec.grad_arrays]))
# print(...) with a single argument and dict.items() work in both Py2 and Py3.
print("Args : " + str(["%s=%.2f" % (n, o) for n, o in args_dict.items()]))
print("Outputs: " + str(["%s=%.2f" % (n, o) for n, o in outputs_dict.items()]))
print("Grads : " + str(["%s=%.2f" % (n, o) for n, o in grads_dict.items()]))
Args : ['x0=2.00', 'x1=-1.00', 'w2=-3.00', 'w1=-2.00', 'w0=-3.00']
Outputs: ['_divscalar2_output=0.73']
Grads : ['x0=0.00', 'x1=-0.00', 'w2=0.00', 'w1=0.00', 'w0=-0.00']
Add it on top of the internal nodes from get_internals()
Can you make an example? How do I get the internal nodes from get_internals()?
# Debug helper (quoted from mxnet example/autoencoder/solver.py): expose
# every internal node of `sym` as an extra output so forward() returns all
# intermediate values, while BlockGrad keeps backward() from requiring an
# out_grad for (or propagating gradients into) those extra outputs.
# NOTE(review): `debug`, `sym`, `args`, and `output_names` are defined by the
# surrounding solver code, not shown in this snippet -- presumably `args` is
# the dict of argument names and `output_names` the original symbol's
# outputs; verify against the caller.
if debug:
    sym = sym.get_internals()  # symbol whose outputs are ALL internal nodes
    blob_names = sym.list_outputs()
    sym_group = []
    for i in range(len(blob_names)):
        if blob_names[i] not in args:  # skip plain input/weight variables
            x = sym[i]
            if blob_names[i] not in output_names:
                # Wrap non-final outputs so backward() needs no gradient for them.
                x = mx.symbol.BlockGrad(x, name=blob_names[i])
            sym_group.append(x)
    sym = mx.symbol.Group(sym_group)  # one symbol carrying every output
Fine tuning models is such a common use case that it should become a first class API instead of a debug code snippet.
Most helpful comment
https://github.com/dmlc/mxnet/blob/223fd62bbcfd2ec03720533c37ddb01893f0be25/example/autoencoder/solver.py#L64
Fine tuning models is such a common use case that it should become a first class API instead of a debug code snippet.