Incubator-mxnet: MakeLoss should accept optional output symbol

Created on 20 Jan 2017 · 4 comments · Source: apache/incubator-mxnet

I would like to suggest that the MakeLoss symbol constructor takes an optional output symbol. When it is provided, the forward output of MakeLoss would be the same as the provided output symbol, while the backward operation would be using the gradient of the input loss.

The value of this approach is that it means you can more easily retrieve the prediction values when using an executor bound to a custom loss without a second executor for the earlier prediction symbol. In particular, Module's predict method would then correctly output the prediction, not the error, when built around a custom MakeLoss loss symbol.

Example usage for a "custom" squared error loss on linear regression:

import mxnet as mx

x = mx.sym.Variable('data')
y = mx.sym.FullyConnected(data=x, num_hidden=1)
label = mx.sym.Variable('label')
# 'output' is the proposed argument: forward would return y, backward would use the loss gradient
loss = mx.sym.MakeLoss(mx.sym.square(y - label), output=y)

ex = loss.simple_bind(mx.cpu(), data=(32, 2))
# assign inputs and label here...
prediction = ex.forward(is_train=True)[0]  # returns predicted values instead of the loss!
ex.backward()
# now do the training update step using the gradients in ex, e.g. with an optimizer...


All 4 comments

This is a good idea, although I think it should output both the loss and the prediction (as the second output). There should also be a flag that controls this, off by default.

Alternatively, you can group the prediction (with a BlockGrad on it) with the loss, and it will do the same thing.
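A hypothetical sketch of what the flag-controlled variant might look like (the output_pred name and its behavior are invented here purely for illustration; they are not part of the existing MakeLoss API):

import mxnet as mx

x = mx.sym.Variable('data')
y = mx.sym.FullyConnected(data=x, num_hidden=1)
label = mx.sym.Variable('label')
# hypothetical flag: when True, MakeLoss would also expose the prediction as a second output
loss = mx.sym.MakeLoss(mx.sym.square(y - label), output_pred=True)

ex = loss.simple_bind(mx.cpu(), data=(32, 2))
err, pred = ex.forward(is_train=True)  # loss first, prediction second, per the suggestion above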

Having the error come through as an additional output would be fine with me!

Regarding your workaround, a Group alone doesn't work of course, but I had not thought to BlockGrad the output in the group. (I didn't realize BlockGrad has no top gradient; I thought it merely zeroed any gradient coming in.) Does that mean it would look like the following?

x = mx.sym.Variable('data')
y = mx.sym.FullyConnected(data=x, num_hidden=1)
label = mx.sym.Variable('label')
loss = mx.sym.MakeLoss(mx.sym.square(y - label))
pred_loss = mx.sym.Group([mx.sym.BlockGrad(y), loss])

ex = pred_loss.simple_bind(mx.cpu(), data=(32, 2))
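With that grouping, the executor's forward outputs come back in group order: the prediction (through BlockGrad) first, then the loss. A minimal sketch of using it, assuming the data and label arrays have already been assigned on the executor:

outputs = ex.forward(is_train=True)
pred, err = outputs[0], outputs[1]  # prediction from BlockGrad(y), then the loss value
ex.backward()                       # gradients flow only through the MakeLoss branch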

Not a terrible workaround for the time being. (Testing in a Python shell, it at least runs when you call backward!) Thanks!

Yes, that works.

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!
