Hello,
I am designing a network that, in the end, uses a weighted sum of several softmax outputs. This final sum should act as the loss layer, instead of a single softmax output. I wonder whether I can do this simply by multiplying N softmax layers by N weights and accumulating them into a final summation operator, for example in a loop, and then using that summation operator as the loss layer. Something like this in pseudocode:
summation = weight[0] * softmax[0]
for i in range(1, N):
    summation = summation + weight[i] * softmax[i]
In the end, "summation" should be the final loss term. I doubt whether this will work, since it seems that the SoftmaxOutput operator does not receive gradients from operators that come after it; it is designed to be the final layer. Is this right? What would be the best way to implement this with existing MXNet symbolic operators, without having to write my own operator?
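For concreteness, here is a rough sketch of what I have in mind, written against the symbolic API. I am assuming operators like mx.sym.softmax, mx.sym.pick, mx.sym.add_n and mx.sym.MakeLoss behave the way I expect, and all names and sizes below are made up:

import mxnet as mx

N = 3                      # number of softmax branches (made up)
weights = [0.5, 0.3, 0.2]  # loss weights (made up)

data = mx.sym.Variable('data')
label = mx.sym.Variable('label')

loss_terms = []
for i in range(N):
    # one classification head per branch; the hidden size is a placeholder
    fc = mx.sym.FullyConnected(data=data, num_hidden=10, name='fc%d' % i)
    prob = mx.sym.softmax(fc, axis=1)
    # negative log-likelihood of the true class for this branch
    nll = -mx.sym.log(mx.sym.pick(prob, label, axis=1) + 1e-8)
    loss_terms.append(weights[i] * mx.sym.mean(nll))

# weighted sum of the branch losses, wrapped so it can act as the loss symbol
total_loss = mx.sym.MakeLoss(mx.sym.add_n(*loss_terms))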
Thanks in advance.
Use SoftmaxActivation.
I am getting a "Softmax Activation for internal layers is only supported on GPU with cuDNN. Use SoftmaxOutput for loss layer." error when I try to use it. I called bind on the network with a GPU context, but that did not help either.
I'm also stuck here. @piiswrong: Can you please provide a code sample for the weighted loss? Thanks.
You can achieve the same thing by grouping several SoftmaxOutput layers and using grad_scale to weight the losses.
softmax1 = mx.sym.SoftmaxOutput(..., grad_scale=0.67)
softmax2 = mx.sym.SoftmaxOutput(..., grad_scale=0.33)
net = mx.sym.Group([softmax1, softmax2])
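If it helps, here is a minimal self-contained sketch of that approach; the layer names, sizes and label variable names are placeholders, and I am writing against the current mx.sym namespace:

import mxnet as mx

data = mx.sym.Variable('data')
label1 = mx.sym.Variable('softmax1_label')
label2 = mx.sym.Variable('softmax2_label')

# shared body (placeholder size)
body = mx.sym.FullyConnected(data=data, num_hidden=128, name='body')
body = mx.sym.Activation(body, act_type='relu')

# two classification heads; grad_scale scales each head's gradient,
# which has the same effect as weighting its loss term
fc1 = mx.sym.FullyConnected(body, num_hidden=10, name='fc1')
softmax1 = mx.sym.SoftmaxOutput(data=fc1, label=label1, grad_scale=0.67, name='softmax1')

fc2 = mx.sym.FullyConnected(body, num_hidden=5, name='fc2')
softmax2 = mx.sym.SoftmaxOutput(data=fc2, label=label2, grad_scale=0.33, name='softmax2')

net = mx.sym.Group([softmax1, softmax2])

When training something like this with the Module API, you would typically pass both label names, e.g. mx.mod.Module(net, label_names=['softmax1_label', 'softmax2_label']).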
Yes, you can; the weight assignment can be done in the definition of the previous layers.