PyTorch: about torch.nn.CrossEntropyLoss

Created on 14 Apr 2017 · 3 comments · Source: pytorch/pytorch

About torch.nn.CrossEntropyLoss:
I'm learning PyTorch and, as an exercise, porting the ANPR project
(https://github.com/matthewearl/deep-anpr,
http://matthewearl.github.io/2016/05/06/cnn-anpr/)
to the PyTorch platform.

There is a problem. I'm using nn.CrossEntropyLoss() as the loss function:

criterion = nn.CrossEntropyLoss()

The model's output.data is:
1.00000e-02 *
-2.5552 2.7582 2.5368 ... 5.6184 1.2288 -0.0076
-0.7033 1.3167 -1.0966 ... 4.7249 1.3217 1.8367
-0.7592 1.4777 1.8095 ... 0.8733 1.2417 1.1521
-0.1040 -0.7054 -3.4862 ... 4.7703 2.9595 1.4263
[torch.FloatTensor of size 4x253]

and targets.data is:
1 0 0 ... 0 0 0
1 0 0 ... 0 0 0
1 0 0 ... 0 0 0
1 0 0 ... 0 0 0
[torch.DoubleTensor of size 4x253]

When I call:

loss = criterion(output, targets)

this error occurs:
TypeError: FloatClassNLLCriterion_updateOutput received an invalid combination of arguments - got (int, torch.FloatTensor, torch.DoubleTensor, torch.FloatTensor, bool, NoneType, torch.FloatTensor), but expected (int state, torch.FloatTensor input, torch.LongTensor target, torch.FloatTensor output, bool sizeAverage, [torch.FloatTensor weights or None], torch.FloatTensor total_weight)

It expected a torch.LongTensor but got a torch.DoubleTensor. But if I convert the targets to a LongTensor:

import numpy
targets = torch.LongTensor(targets.data.numpy().astype(numpy.int64))

and then call loss = criterion(output, targets), the error becomes:
RuntimeError: multi-target not supported at /data/users/soumith/miniconda2/conda-bld/pytorch-0.1.10_1488752595704/work/torch/lib/THNN/generic/ClassNLLCriterion.c:20

My previous exercise was MNIST, an example from PyTorch, with a small modification: batch_size is 4, and the loss function is:

loss = F.nll_loss(outputs, labels)
outputs.data:
-2.3220 -2.1229 -2.3395 -2.3391 -2.5270 -2.3269 -2.1055 -2.2321 -2.4943 -2.2996
-2.3653 -2.2034 -2.4437 -2.2708 -2.5114 -2.3286 -2.1921 -2.1771 -2.3343 -2.2533
-2.2809 -2.2119 -2.3872 -2.2190 -2.4610 -2.2946 -2.2053 -2.3192 -2.3674 -2.3100
-2.3715 -2.1455 -2.4199 -2.4177 -2.4565 -2.2812 -2.2467 -2.1144 -2.3321 -2.3009
[torch.FloatTensor of size 4x10]

labels.data:
8
6
0
1
[torch.LongTensor of size 4]

The label for an input image must be a single element: in the example above there are 253 numbers per target row, while in MNIST there is only one number per sample, so the shape of outputs differs from the shape of labels.
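
(For reference, a minimal runnable sketch of this working MNIST-style call, with made-up data matching the shapes printed above; assumes a recent PyTorch:)

import torch
import torch.nn.functional as F

# log-probabilities for a batch of 4 samples over 10 classes,
# as produced by log_softmax in the MNIST example
outputs = F.log_softmax(torch.randn(4, 10), dim=1)

# one class index per sample: a 1-D LongTensor of size 4
labels = torch.tensor([8, 6, 0, 1])

loss = F.nll_loss(outputs, labels)
print(loss.item())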

I reviewed the TensorFlow manual for tf.nn.softmax_cross_entropy_with_logits:
'Logits and labels must have the same shape [batch_size, num_classes] and the same dtype (either float32 or float64).'

So, can I use PyTorch in this case, and if so, how? Many thanks.

All 3 comments

http://pytorch.org/docs/nn.html#crossentropyloss

CrossEntropyLoss shape:
Input: (N, C) where C = number of classes
Target: (N) where each value satisfies 0 <= targets[i] <= C-1

Do I have any other choice?
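
(For illustration, a minimal sketch of the shapes CrossEntropyLoss expects, using random data and a recent PyTorch; the 253 here just mirrors the example above:)

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# raw logits of shape (N, C) = (4, 253); no softmax needed,
# since CrossEntropyLoss applies log_softmax internally
output = torch.randn(4, 253)

# class indices in [0, 252], shape (N,) = (4,)
targets = torch.randint(0, 253, (4,))

loss = criterion(output, targets)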

I wouldn't say that TF docs are the best place to learn about the PyTorch API 🙂 We're not trying to be compatible with TF, and our CrossEntropyLoss accepts a vector of class indices (this allows it to run much faster than if it used one-hot vectors). It should be straightforward to convert between both representations if you really need to.

Note that we're using GitHub issues for bug reports only. If you have any questions, please ask them on our forums.
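
(For completeness, one way to do that conversion — a minimal sketch assuming each row of the 4x253 targets matrix is a one-hot vector, and a recent PyTorch:)

import torch

# one-hot targets of shape (4, 253): exactly one 1 per row
one_hot = torch.zeros(4, 253)
one_hot[:, 0] = 1  # every sample in class 0, as in the printout above

# argmax over the class dimension gives the LongTensor of shape (4,)
# that CrossEntropyLoss expects as its target
targets = one_hot.argmax(dim=1)

# and back again, if a one-hot form is ever needed
# (torch.nn.functional.one_hot is available in newer PyTorch versions)
one_hot_again = torch.nn.functional.one_hot(targets, num_classes=253)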

Thanks, converting the one-hot class encoding matrix to an integer vector fixed the CrossEntropyLoss problem for me!
