I could not get inception_v3 to work on my own dataset. (Ubuntu 14.04, CUDA 8.0, Python 3.6.2)
It prints a warning when the model is loaded:
/home/ljy/anaconda3/lib/python3.6/site-packages/torchvision-0.1.9-py3.6.egg/torchvision/models/inception.py:65: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
It fails during training:
Traceback (most recent call last):
File "/home/ljy/pytorch-examples-master/cub_pytorch/main.py", line 382, in <module>
main()
File "/home/ljy/pytorch-examples-master/cub_pytorch/main.py", line 213, in main
train(train_loader, model, criterion, optimizer, epoch)
File "/home/ljy/pytorch-examples-master/cub_pytorch/main.py", line 251, in train
loss = criterion(output, target_var)
File "/home/ljy/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/ljy/anaconda3/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 482, in forward
self.ignore_index)
File "/home/ljy/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 746, in cross_entropy
return nll_loss(log_softmax(input), target, weight, size_average, ignore_index)
File "/home/ljy/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 537, in log_softmax
return _functions.thnn.LogSoftmax.apply(input)
File "/home/ljy/anaconda3/lib/python3.6/site-packages/torch/nn/_functions/thnn/auto.py", line 126, in forward
ctx._backend = type2backend[type(input)]
File "/home/ljy/anaconda3/lib/python3.6/site-packages/torch/_thnn/__init__.py", line 15, in __getitem__
return self.backends[name].load()
KeyError: <class 'tuple'>
Hi @MichaelLiang12,
Which PyTorch version are you using (see torch.__version__)? Also, can you provide a minimal working example that reproduces this?
Thanks
Also, the user warning you get when loading the model is fixed in master (via #231).
Same issue:
(tensorflow) wcai@tdtd-desktop ~/tensorflow/AI_competition/pytorch $ python main.py -a inception_v3 . --pretrained
=> using pre-trained model 'inception_v3'
/home/wcai/tensorflow/lib/python3.5/site-packages/torchvision/models/inception.py:65: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
m.weight.data.copy_(values)
Traceback (most recent call last):
File "main.py", line 353, in
main()
File "main.py", line 176, in main
train(train_loader, model, criterion, optimizer, epoch)
File "main.py", line 214, in train
loss = criterion(output, target_var)
File "/home/wcai/tensorflow/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/wcai/tensorflow/lib/python3.5/site-packages/torch/nn/modules/loss.py", line 482, in forward
self.ignore_index)
File "/home/wcai/tensorflow/lib/python3.5/site-packages/torch/nn/functional.py", line 746, in cross_entropy
return nll_loss(log_softmax(input), target, weight, size_average, ignore_index)
File "/home/wcai/tensorflow/lib/python3.5/site-packages/torch/nn/functional.py", line 537, in log_softmax
return _functions.thnn.LogSoftmax.apply(input)
File "/home/wcai/tensorflow/lib/python3.5/site-packages/torch/nn/_functions/thnn/auto.py", line 126, in forward
ctx._backend = type2backend[type(input)]
File "/home/wcai/tensorflow/lib/python3.5/site-packages/torch/_thnn/__init__.py", line 15, in __getitem__
return self.backends[name].load()
KeyError:
Python: Python 3.5.2
print (torch.__version__)
0.2.0_3
Hi @jamiechoi1995,
Can you provide a minimal working example of this failing (i.e. an input that triggers the error when you pass it to the model)?
From the stack trace, it looks like the input to the loss is a tuple instead of a Variable.
Hi @alykhantejani
You can reproduce this problem with the code in https://github.com/pytorch/examples/tree/master/imagenet
I changed the rescale and crop sizes to 299 for inception v3,
and my training and validation data are jpg files with corresponding json files.
The same code with a size of 224 and a resnet model works fine,
but when I switch to inception v3, I get this problem.
Thanks.
Isn't this caused by the aux branch in the network? If you remove it, it should work :)
@jamiechoi1995 @MichaelLiang12, @TiRune is correct: inception_v3 has an aux branch, and unless it is disabled the forward function returns a tuple (see here), which throws this error when passed to the criterion.
So you have two choices:
1. Disable `aux_logits` when the model is created [here](https://github.com/pytorch/examples/blob/master/imagenet/main.py#L75) by also passing `aux_logits=False` to the `inception_v3` function.
2. Edit your `train` function to accept and unpack the returned tuple [here](https://github.com/pytorch/examples/blob/master/imagenet/main.py#L194), with something like:
output, aux = model(input_var)
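To make the two options concrete, here is a minimal sketch that uses a stand-in class rather than the real torchvision model, so it runs without PyTorch. `ToyInception` and its string return values are hypothetical; the only behaviour it mimics is the one described above, namely that a tuple comes back when the aux branch is enabled and the model is in training mode.

```python
class ToyInception:
    """Hypothetical stand-in for inception_v3; not the torchvision class."""

    def __init__(self, aux_logits=True):
        self.aux_logits = aux_logits
        self.training = True  # assumption: start in training mode, like nn.Module

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, x):
        logits = "logits({})".format(x)  # placeholder for the main output
        if self.training and self.aux_logits:
            aux = "aux({})".format(x)    # placeholder for the aux output
            return logits, aux           # a tuple, which a criterion cannot consume
        return logits


# Option 1: build the model without the aux branch; a single value comes back.
model = ToyInception(aux_logits=False)
output = model("batch")

# Option 2: keep the aux branch and unpack the tuple in the training loop.
model = ToyInception(aux_logits=True)
output, aux = model("batch")
```

With option 2, the unpacking only applies in training mode; after `model.eval()` the stand-in returns a single value again, so the validation loop needs no unpacking.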
@alykhantejani: Hi, why do we have to disable aux_logits? What are these aux_logits? Do they affect training/validation?
I'm trying to reproduce the accuracy of a model trained with bvlc_googlenet (without pretrained weights). When I turn the aux branch off in PyTorch (googlenet), training works but reports a val_acc of 50%, which is very low compared to Caffe. Are there other ways to reproduce the same accuracy in PyTorch?
Thanks.
@rajasekharponakala aux_logits is a separate classifier that is added to help during training; it is not used during inference.
> I'm trying to reproduce the accuracy of a model trained with bvlc_googlenet (without pretrained weights). When I turn the aux branch off in PyTorch (googlenet), training works but reports a val_acc of 50%, which is very low compared to Caffe. Are there other ways to reproduce the same accuracy in PyTorch?
Both googlenet and inception_v3 use pre-trained weights from TensorFlow, and as far as I know we didn't manage to reproduce accuracies from the paper when training from scratch.
Hi @fmassa, thanks. Following a suggestion on the PyTorch Discourse forum, I added the lines below to train() in the imagenet example:
output, aux = model(input_var)
loss1 = criterion(output, target)
loss2 = criterion(aux, target)
loss = loss1 + 0.4*loss2
but it ended with this error:
Traceback (most recent call last):
File "imagenet.py", line 407, in <module>
main()
File "imagenet.py", line 114, in main
main_worker(args.gpu, ngpus_per_node, args)
File "imagenet.py", line 240, in main_worker
train(train_loader, model, criterion, optimizer, epoch, args)
File "imagenet.py", line 281, in train
output, aux = model(input)
ValueError: too many values to unpack (expected 2)
Any idea?
You need to set your model to train() mode; it's probably in eval mode.
Thanks. Yes, I'm following the example/imagenet/main.py script:
def main():
    ...
def main_worker():
    ...
def train():
    ...
    model.train()
    ...
    outputs, aux_outputs = model(inputs)
    loss1 = criterion(outputs, target)
    loss2 = criterion(aux_outputs, target)
    loss = loss1 + 0.4*loss2
def validate():
    ...
    model.eval()
    ...
    outputs = model(inputs)
    loss = criterion(outputs, target)
    ...
def adjust_learning_rate():
    ...
def accuracy():
    ...
I found another approach on the Discourse forum:
output = model(input)
loss = None
# for nets that have multiple outputs such as inception
if isinstance(output, tuple):
    loss = sum((criterion(o, target) for o in output))
else:
    loss = criterion(output, target)
This time it throws a different error:
Traceback (most recent call last):
File "imagenet.py", line 417, in <module>
main()
File "imagenet.py", line 114, in main
main_worker(args.gpu, ngpus_per_node, args)
File "imagenet.py", line 240, in main_worker
train(train_loader, model, criterion, optimizer, epoch, args)
File "imagenet.py", line 298, in train
acc1, acc5 = accuracy(output, target, topk=(1, 5))
File "imagenet.py", line 405, in accuracy
_, pred = output.topk(maxk, 1, True, True)
AttributeError: 'tuple' object has no attribute 'topk'
The issue is that both googlenet and inception can return the outputs of their auxiliary classifiers in training mode.
Your code is not taking that into account, or you didn't enable the aux classifiers. Double-check that and you'll be able to find the issue.
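One way to take the auxiliary outputs into account is to split the model output once and use the pieces separately: every output feeds the loss, and only the main logits feed `accuracy`. The helper below is a sketch, not code from the example script. The names `loss_and_logits`, `main_index` and `aux_weight` are made up here, and the toy criterion at the bottom exists only so that the snippet runs without PyTorch.

```python
def loss_and_logits(output, target, criterion, aux_weight=0.4, main_index=0):
    """Return (total_loss, main_logits) for single- or multi-output models.

    Assumption: when `output` is a tuple, the entry at `main_index` is the
    main classifier and every other entry is an auxiliary classifier.
    """
    if not isinstance(output, tuple):
        # Eval mode, or aux branch disabled: there is nothing to combine.
        return criterion(output, target), output

    main_index = main_index % len(output)  # allow negative indices such as -1
    main = output[main_index]
    loss = criterion(main, target)
    for i, aux in enumerate(output):
        if i != main_index:
            loss = loss + aux_weight * criterion(aux, target)
    return loss, main


# Toy stand-in for a real loss function, so the sketch runs on its own.
def toy_criterion(prediction, target):
    return abs(prediction - target)


loss, logits = loss_and_logits((2.0, 4.0), 1.0, toy_criterion)
```

In the training loop this would read `loss, output = loss_and_logits(model(input), target, criterion)`, followed by `accuracy(output, target, topk=(1, 5))`, so `accuracy` never sees a tuple. Because a namedtuple is also a tuple, the same check covers models that return named outputs.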
Yes. main_worker() is set to:
if args.pretrained:
    print("=> using pre-trained model '{}'".format(args.arch))
    model = models.__dict__[args.arch](pretrained=True)
else:
    print("=> creating model '{}'".format(args.arch))
    model = models.__dict__[args.arch](aux_logits=True)
and vision/models/googlenet.py also has:
class GoogLeNet(nn.Module):
    def __init__(self, num_classes=1000, aux_logits=True, transform_input=False, init_weights=True):
        super(GoogLeNet, self).__init__()
        self.aux_logits = aux_logits
        self.transform_input = transform_input
        .....
    def forward()  # uses self.aux_logits
@rajasekharponakala one thing to note here is that GoogLeNet has two aux branches, whereas inception v3 has only one.
So for GoogLeNet you have to use:
aux1, aux2, output = model(inputs)
@TheCodez: Thanks, it's working now!
Format:
aux1, aux2, output = model(inputs)
loss1 = criterion(output, target)
loss2 = criterion(aux1, target)
loss3 = criterion(aux2, target)
loss = loss1 + 0.4*(loss2+loss3)
@rajasekharponakala the correct weighting scheme for GoogLeNet is using 0.3:
aux1, aux2, output = model(inputs)
loss1 = criterion(output, target)
loss2 = criterion(aux1, target)
loss3 = criterion(aux2, target)
loss = loss1 + 0.3 * (loss2 + loss3)
Yeah, thanks.
@TheCodez @fmassa @alykhantejani @rajasekharponakala Do we have to enable the auxiliary classifiers in test mode? I get very poor test accuracy when I load the trained model (the auxiliary classifiers are set here). I'm using the inception v3 model for my task!
@tejasri19 for inference, don't forget to set your model to eval() mode.
You don't need to use the aux classifiers for inference, only for training
Hi, I have a question. In https://github.com/pytorch/vision/blob/master/torchvision/models/googlenet.py
the code is
    if self.training and self.aux_logits:
        return _GoogLeNetOutputs(x, aux2, aux1)
    return x
with
_GoogLeNetOutputs = namedtuple('GoogLeNetOutputs', ['logits', 'aux_logits2', 'aux_logits1'])
So should it be
output, aux2, aux1 = model(inputs)
rather than
aux1, aux2, output = model(inputs)
Is that right? Thanks.
It should be output, aux2, aux1.
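The ordering can be checked without running the network, because the return type is a plain `namedtuple`. The sketch below rebuilds a tuple with the field names quoted in the question (a local copy for illustration, not an import from torchvision) and shows that positional unpacking follows the field order, while attribute access does not depend on it.

```python
from collections import namedtuple

# Local copy of the definition quoted in the question, for illustration only.
GoogLeNetOutputs = namedtuple(
    "GoogLeNetOutputs", ["logits", "aux_logits2", "aux_logits1"]
)

# Strings stand in for the tensors the real model would return.
result = GoogLeNetOutputs("main", "aux2", "aux1")

# Positional unpacking follows the declared field order.
output, aux2, aux1 = result

# Attribute access yields the same values and cannot be mixed up by position.
same = (result.logits, result.aux_logits2, result.aux_logits1)
```

Accessing the fields by name is the more defensive choice if the return order might differ between library versions, as it apparently did over the course of this thread.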
Thanks for this thread, it really helped me, but now I'm getting this error when unpacking the model output:
output, aux1= model(data)
ValueError: too many values to unpack (expected 2)
and even when I added an extra output to unpack:
output, aux2, aux1 = model(data)
I still have the following error:
not enough values to unpack (expected 3, got 2)
I solved it by unpacking the outputs separately:
output = model(data).logits
aux1 = model(data).aux_logits
It seems that there are extra outputs, such as counts, that I don't believe we need for training.
@gamesMum I would advise not to do that, as you are essentially running your model twice.
Instead just use this once:
output = model(data)
and then access using:
output.logits
output.aux_logits
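The cost of the two-call version is easy to see with a stand-in model that counts its forward passes. `CountingModel` and the local `InceptionOutputs` tuple below are hypothetical illustrations; they reuse only the field names `logits` and `aux_logits` mentioned in this thread.

```python
from collections import namedtuple

# Hypothetical output type with the two field names used above.
InceptionOutputs = namedtuple("InceptionOutputs", ["logits", "aux_logits"])


class CountingModel:
    """Stand-in model that records how many forward passes were run."""

    def __init__(self):
        self.calls = 0

    def __call__(self, data):
        self.calls += 1
        return InceptionOutputs("logits", "aux")


# Two separate calls: the whole network runs twice per batch.
twice = CountingModel()
output = twice("batch").logits
aux = twice("batch").aux_logits

# One call, then read both fields from the same result.
once = CountingModel()
result = once("batch")
output, aux = result.logits, result.aux_logits
```

Here `twice.calls` ends up at 2 and `once.calls` at 1. With a real network in training mode, the second forward pass would also build a second autograd graph, so the single-call form saves both time and memory.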
@TheCodez oh dear, how did I miss that!
Thanks for pointing this out