Hello, I have trained a model. When I try to resume training from it on a bigger dataset, I hit this problem:
loading checkpoint ./trained_models/vgg16/pascal_voc/faster_rcnn_1_1_41.pth
loaded checkpoint ./trained_models/vgg16/pascal_voc/faster_rcnn_1_1_41.pth
/home/shin/faster-rcnn.pytorch/lib/model/rpn/rpn.py:68: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
rpn_cls_prob_reshape = F.softmax(rpn_cls_score_reshape)
/home/shin/faster-rcnn.pytorch/lib/model/faster_rcnn/faster_rcnn.py:98: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
cls_prob = F.softmax(cls_score)
Traceback (most recent call last):
File "trainval_net.py", line 335, in <module>
optimizer.step()
File "/usr/local/lib/python3.5/dist-packages/torch/optim/sgd.py", line 94, in step
buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: invalid argument 3: sizes do not match at /pytorch/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:271
The training parameters are the same. In fact, even if I train a model for 1 epoch and then resume from it, the same issue occurs.
I have made sure the version is valid, and I can use this model in demo.py.
It seems that the number of categories has changed in your bigger dataset. In that case, the sizes would not match. One simple solution is to partially load the pre-trained model, layer by layer.
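For reference, partial loading could look roughly like the sketch below. It keeps only the checkpoint entries whose name and shape match the current model and skips the rest. The helper names `select_matching` and `load_matching` are made up for illustration and are not part of the repository; the only PyTorch calls assumed are `model.state_dict()` and `model.load_state_dict(..., strict=False)`.

```python
def select_matching(model_state, checkpoint_state):
    """Split checkpoint entries into those that fit the model and those that do not.

    Both arguments are name -> tensor mappings, e.g. the result of state_dict().
    """
    matched, skipped = {}, []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        # Keep an entry only if the model has a parameter with that name
        # and the two shapes are identical.
        if target is not None and tuple(target.shape) == tuple(tensor.shape):
            matched[name] = tensor
        else:
            skipped.append(name)
    return matched, skipped


def load_matching(model, checkpoint_state):
    """Load only the compatible entries; returns the names that were skipped."""
    matched, skipped = select_matching(model.state_dict(), checkpoint_state)
    # strict=False lets load_state_dict accept a state dict that is missing keys.
    model.load_state_dict(matched, strict=False)
    return skipped
```

Usage would be something like `skipped = load_matching(fasterRCNN, checkpoint['model'])`, assuming the checkpoint stores the weights under a `'model'` key. Printing `skipped` shows which layers (typically the classification heads, if the category count differs) were left at their fresh initialization.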
I did not add or delete any categories in my bigger dataset. In fact, I found that if I comment out these two lines, everything works.
@shinshiner great!
@shinshiner
Hi,
Which two lines did you comment out?
The link above points to just one line.
Thank you!
@wjx2 Lines 286 and 287.
Does anyone know the reason? I also encountered this problem when resuming with batchsize=1 from a model trained with batchsize=64. If I keep batchsize=64, it works fine.
@jwyang Can you reopen the issue? Commenting out the two lines is not a proper fix, since the optimizer state can then no longer be resumed.
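Since the traceback ends in the momentum update inside sgd.py, one thing that could be tried is to keep the optimizer state but discard only the momentum buffers that no longer fit their parameter. This is a sketch, not a confirmed fix for this issue: `drop_mismatched_momentum` is a hypothetical helper, and it assumes a `torch.optim.SGD`-style optimizer that keeps a `'momentum_buffer'` entry per parameter in `optimizer.state`.

```python
def drop_mismatched_momentum(optimizer):
    """Remove momentum buffers whose shape differs from their parameter.

    Returns the number of buffers removed. SGD with momentum allocates a
    fresh buffer for any parameter that has none when step() runs.
    """
    removed = 0
    for group in optimizer.param_groups:
        for param in group["params"]:
            # .get avoids creating empty state entries as a side effect.
            state = optimizer.state.get(param)
            if not state:
                continue
            buf = state.get("momentum_buffer")
            if buf is not None and tuple(buf.shape) != tuple(param.shape):
                del state["momentum_buffer"]
                removed += 1
    return removed
```

It would be called right after the optimizer state is restored from the checkpoint and before the first `optimizer.step()`. A non-zero return value would also confirm that the restored buffers and the current parameters are out of sync, which helps narrow down where the size mismatch comes from.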
@Liu0329 @shinshiner Hi guys, did you fix this problem? I also encountered it when trying to use the pretrained model faster_rcnn_1_7_10021.pth on my own dataset. I tried commenting out these two lines:
# if args.mGPUs:
#     fasterRCNN = nn.DataParallel(fasterRCNN)
but it did not work. What should I do? Thank you!
Have you solved it?
I have run into this problem too.
And commenting out those two lines doesn't work.