Faster-rcnn.pytorch: RuntimeError when resuming a pretrained model

Created on 30 Jun 2018 · 14 Comments · Source: jwyang/faster-rcnn.pytorch

I want to fine-tune a model, but when I resume from a pretrained checkpoint, I get the error below:
Called with args:
Namespace(batch_size=1, checkepoch=20, checkpoint=3557, checkpoint_interval=10000, checksession=1, class_agnostic=False, cuda='--cuda', dataset='pascal_voc', disp_interval=100, large_scale=False, lr=0.0005, lr_decay_gamma=0.1, lr_decay_step=5, mGPUs=False, max_epochs=26, net='vgg16', num_workers=0, optimizer='sgd', resume=True, save_dir='/home/smartdsp/new_home/faster-rcnn.pytorch/models', session=1, start_epoch=1, use_tfboard=False)
Using config:
{'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'CROP_RESIZE_WITH_MAX_POOL': False,
'CUDA': False,
'DATA_DIR': '/home/smartdsp/new_home/faster-rcnn.pytorch/data',
'DEDUP_BOXES': 0.0625,
'EPS': 1e-14,
'EXP_DIR': 'vgg16',
'FEAT_STRIDE': [16],
'GPU_ID': 0,
'MATLAB': 'matlab',
'MAX_NUM_GT_BOXES': 20,
'MOBILENET': {'DEPTH_MULTIPLIER': 1.0,
'FIXED_LAYERS': 5,
'REGU_DEPTH': False,
'WEIGHT_DECAY': 4e-05},
'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
'POOLING_MODE': 'align',
'POOLING_SIZE': 7,
'RESNET': {'FIXED_BLOCKS': 1, 'MAX_POOL': False},
'RNG_SEED': 3,
'ROOT_DIR': '/home/smartdsp/new_home/faster-rcnn.pytorch',
'TEST': {'BBOX_REG': True,
'HAS_RPN': True,
'MAX_SIZE': 1000,
'MODE': 'nms',
'NMS': 0.3,
'PROPOSAL_METHOD': 'gt',
'RPN_MIN_SIZE': 16,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'RPN_TOP_N': 5000,
'SCALES': [600],
'SVM': False},
'TRAIN': {'ASPECT_GROUPING': False,
'BATCH_SIZE': 256,
'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_NORMALIZE_TARGETS': True,
'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
'BBOX_REG': True,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'BIAS_DECAY': False,
'BN_TRAIN': False,
'DISPLAY': 10,
'DOUBLE_BIAS': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'GAMMA': 0.1,
'HAS_RPN': True,
'IMS_PER_BATCH': 1,
'LEARNING_RATE': 0.01,
'MAX_SIZE': 1000,
'MOMENTUM': 0.9,
'PROPOSAL_METHOD': 'gt',
'RPN_BATCHSIZE': 256,
'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 8,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SNAPSHOT_ITERS': 5000,
'SNAPSHOT_KEPT': 3,
'SNAPSHOT_PREFIX': 'res101_faster_rcnn',
'STEPSIZE': [30000],
'SUMMARY_INTERVAL': 180,
'TRIM_HEIGHT': 600,
'TRIM_WIDTH': 600,
'TRUNCATED': False,
'USE_ALL_GT': True,
'USE_FLIPPED': True,
'USE_GT': False,
'WEIGHT_DECAY': 0.0005},
'USE_GPU_NMS': True}
Loaded dataset voc_2007_trainval for training
Set proposal method: gt
Appending horizontally-flipped training examples...
voc_2007_trainval gt roidb loaded from /home/smartdsp/new_home/faster-rcnn.pytorch/data/cache/voc_2007_trainval_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 2372 images...
after filtering, there are 2372 images...
2372 roidb entries
Loading pretrained weights from data/pretrained_model/vgg16_caffe.pth
loading checkpoint /home/smartdsp/new_home/faster-rcnn.pytorch/models/vgg16/pascal_voc/vgg16_baseline/faster_rcnn_1_20_3557.pth
loaded checkpoint /home/smartdsp/new_home/faster-rcnn.pytorch/models/vgg16/pascal_voc/vgg16_baseline/faster_rcnn_1_20_3557.pth
lib/model/rpn/rpn.py:68: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
rpn_cls_prob_reshape = F.softmax(rpn_cls_score_reshape)
lib/model/faster_rcnn/faster_rcnn.py:98: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
cls_prob = F.softmax(cls_score)
/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py:330: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
loss_temp += loss.data[0]
Traceback (most recent call last):

File "", line 1, in
runfile('/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py', wdir='/home/smartdsp/new_home/faster-rcnn.pytorch')

File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
builtins.execfile(filename, *where)

File "/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py", line 337, in
optimizer.step()

File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/torch/optim/sgd.py", line 101, in step
buf.mul_(momentum).add_(1 - dampening, d_p)

RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'

All 14 comments

@wjx2 See the error in the last line. It is caused by a mismatch between CPU data and GPU (CUDA) data: the optimizer expected a torch.FloatTensor but received a torch.cuda.FloatTensor. Use CUDA when you run the code.
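To check where such a mismatch comes from, it can help to list the devices of the model parameters and of the tensors held in the optimizer state. The following is a minimal sketch, not code from the repository: `device_report` is a hypothetical helper, and the small `nn.Linear` model is only a stand-in for the detector.

```python
import torch
from torch import nn, optim


def device_report(model, optimizer):
    """Return the sets of devices used by the parameters and by the optimizer state."""
    param_devices = {str(p.device) for p in model.parameters()}
    state_devices = {
        str(v.device)
        for state in optimizer.state.values()
        for v in state.values()
        if isinstance(v, torch.Tensor)
    }
    return param_devices, state_devices


# Tiny stand-in model (hypothetical; not the repository's Faster R-CNN).
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Take one step so that SGD creates its momentum buffers.
model(torch.randn(3, 4)).sum().backward()
optimizer.step()

param_devices, state_devices = device_report(model, optimizer)
print("parameters on:", param_devices)
print("optimizer state on:", state_devices)
```

If the two sets differ after resuming (for example parameters on a CUDA device and state on the CPU), optimizer.step() can be expected to fail with an error like the one in the traceback.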

@wjx2 Hi, I ran into the same problem. Have you solved it yet?

@babyjie57 Yes, I changed my torch version from 0.4.0 to 0.3.0, and the problem was solved.

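@wjx2 @babyjie57 This is due to the new PyTorch 0.4.

You can reload the weights and move the optimizer state to the GPU manually like this:

import torch
import torch.optim as optim

model.load_state_dict(checkpoint['model'])
model.cuda()
# lr is only a placeholder here: load_state_dict() below restores the saved hyperparameters
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0001)
optimizer.load_state_dict(checkpoint['optimizer'])
# move the optimizer state tensors (e.g. momentum buffers) to the GPU
for state in optimizer.state.values():
    for k, v in state.items():
        if isinstance(v, torch.Tensor):
            state[k] = v.cuda()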
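The same idea can be tried end to end on a toy model. This is a minimal sketch under stated assumptions: `optimizer_state_to` is a hypothetical helper name, the `nn.Linear` model stands in for the detector, the checkpoint is written to an in-memory buffer instead of a .pth file, and the code falls back to the CPU when no GPU is available.

```python
import io

import torch
from torch import nn, optim


def optimizer_state_to(optimizer, device):
    """Move every tensor held in the optimizer state to `device`."""
    for state in optimizer.state.values():
        for key, value in state.items():
            if isinstance(value, torch.Tensor):
                state[key] = value.to(device)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Train a stand-in model for one step so the optimizer has momentum buffers.
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
model(torch.randn(3, 4)).sum().backward()
optimizer.step()

# Save a checkpoint with the same two keys used in the thread.
buffer = io.BytesIO()
torch.save({"model": model.state_dict(), "optimizer": optimizer.state_dict()}, buffer)
buffer.seek(0)
checkpoint = torch.load(buffer, map_location="cpu")

# Resume: the optimizer state is loaded while the parameters are still on the CPU.
resumed = nn.Linear(4, 2)
resumed.load_state_dict(checkpoint["model"])
resumed_optimizer = optim.SGD(resumed.parameters(), lr=0.01, momentum=0.9)
resumed_optimizer.load_state_dict(checkpoint["optimizer"])

# Move the model, then move the optimizer state to the same device.
resumed.to(device)
optimizer_state_to(resumed_optimizer, device)

# One more training step now runs with everything on one device.
resumed_optimizer.zero_grad()
resumed(torch.randn(3, 4, device=device)).sum().backward()
resumed_optimizer.step()
```

Using `.to(device)` instead of `.cuda()` keeps the sketch runnable on machines without a GPU; on a CUDA machine it behaves like the snippet above.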

Additionally, for others who may encounter this problem with the Adam optimizer, use this:

optimizer.load_state_dict(checkpoint['optimizer'])

# read the saved hyperparameters back from the restored optimizer
lr = optimizer.param_groups[0]['lr']
weight_decay = optimizer.param_groups[0]['weight_decay']
double_bias = True
bias_decay = True

# rebuild the per-parameter groups (biases get a doubled learning rate)
params = []
for key, value in dict(fasterRCNN.named_parameters()).items():
    if value.requires_grad:
        if 'bias' in key:
            params += [{'params': [value], 'lr': lr * (double_bias + 1),
                        'weight_decay': weight_decay if bias_decay else 0}]
        else:
            params += [{'params': [value], 'lr': lr, 'weight_decay': weight_decay}]

# a fresh optimizer, so it starts without the saved per-parameter state
optimizer = torch.optim.Adam(params)

This way you load the same weight decay and learning rate from the saved model. It's kind of crude, but I'm sure you'll be able to fit it in nicely.

Insert it at these lines:

https://github.com/jwyang/faster-rcnn.pytorch/blob/28db6d0b313220d200b739f4e22410fbe35529f4/trainval_net.py#L286-L287
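The group-building part of the Adam workaround can be tried in isolation. This is a minimal sketch, assuming a hypothetical helper name `build_param_groups` and a toy `nn.Linear` model in place of fasterRCNN; the learning rate and weight decay values are example numbers, not values read from a real checkpoint.

```python
import torch
from torch import nn


def build_param_groups(model, lr, weight_decay, double_bias=True, bias_decay=True):
    """Build one optimizer group per trainable parameter, treating biases separately."""
    groups = []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if "bias" in name:
            groups.append({
                "params": [param],
                "lr": lr * (2 if double_bias else 1),
                "weight_decay": weight_decay if bias_decay else 0,
            })
        else:
            groups.append({"params": [param], "lr": lr, "weight_decay": weight_decay})
    return groups


# Toy stand-in model (hypothetical); its parameters are named "weight" and "bias".
model = nn.Linear(4, 2)
groups = build_param_groups(model, lr=0.001, weight_decay=0.0005)
optimizer = torch.optim.Adam(groups)

for group in optimizer.param_groups:
    print(group["lr"], group["weight_decay"])
```

Because the optimizer is constructed from scratch, it carries no saved state tensors, which is why the device mismatch does not occur; the cost is that Adam's running averages restart from zero.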

torch 0.4.0
I put these two lines before if args.resume:, and it works well.

What are the "two lines" you mentioned above?

Using PyTorch 0.3 solves this problem.

Hi, I tried the solution above (reloading the checkpoint and moving the optimizer state to CUDA), but it didn't work for me. What does 'model' refer to?

Hi, I also tried the optimizer-state solution above, but it didn't work for me.

  • Python 3
  • PyTorch 0.4.0

torch 1.0.1
cuda10.0
I just put one line: fasterRCNN.cuda() after fasterRCNN.load_state_dict(checkpoint['model']), and it works well.
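One plausible reading of why this ordering helps: Optimizer.load_state_dict casts the loaded state to the device of the parameters it is attached to, so if the model is already on the target device at that point, the momentum buffers land there too. The sketch below illustrates that ordering on a toy model; the `nn.Linear` stand-in, the in-memory checkpoint and the CPU fallback are assumptions for illustration, not the repository's code.

```python
import io

import torch
from torch import nn, optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a small checkpoint to resume from (stand-in for the saved .pth file).
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
model(torch.randn(3, 4)).sum().backward()
optimizer.step()
buffer = io.BytesIO()
torch.save({"model": model.state_dict(), "optimizer": optimizer.state_dict()}, buffer)
buffer.seek(0)
checkpoint = torch.load(buffer, map_location="cpu")

# Resume: load the weights, move the model, and only then load the optimizer state.
resumed = nn.Linear(4, 2)
resumed.load_state_dict(checkpoint["model"])
resumed.to(device)
resumed_optimizer = optim.SGD(resumed.parameters(), lr=0.01, momentum=0.9)
resumed_optimizer.load_state_dict(checkpoint["optimizer"])

state_devices = {
    v.device.type
    for state in resumed_optimizer.state.values()
    for v in state.values()
    if isinstance(v, torch.Tensor)
}
print("optimizer state on:", state_devices)
```

With this ordering no manual loop over optimizer.state is needed, which matches the one-line change described in the comment.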
