Vision: Unable to load VGG model's state dict on GPU

Created on 15 Jul 2020 · 2 comments · Source: pytorch/vision

I am trying generic code to load the state dict of a model on CPU/GPU. It works fine for other torchvision models but fails for VGG models on GPU machines.

Here is a sample of the code I am trying:

import torch
map_location = 'cuda' if torch.cuda.is_available() else 'cpu'
model_pt_path = 'vgg13-c768596a.pth'
state_dict = torch.load(model_pt_path, map_location=map_location)

The above code fails with the following error:

Traceback (most recent call last):
  File "test_vgg.py", line 4, in <module>
    state_dict = torch.load(model_pt_path, map_location=map_location)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 773, in _legacy_load
    result = unpickler.load()
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 147, in __setstate__
    self.set_(*state)
RuntimeError: Expected object of device type cpu but got device type cuda for argument #2 'source'

I tried the same code with DenseNet/AlexNet/SqueezeNet models, and it works fine on both GPU and CPU machines.

Environment information:

OS: Ubuntu 18.04
PyTorch version: 1.5.1
TorchVision version: 0.6.1

Labels: bug, models


All 2 comments

To narrow this down a little: the error only happens for the vgg11 and vgg13 weights.

This relates to https://github.com/pytorch/vision/issues/2068; the error message is different, but the root issue is the same (a very old serialization format).

The workaround on your side for now could be to avoid using map_location for CUDA and to convert the tensors to GPU manually if needed (or perhaps even just pass the CPU weights to load_state_dict while the model is on CUDA; I think that will also work).
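A minimal sketch of that workaround (using a tiny stand-in module rather than the actual VGG checkpoint, so it runs anywhere; the same pattern applies to torchvision.models.vgg13() with the vgg13-c768596a.pth file):

```python
import torch
import torch.nn as nn

# Tiny stand-in model for illustration; substitute
# torchvision.models.vgg13() and the real checkpoint in practice.
model = nn.Linear(4, 2)
torch.save(model.state_dict(), 'demo.pth')

# Workaround: always deserialize onto CPU (no CUDA map_location),
# which sidesteps the legacy-serialization error.
state_dict = torch.load('demo.pth', map_location='cpu')
model.load_state_dict(state_dict)

# Then move the populated model to GPU if one is available.
if torch.cuda.is_available():
    model = model.cuda()
```

load_state_dict copies the CPU tensors into the model's existing parameters, which is why passing CPU weights to a model already on CUDA also works.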

