PyTorch: Feature Request: load_state_dict should take filenames

Created on 31 May 2017 · 3 Comments · Source: pytorch/pytorch

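In high memory pressure situations, the following sequence is a common occurrence:

  1. create the model
  2. read the state_dict from a checkpoint file (torch.load restores the tensors onto the GPU)
  3. model.load_state_dict(s)

After step 2 the GPU holds two copies of the parameters (the model's own and the freshly loaded state_dict), which can exhaust memory.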

Because of memory pressure, a common workaround is to first do:

s = torch.load('my_file.pt', map_location=lambda storage, loc: storage)

And then load s into model.
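For reference, here is a minimal, self-contained sketch of that workaround. The tiny nn.Linear model and the temporary checkpoint path are stand-ins chosen for illustration, not part of the original report; only the torch.load call with map_location comes from the issue.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for a real model; the layer sizes are arbitrary.
model = nn.Linear(4, 2)

# Write a checkpoint to a temporary file (placeholder for 'my_file.pt').
ckpt_path = os.path.join(tempfile.mkdtemp(), "my_file.pt")
torch.save(model.state_dict(), ckpt_path)

# Remap every storage to CPU while deserializing, so the checkpoint
# tensors do not take up GPU memory even if they were saved from a GPU.
s = torch.load(ckpt_path, map_location=lambda storage, loc: storage)

# Copy the CPU tensors into the parameters of a freshly built model.
restored = nn.Linear(4, 2)
restored.load_state_dict(s)
```

load_state_dict copies values into the existing parameters, so the model keeps whatever device placement it already had; only the temporary state_dict lives on the CPU.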

This scenario is common enough that users should not need a workaround for it, and the workaround itself may have pitfalls: what happens with part-GPU, part-CPU models, and what happens with multi-GPU models?

If load_state_dict took a filename directly, it could delete its existing parameter storages and set them to the new ones on the fly, thereby requiring no extra memory.
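To make the idea concrete, here is a rough sketch of what a filename-based loader could look like from user code. The function load_state_dict_from_file is hypothetical and is not a PyTorch API. It also does not swap storages as proposed above: it stages the checkpoint on the CPU and copies one tensor at a time into the existing parameters, so no second copy of the model is allocated on the model's device. torch.load still reads the whole file into host memory.

```python
import os
import tempfile

import torch
import torch.nn as nn


def load_state_dict_from_file(module, path, map_location="cpu"):
    """Hypothetical helper (not part of PyTorch): load a checkpoint file
    into `module` by copying tensors in place, one at a time."""
    state = torch.load(path, map_location=map_location)
    # state_dict() returns tensors that share storage with the module's
    # parameters and buffers, so copy_ below updates the module itself.
    own = module.state_dict()
    if set(own) != set(state):
        raise KeyError("checkpoint keys do not match the module's keys")
    with torch.no_grad():
        for name in list(state):
            # pop() drops the staged tensor as soon as it has been copied.
            own[name].copy_(state.pop(name))
    return module


# Small demonstration with stand-in models and a temporary file.
src = nn.Linear(3, 3)
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save(src.state_dict(), path)

dst = nn.Linear(3, 3)
load_state_dict_from_file(dst, path)
```

The map_location argument is simply forwarded to torch.load, so the same helper would cover loading a GPU-saved checkpoint on a CPU-only machine.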

Labels: feature, nn, triaged

All 3 comments

The same applies to optimizer state_dicts. For some optimizers, such as Adagrad, the checkpoints are large, and we can run into the same memory pressure. Optimizers don't even have a .cuda() method, so we first have to load the state_dict onto the CPU manually and then copy parts of it over to the GPU by hand.

I ran into this while helping @aszlam today.
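A small sketch of the optimizer case, using a stand-in model and a temporary file chosen for illustration. The checkpoint is staged on the CPU with map_location and then handed to the optimizer's load_state_dict. As far as I know, recent PyTorch versions move loaded state tensors to each parameter's device during load_state_dict, but that is worth verifying for your version, which is why the example records where the state ends up.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in model and optimizer; sizes and learning rate are arbitrary.
model = nn.Linear(4, 2)
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.1)

# Take one step so the optimizer holds non-trivial per-parameter state.
model(torch.randn(8, 4)).sum().backward()
optimizer.step()

path = os.path.join(tempfile.mkdtemp(), "optimizer.pt")
torch.save(optimizer.state_dict(), path)

# Stage the optimizer checkpoint on the CPU instead of the GPU.
cpu_state = torch.load(path, map_location="cpu")

# Restore into a freshly constructed optimizer.
new_model = nn.Linear(4, 2)
new_optimizer = torch.optim.Adagrad(new_model.parameters(), lr=0.1)
new_optimizer.load_state_dict(cpu_state)

# Collect the device types of all tensor-valued state entries so the
# placement can be inspected rather than assumed.
state_devices = {
    value.device.type
    for per_param in new_optimizer.state.values()
    for value in per_param.values()
    if torch.is_tensor(value)
}
```

On a CPU-only run, state_devices contains only "cpu"; with the model on a GPU, inspecting this set shows whether any manual copying is still needed.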

If load_state_dict takes a filename, it should also accept the map_location parameter. A common situation for me is to save a checkpoint on a cluster machine and then load it on my MacBook, so I need to load the parameters onto the CPU.

@szagoruyko and I are fans of the HDF5 format for serialized models; it would be nice if it could fit in with this proposal.
