Pytorch-cyclegan-and-pix2pix: Error(s) in loading state_dict for ResnetGenerator

Created on 19 Jun 2018 · 20 Comments · Source: junyanz/pytorch-CycleGAN-and-pix2pix

Today I wanted to test my trained model, but I got errors that did not occur before.

Traceback (most recent call last):
  File "test.py", line 19, in <module>
    model.setup(opt)
  File "/home/t-fayan/vision/pytorch-CycleGAN-and-pix2pix/models/base_model.py", line 43, in setup
    self.load_networks(opt.which_epoch)
  File "/home/t-fayan/vision/pytorch-CycleGAN-and-pix2pix/models/base_model.py", line 130, in load_networks
    net.load_state_dict(state_dict)
  File "/home/t-fayan/anaconda2/envs/py27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResnetGenerator:
Missing key(s) in state_dict: "model.10.conv_block.6.bias", "model.10.conv_block.6.weight", "model.10.conv_block.7.running_var", "model.10.conv_block.7.running_mean", "model.11.conv_block.6.bias", "model.11.conv_block.6.weight", "model.11.conv_block.7.running_var", "model.11.conv_block.7.running_mean", "model.12.conv_block.6.bias", "model.12.conv_block.6.weight", "model.12.conv_block.7.running_var", "model.12.conv_block.7.running_mean", "model.13.conv_block.6.bias", "model.13.conv_block.6.weight", "model.13.conv_block.7.running_var", "model.13.conv_block.7.running_mean", "model.14.conv_block.6.bias", "model.14.conv_block.6.weight", "model.14.conv_block.7.running_var", "model.14.conv_block.7.running_mean", "model.15.conv_block.6.bias", "model.15.conv_block.6.weight", "model.15.conv_block.7.running_var", "model.15.conv_block.7.running_mean", "model.16.conv_block.6.bias", "model.16.conv_block.6.weight", "model.16.conv_block.7.running_var", "model.16.conv_block.7.running_mean", "model.17.conv_block.6.bias", "model.17.conv_block.6.weight", "model.17.conv_block.7.running_var", "model.17.conv_block.7.running_mean", "model.18.conv_block.6.bias", "model.18.conv_block.6.weight", "model.18.conv_block.7.running_var", "model.18.conv_block.7.running_mean".
Unexpected key(s) in state_dict: "model.10.conv_block.5.weight", "model.10.conv_block.5.bias", "model.10.conv_block.6.running_mean", "model.10.conv_block.6.running_var", "model.11.conv_block.5.weight", "model.11.conv_block.5.bias", "model.11.conv_block.6.running_mean", "model.11.conv_block.6.running_var", "model.12.conv_block.5.weight", "model.12.conv_block.5.bias", "model.12.conv_block.6.running_mean", "model.12.conv_block.6.running_var", "model.13.conv_block.5.weight", "model.13.conv_block.5.bias", "model.13.conv_block.6.running_mean", "model.13.conv_block.6.running_var", "model.14.conv_block.5.weight", "model.14.conv_block.5.bias", "model.14.conv_block.6.running_mean", "model.14.conv_block.6.running_var", "model.15.conv_block.5.weight", "model.15.conv_block.5.bias", "model.15.conv_block.6.running_mean", "model.15.conv_block.6.running_var", "model.16.conv_block.5.weight", "model.16.conv_block.5.bias", "model.16.conv_block.6.running_mean", "model.16.conv_block.6.running_var", "model.17.conv_block.5.weight", "model.17.conv_block.5.bias", "model.17.conv_block.6.running_mean", "model.17.conv_block.6.running_var", "model.18.conv_block.5.weight", "model.18.conv_block.5.bias", "model.18.conv_block.6.running_mean", "model.18.conv_block.6.running_var".

What does this mean? The same errors also occur when I test a pretrained model such as horse2zebra. I did not encounter them before.
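For readers with the same question: the two lists in the error are set differences between the parameter names the freshly built model expects and the names stored in the checkpoint file. Below is a minimal sketch in plain Python. The helper name diff_keys is hypothetical, and the sample keys are abbreviated from block 10 of the message above, plus one illustrative key that matches on both sides.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) the way the error message groups them."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model wants them, file lacks them
    unexpected = sorted(checkpoint_keys - model_keys)  # file has them, model does not
    return missing, unexpected


# Names the model built at test time expects (illustrative subset).
model_keys = [
    "model.10.conv_block.1.weight",
    "model.10.conv_block.6.weight",
    "model.10.conv_block.7.running_mean",
]
# Names stored in the checkpoint file (illustrative subset).
checkpoint_keys = [
    "model.10.conv_block.1.weight",
    "model.10.conv_block.5.weight",
    "model.10.conv_block.6.running_mean",
]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("Missing:", missing)
print("Unexpected:", unexpected)
```

Every mismatched name differs by exactly one index inside conv_block, which suggests that the checkpoint was saved from a network with one layer fewer per block than the network built at test time.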

yfnn

Most helpful comment

I had a similar issue while trying to test CycleGAN (--model test) on my own dataset.
The default --norm instance was used for both training and testing.
Deleting the root project directory and cloning this repository again did not help.

I noticed that the problem is mismatched keys in the state_dict dictionary.
For example, the model tries to load the missing key "model.10.conv_block.6.weight", while the checkpoint contains the unexpected key "model.10.conv_block.5.weight". So I decided to fix these mismatched keys.

Based on my error message, I created two lists (missing_list and expected_list) and then replaced each wrong key with the corresponding correct one.

Example of my snippet is here. I've inserted it after line 135 here.
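The linked snippet is not reproduced here. As a separate sketch of the same idea, the renaming can be done with a small index map instead of two hand-written lists. The names remap_keys, INDEX_SHIFT and PATTERN are hypothetical, and the 5 to 6 and 6 to 7 shifts are read off the error message in this thread, so they may differ for other checkpoints.

```python
import re
from collections import OrderedDict

# Assumption taken from the error message: the checkpoint stores conv_block.5 and
# conv_block.6, while the model expects conv_block.6 and conv_block.7.
INDEX_SHIFT = {"5": "6", "6": "7"}
PATTERN = re.compile(r"^(model\.\d+\.conv_block\.)(\d+)(\..+)$")


def remap_keys(state_dict, index_shift=INDEX_SHIFT):
    """Return a copy of state_dict with the shifted conv_block indices renamed."""
    remapped = OrderedDict()
    for key, value in state_dict.items():
        match = PATTERN.match(key)
        if match and match.group(2) in index_shift:
            prefix, index, suffix = match.groups()
            key = prefix + index_shift[index] + suffix
        remapped[key] = value  # keys that do not match are copied unchanged
    return remapped


# Stand-in values instead of tensors, to keep the sketch dependency-free.
checkpoint = OrderedDict([
    ("model.1.weight", "w0"),
    ("model.10.conv_block.1.weight", "w1"),
    ("model.10.conv_block.5.weight", "w2"),
    ("model.10.conv_block.6.running_mean", "m2"),
])
fixed = remap_keys(checkpoint)
print(list(fixed.keys()))
```

In the repository this would be applied to state_dict just before net.load_state_dict(state_dict) is called.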

dovletov on 29 Nov 2018

👍10 🎉2

All 20 comments

Facing the same issue.

rawalkhirodkar on 21 Jun 2018

I deleted the root project directory and cloned this repository again, and then it worked. I don't know the reason; this was just a quick way to solve the issue for me.

yfnn on 21 Jun 2018

Yes, please check out the latest commit.

junyanz on 3 Jul 2018

I face the same problem.
I'd like to run a CycleGAN pre-trained model.
However, "RuntimeError: Unexpected key(s) in state_dict" occurs. Please help me.

JungJungyeji on 11 Jul 2018

I face the same issue. I trained a pix2pix model with the previous version of the code and tried to test it using both the older and the latest commit. I got the same "missing keys in state_dict" error with both.

mailengm on 12 Jul 2018

Could you check if you have used the same normalization (batchnorm, instancenorm) during training and test?

junyanz on 13 Jul 2018

👍2

Faced the same error.

hao44le on 13 Jul 2018

I used the default normalization (instancenorm) during training and test.
I was able to solve the problem by downloading the newest version of the code and training the model again.

mailengm on 13 Jul 2018

I faced the same error when applying a pre-trained model (CycleGAN) with the newest version of the code.
Is normalization (instancenorm) also related to this case?

pencilrocketman on 13 Jul 2018

Sorry, I solved this problem by correcting my Docker setup.
When I used pytorch-nightly instead of pytorch, I got good results.
This is my Dockerfile (please forgive that it is messy).

FROM nvidia/cuda:9.0-cudnn7-devel

RUN apt-get update && apt-get install -y \
   build-essential curl wget git cmake vim pkg-config unzip libgtk2.0-dev python3 python3-pip \
   imagemagick graphviz > /dev/null

# Miniconda3
ENV PATH /opt/conda/bin:$PATH
ENV LD_LIBRARY_PATH /opt/conda/lib:$LD_LIBRARY_PATH
RUN curl -Ls https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -o /tmp/install-miniconda.sh && \
   /bin/bash /tmp/install-miniconda.sh -b -p /opt/conda && \
   conda update -n base conda -y && \
   conda update --all -y

# Basic dependencies
RUN conda config --add channels conda-forge
RUN conda install -y readline mkl openblas numpy scipy hdf5 \
   pillow matplotlib cython pandas gensim protobuf \
   lmdb leveldb boost jupyterlab
RUN pip install pydot_ng nnpack h5py scikit-learn scikit-image hyperdash backports.ssl_match_hostname

# OpenCV
RUN conda install opencv3 -c menpo -y
RUN conda install -y dominate bz2file visdom

# PyTorch
RUN conda install pytorch-nightly torchvision cuda90 -c pytorch -y

# For CycleGAN and pix2pix
ADD ./vision /vision
WORKDIR /vision
RUN python3 setup.py install
WORKDIR /

pencilrocketman on 13 Jul 2018

👍1

@taesung89

junyanz on 13 Jul 2018

The issue with unexpected key: num_batches_tracked should be fixed by the latest commit.
Regarding the first error on this thread, I believe it's because PyTorch's default setting has changed. Could you get the latest pytorch and try again?
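For context, the num_batches_tracked entries are buffers that BatchNorm layers started saving around PyTorch 0.4.1, so a checkpoint written under a newer PyTorch can carry keys that a model built under an older one does not expect. Here is a minimal sketch of one way to drop them before loading. The helper name is hypothetical, and this is not the repository's actual fix.

```python
def drop_num_batches_tracked(state_dict):
    """Return a copy of state_dict without the num_batches_tracked buffers."""
    return {
        key: value
        for key, value in state_dict.items()
        if not key.endswith("num_batches_tracked")
    }


# Stand-in values instead of tensors, to keep the sketch dependency-free.
checkpoint = {
    "model.1.weight": "w",
    "model.2.running_mean": "m",
    "model.2.num_batches_tracked": "n",
}
cleaned = drop_num_batches_tracked(checkpoint)
print(list(cleaned))
```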

taesungp on 14 Jul 2018

👍1

Thank you very much. I downloaded the new version of the code, and that fixed the problem.

JungJungyeji on 16 Jul 2018

Re: junyanz's question about using the same normalization (batchnorm, instancenorm) during training and test:

Thank you @junyanz. This resolved it for me.

kakumarabhishek on 13 Nov 2018

Another workaround is to change the call to net.load_state_dict(state_dict, strict=False), i.e. to add the strict=False option. With strict=False, PyTorch skips missing and unexpected keys instead of raising an error. It does not match parameters by size or position, so layers whose key names differ are not loaded and keep their initial values. Check the outputs after loading.
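To see what strict=False does when key indices are shifted, here is a small sketch with two toy networks (hypothetical stand-ins, not the ResnetGenerator from the repository). It assumes a reasonably recent PyTorch, where load_state_dict returns an object with missing_keys and unexpected_keys.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Source network: the conv sits at index 0.
source = nn.Sequential(nn.Conv2d(3, 3, 1))
# Target network: an extra layer shifts the conv to index 1.
target = nn.Sequential(nn.Dropout(0.5), nn.Conv2d(3, 3, 1))

before = target[1].weight.detach().clone()
result = target.load_state_dict(source.state_dict(), strict=False)

print(result.missing_keys)     # keys the target expected but did not receive
print(result.unexpected_keys)  # keys in the checkpoint that were ignored

# True here means the conv weights were NOT copied from the source.
unchanged = torch.equal(before, target[1].weight.detach())
print(unchanged)
```

No error is raised, but the shifted layer is not loaded. For a mismatch like the one in this thread, that would leave part of every affected block at its random initialization.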

vis-opt on 19 Sep 2019

👍4 ❤2

I faced the same issue, but it went away when I added "--no_dropout" at test time, as follows:
python test.py --no_dropout
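A plausible explanation for why this flag helps is sketched below. It assumes each ResNet block is a sequential container laid out as padding, convolution, normalization, ReLU, optional dropout, padding, convolution, normalization. That layout and the helper names are assumptions for illustration, not quoted from the repository. Since state_dict keys use the position of each layer in the container, an extra dropout layer shifts every later index by one.

```python
def resnet_block_layers(use_dropout):
    """List the layers of one block in order; the list position is the key index."""
    layers = ["ReflectionPad2d", "Conv2d", "Norm", "ReLU"]
    if use_dropout:
        layers.append("Dropout")
    layers += ["ReflectionPad2d", "Conv2d", "Norm"]
    return layers


def parameter_indices(layers):
    """Positions of the layers that may own entries in the state_dict."""
    return [i for i, name in enumerate(layers) if name in ("Conv2d", "Norm")]


with_dropout = parameter_indices(resnet_block_layers(use_dropout=True))
without_dropout = parameter_indices(resnet_block_layers(use_dropout=False))
print(with_dropout)     # [1, 2, 6, 7]
print(without_dropout)  # [1, 2, 5, 6]
```

Under that assumption, the error at the top of the thread (model expects conv_block.6 and conv_block.7, checkpoint has conv_block.5 and conv_block.6) would correspond to a checkpoint trained without dropout being loaded into a network built with dropout.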

SunLeL on 31 May 2020

👍4

Re: SunLeL's suggestion to run python test.py --no_dropout:

Thank you @SunLeL, it works for me!

Songtingt on 28 Aug 2020

Re: SunLeL's suggestion to run python test.py --no_dropout:

Amazing! I found that everything is fine when I use unet_256. The error happens when I use resnet_6blocks.
By the way, using --no_dropout works for me!

anxingle on 10 Dec 2020

Re: vis-opt's suggestion to pass strict=False to net.load_state_dict:

Yes, this method works. The change has to be made in the models/base_model.py file.
By the way, --no_dropout also works, but the results are different.