In PyTorch 1.7, Lib/site-packages/torchvision/utils.py line 74 (for t in tensor) iterates over the tensor via unbind, so each t becomes a view whose grad_fn is UnbindBackward; the in-place normalization that follows then raises: "RuntimeError: Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one."
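The underlying behavior can be seen in isolation (a minimal sketch, independent of the torchvision code): iterating over an autograd-tracked tensor goes through unbind(), so each element is the output of a multi-view function, and modifying it in place is rejected.

import torch

x = torch.randn(3, 4, requires_grad=True).clone()  # non-leaf tensor tracked by autograd
for t in x:            # iteration uses unbind(), so t.grad_fn is UnbindBackward
    t.clamp_(min=0.0)  # in-place op on a multi-view output -> RuntimeError on PyTorch 1.7

The full reproduction through torchvision is below.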
import torch.nn as nn
import torchvision.utils as vutils
import torchvision.models as models

alexnet = models.alexnet()
for sub_module in alexnet.modules():
    if isinstance(sub_module, nn.Conv2d):
        kernels = sub_module.weight
        c_out, c_int, k_w, k_h = tuple(kernels.shape)
        for o_idx in range(c_out):
            kernel_idx = kernels[o_idx, :, :, :].unsqueeze(1)  # kernel_idx.grad_fn == UnsqueezeBackward0
            img_grid = vutils.make_grid(kernel_idx, normalize=True, scale_each=True, nrow=3)
Running the code above produces the following traceback:
Traceback (most recent call last):
File "/path/to/fiel/*.py", line 20, in
img_grid = vutils.make_grid(kernel_idx, normalize=True, scale_each=True, nrow=3)
File "D:\Anaconda_data\envs\pytorch_1.7_cpu\lib\site-packages\torchvision\utils.py", line 77, in make_grid
norm_range(t, range)
File "D:\Anaconda_data\envs\pytorch_1.7_cpu\lib\site-packages\torchvision\utils.py", line 71, in norm_range
norm_ip(t, float(t.min()), float(t.max()))
File "D:\Anaconda_data\envs\pytorch_1.7_cpu\lib\site-packages\torchvision\utils.py", line 64, in norm_ip
img.clamp_(min=min, max=max)
RuntimeError: Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.
Process finished with exit code 1
Collecting environment information...
PyTorch version: 1.7.0+cpu
Is debug build: True
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 10 Home Chinese Edition
GCC version: (MinGW.org GCC-6.3.0-1) 6.3.0
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.6 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: 10.1.105
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\cudnn64_7.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip] numpy==1.18.1
[pip] numpydoc==0.9.2
[conda] blas 1.0 mkl defaults
[conda] mkl 2020.0 166 defaults
[conda] mkl-service 2.3.0 py37hb782905_0 defaults
[conda] mkl_fft 1.0.15 py37h14836fe_0 defaults
[conda] mkl_random 1.1.0 py37h675688f_0 defaults
[conda] numpy 1.18.1 py37h93ca92e_0 defaults
[conda] numpy-base 1.18.1 py37hc3f5095_1 defaults
[conda] numpydoc 0.9.2 py_0 defaults
This is linked to the change in the behavior of for el in tensor with respect to in-place ops.
In particular, this line https://github.com/pytorch/vision/blob/8c281757a0daf3a8e92cbb4bded0e5e6b389a375/torchvision/utils.py#L74 should be changed to:
for i in range(tensor.size(0)):  # loop over mini-batch dimension
    t = tensor[i]
We can fix this in torchvision following @albanD's comment, but note that you are backpropagating through visualization code, which is probably not what you want to do. You should instead wrap your visualization code in a with torch.no_grad() block; otherwise you could see unexpected behaviors.
We should probably add a @torch.no_grad() decorator to the visualization functions to avoid this happening in user code in the future.
Wrapping the whole function in torch.no_grad() will fix the issue as well.
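For the user-side workaround, this is what it could look like when applied to the reproduction snippet above (reusing its kernels / c_out variables and assuming import torch is added):

import torch

with torch.no_grad():  # disable autograd tracking inside the visualization code
    for o_idx in range(c_out):
        kernel_idx = kernels[o_idx, :, :, :].unsqueeze(1)  # no grad_fn is recorded under no_grad
        img_grid = vutils.make_grid(kernel_idx, normalize=True, scale_each=True, nrow=3)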
I assumed it was meant to be differentiated through, since I didn't see one there.
@albanD Thanks for your idea. It should solve this problem in principle.
for i in range(tensor.size(0)):  # loop over mini-batch dimension
    t = tensor[i]
In addition, make_grid takes an argument named "range", which turns "range" into a local variable inside the function, so the loop above does not work there as-is. It can be changed to:
for idx, _ in enumerate([jj for jj in tensor]):
    t = tensor[idx]
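To make the shadowing problem concrete (a minimal sketch, not the actual torchvision code): once a parameter is named range, the built-in is no longer reachable inside the function body.

import torch

def normalize_batch(tensor, range=None):  # hypothetical signature mirroring make_grid's range argument
    # the parameter shadows the built-in range, so the next line raises
    # TypeError: 'NoneType' object is not callable when range is None
    for i in range(tensor.size(0)):
        print(float(tensor[i].min()))

normalize_batch(torch.randn(2, 3))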
Moreover, it's better not to use "range" as a local variable. @fmassa
@TingsongYu I agree, we should not be using range as an argument here; it is a bad argument name for the function.
I would be happy to merge a PR that fixes both issues (with a proper deprecation warning that range is deprecated, handled via **kwargs unpacking).
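As a rough sketch of what that deprecation could look like (the value_range name and the exact signature here are assumptions, not the actual torchvision API):

import warnings

def make_grid(tensor, nrow=8, padding=2, normalize=False, value_range=None,
              scale_each=False, pad_value=0, **kwargs):
    # hypothetical sketch: keep accepting the old range kwarg, but warn that it is deprecated
    if "range" in kwargs:
        warnings.warn("The 'range' argument is deprecated; please use 'value_range' instead.",
                      DeprecationWarning)
        value_range = kwargs.pop("range")
    if kwargs:
        raise TypeError("make_grid() got unexpected keyword arguments: {}".format(sorted(kwargs)))
    # ... rest of make_grid unchanged, using value_range instead of range ...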
Feel free to have a go once again; it is fixed on master.