Pytorch-cyclegan-and-pix2pix: About Strides of PatchGAN Discriminator

Created on 10 Dec 2017 · 2 Comments · Source: junyanz/pytorch-CycleGAN-and-pix2pix

Hi. I've been trying to reimplement the CycleGAN architecture without looking at any existing code (Torch, PyTorch, TensorFlow, etc.). My question: I couldn't find any reference for the stride of the discriminator's last layer.
In the file pix2pix/scripts/receptive_field_sizes.m, you have,

% fix the output size to 1 and derive the receptive field in the input
out = ...
f(f(f(f(f(1, 4, 1), ...   % conv4 -> conv5
             4, 1), ...   % conv3 -> conv4
             4, 2), ...   % conv2 -> conv3
             4, 2), ...   % conv1 -> conv2
             4, 2);       % input -> conv1

fprintf('n=3 discriminator receptive field size: %d\n', out);

It says the last TWO layers have stride = 1.
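To make the arithmetic in that script concrete, here is a minimal Python sketch of the same calculation. It assumes the helper `f` in the MATLAB script is the usual receptive-field recurrence, `(output_size - 1) * stride + ksize`; the names `receptive_field` and `PATCHGAN_N3` are hypothetical and used only for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a single output unit.

    `layers` is a list of (kernel_size, stride) pairs ordered from the
    input to the output. We fix the output size to 1 and walk backwards,
    applying the assumed recurrence (size - 1) * stride + kernel_size.
    """
    size = 1
    for ksize, stride in reversed(layers):
        size = (size - 1) * stride + ksize
    return size


# (kernel_size, stride) per layer, as listed in the script's comments:
# input->conv1, conv1->conv2, conv2->conv3, conv3->conv4, conv4->conv5
PATCHGAN_N3 = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

print(receptive_field(PATCHGAN_N3))  # 70

# Hypothetical variant for comparison: stride 2 for conv3->conv4 as well.
print(receptive_field([(4, 2), (4, 2), (4, 2), (4, 2), (4, 1)]))  # 94
```

With stride 1 in the last two layers, the result is 70, which matches the 70x70 PatchGAN described in the pix2pix paper. Note that the stride of the very last layer does not affect the receptive field of a single output unit; it is the stride of the conv3 -> conv4 layer that separates 70 from 94.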
Also in the file pytorch-CycleGAN-and-pix2pix/models/networks.py, starting from line 413, you have,

        nf_mult_prev = nf_mult
        nf_mult = min(2**n_layers, 8)
        sequence += [
            nn.Conv2d(ndf * nf_mult_prev, ndf * nf_mult,
                      kernel_size=kw, stride=1, padding=padw, bias=use_bias),
            norm_layer(ndf * nf_mult),
            nn.LeakyReLU(0.2, True)
        ]

        sequence += [nn.Conv2d(ndf * nf_mult, 1, kernel_size=kw, stride=1, padding=padw)]

Both the script and the code suggest that the discriminator's last feature convolution (i.e. 256 -> 512 channels) AND the layer after it, the "mapper" convolutional layer, have their strides set to 1.
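As a rough check of what these strides imply for the output shape, the sketch below applies the standard convolution output-size formula layer by layer. The values `ksize=4` and `pad=1` are assumptions standing in for `kw` and `padw`, which are not shown in the excerpt above, and the 256-pixel input is just an example; the function names are hypothetical.

```python
def conv_out(size, ksize, stride, pad):
    """Spatial output size of one convolution (dilation 1)."""
    return (size + 2 * pad - ksize) // stride + 1


def patchgan_output_size(size, strides=(2, 2, 2, 1, 1), ksize=4, pad=1):
    """Output size after the five discriminator convolutions.

    Assumed values: ksize=4 and pad=1 (stand-ins for kw and padw);
    strides follow the script: three stride-2 layers, then two stride-1.
    """
    for stride in strides:
        size = conv_out(size, ksize, stride, pad)
    return size


print(patchgan_output_size(256))  # 30
```

Under these assumptions, a 256x256 input yields a 30x30 map of patch predictions, each covering a 70x70 region of the input.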

I couldn't find any reference to this in either the ConditionalGAN or the CycleGAN paper, other than:

After the last layer, a convolution is applied to map to a 1 dimensional output, followed by a Sigmoid function.

which only tells me that after the 'c512' layer, I have to add a convolutional layer to map the features to a 1-dimensional output. Since I haven't looked at any code until now, it has been overwhelming to work out the architecture. Am I missing something in the paper?

Thank you for your time.

onursertkaya

Most helpful comment

Hi @onursertkaya,

Sorry it's been hard to follow. You are right that the last two layers both have stride 1. Reading over the appendix of the pix2pix paper, it looks like we indeed failed to mention this. I'll update it in the next arXiv draft.

As you continue working on your reimplementation, I would suggest looking at our code, rather than trying to reimplement directly from the papers. There are probably going to be more details that are in the code but not mentioned in the paper (although we tried to minimize this). My own perspective is that the "scientific publication" should not be thought of as just the paper, but the paper+code+data. For learning about the basic idea and math, the paper is the place to look. For reimplementing the exact method, I would say the code is the primary place to look.