pytorch-CycleGAN-and-pix2pix: Any suggestions on how to get rid of the checkered effect?

Created on 18 Jan 2018 · 14 Comments · Source: junyanz/pytorch-CycleGAN-and-pix2pix

I'm training CycleGAN on my food photos and my sketches. The results are fascinating, especially the colors and shapes. If only I knew how to get rid of that checkered effect...

All 14 comments

Very cool! Care to share what the inputs look like too?

This Distill article talks about one of the causes of the checkerboard artifacts. You can fix that issue by switching from "deconvolution" to nearest-neighbor upsampling followed by a regular convolution. I think @SsnL may have implemented this at some point.
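To illustrate the uneven-overlap cause mentioned above, here is a minimal pure-Python sketch that counts how many kernel taps of a 1-D transposed convolution land on each output position. The function name `overlap_counts` and the sizes are illustrative assumptions, not code from the repository.

```python
def overlap_counts(n_inputs, kernel_size, stride):
    """Count kernel taps per output position of a 1-D transposed conv (no padding)."""
    out_len = (n_inputs - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(n_inputs):
        start = i * stride  # each input pixel paints a kernel-sized patch
        for k in range(kernel_size):
            counts[start + k] += 1
    return counts


# Kernel size 3 is not divisible by stride 2: interior coverage alternates.
uneven = overlap_counts(n_inputs=5, kernel_size=3, stride=2)
# Kernel size 4 is divisible by stride 2: interior coverage is uniform.
even = overlap_counts(n_inputs=5, kernel_size=4, stride=2)

print(uneven)  # [1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1]
print(even)    # [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1]
```

The alternating 1, 2, 1, 2 pattern is the 1-D analogue of the checkerboard. Upsampling first and then applying a stride-1 convolution avoids it because every output position receives the same number of taps.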

We've also noticed that sometimes the checkerboard artifacts go away if you simply train long enough. Maybe try training a bit longer.

Another cause of repetitive artifacts can be that the discriminator's receptive field is too small. For some discussion on this, please see Section 4.4 and Figure 6 of the pix2pix paper. The issue is that if the discriminator looks at too myopic a region, it won't notice that textures are repeating. I think this is probably not the case in your results, but it's something to keep in mind.
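As a rough aid for the receptive-field point, the sketch below computes how large an input region one discriminator output unit sees. The layer list is an assumption: it follows the commonly described 70x70 PatchGAN layout (4x4 convolutions, three with stride 2 followed by two with stride 1), so check it against the discriminator you actually train.

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of (kernel_size, stride) convs."""
    rf = 1
    # Walk from the output back to the input, growing the field at each layer.
    for kernel_size, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel_size
    return rf


# Assumed PatchGAN-style layout: (kernel_size, stride) per conv layer.
patchgan_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
shallow_layers = [(4, 2), (4, 1), (4, 1)]

print(receptive_field(patchgan_layers))  # 70
print(receptive_field(shallow_layers))   # 16
```

If repeating textures are larger than the computed field, the discriminator cannot see that they repeat, which is the situation described above.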

Thank you Phillip, great to have a path forward. Checking the links!
The original images in my training sets are pretty high-res, but I'm bounded by a GTX 1080. Both the A and B datasets are around 1k each.
I trained on loadSize=384, loadSize=1024 and fineSize=384, while testing at a higher resolution, and was not that happy with the results: the original image was too pronounced...
So that's where I am now, training with loadSize=768, fineSize=384.

@AllAwake Cool results!

Here is the implementation of resize-conv I used. It removes the checkerboard artifacts during early training. You may find it useful.

    nn.Upsample(scale_factor=2, mode='bilinear'),
    nn.ReflectionPad2d(1),
    nn.Conv2d(ngf * mult, int(ngf * mult / 2),
              kernel_size=3, stride=1, padding=0),

It should replace the ConvTranspose2d in ResnetGenerator.
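For anyone who wants to try the snippet in isolation, here is a self-contained sketch that wraps the same three layers and checks the output shape. The helper name `resize_conv_block` and the values `ngf = 64`, `mult = 4` are assumptions for illustration; `align_corners=False` is spelled out only to make the interpolation behavior explicit.

```python
import torch
import torch.nn as nn


def resize_conv_block(in_channels, out_channels):
    """Upsample by 2, then apply a stride-1 3x3 conv (resize-conv)."""
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
        nn.ReflectionPad2d(1),  # pad by 1 so the 3x3 conv keeps the spatial size
        nn.Conv2d(in_channels, out_channels,
                  kernel_size=3, stride=1, padding=0),
    )


ngf, mult = 64, 4  # assumed example values
block = resize_conv_block(ngf * mult, ngf * mult // 2)

x = torch.randn(1, ngf * mult, 16, 16)
y = block(x)
print(tuple(y.shape))  # (1, 128, 32, 32)
```

The block halves the channel count and doubles the spatial size, so it is shape-compatible with a stride-2 transposed convolution that does the same. Any normalization and activation layers that follow the ConvTranspose2d would stay in place.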

Hi @SsnL and @phillipi
I am currently researching low-to-high-resolution images. My generator and discriminator architectures are the same as in your CycleGAN. I followed your guide to replace the ConvTranspose2d, but checkerboard artifacts still appear in my results. The Distill article mentions resizing the image (using nearest-neighbor or bilinear interpolation) and also changing something in the discriminator. Could you please tell me how to implement that to remove the checkerboard artifacts? This is my result:
Input: [image: 116_real_a]
Result: [image: 116_fake_b2]

I haven't tried the Distill tricks myself. For discriminators, they mention that you can replace the stride-2 conv with a regular 3x3 conv.

Does a regular 3x3 conv mean that we just need to set stride=1 in this case?

Yeah, I guess you also need to add a downsample layer. You can look at Table 2 in the Progressive GANs paper.
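A minimal sketch of what that could look like in a discriminator: a stride-1 3x3 convolution followed by an explicit downsample layer. Using average pooling for the downsample, the LeakyReLU slope, and the helper name `downsample_block` are all assumptions here, not something confirmed in this thread.

```python
import torch
import torch.nn as nn


def downsample_block(in_channels, out_channels):
    """Stride-1 3x3 conv, then an explicit 2x downsample (assumed: average pooling)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels,
                  kernel_size=3, stride=1, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.AvgPool2d(kernel_size=2),  # halves height and width
    )


d_block = downsample_block(64, 128)
d_out = d_block(torch.randn(1, 64, 32, 32))
print(tuple(d_out.shape))  # (1, 128, 16, 16)
```

The output resolution matches what a stride-2 convolution would give, so the rest of the discriminator would not need to change.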

Hi @SsnL
May I ask why you chose bilinear upsampling instead of nearest-neighbor? The Distill article pointed out that nearest-neighbor interpolation should give better results.

@jacky841102 You could try nearest neighbor. I think that it didn't make much difference in the dataset I tried.

Also, according to http://warmspringwinds.github.io/tensorflow/tf-slim/2016/11/22/upsampling-and-image-segmentation-with-tensorflow-and-tf-slim/
a transposed convolution can be initialized with a bilinear filter.
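As a rough illustration of that idea, the sketch below builds bilinear interpolation weights in plain Python, using the formula commonly used for this kind of initialization. The function names are hypothetical, and copying the weights into an actual ConvTranspose2d is left out.

```python
def bilinear_kernel_1d(size):
    """1-D bilinear interpolation weights for a kernel of the given size."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]


def bilinear_kernel_2d(size):
    """2-D kernel as the outer product of the 1-D weights."""
    w = bilinear_kernel_1d(size)
    return [[a * b for b in w] for a in w]


print(bilinear_kernel_1d(4))  # [0.25, 0.75, 0.75, 0.25]
for row in bilinear_kernel_2d(4):
    print(row)
```

A 4x4 kernel like this would pair with a stride-2 transposed convolution. In such a setup the kernel would be written into the weight slices that connect matching input and output channels, with the remaining slices set to zero.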

The resize-conv method works pretty well, but it can also make training unstable, so I applied exponential learning rate decay with a small initial learning rate to keep it working well enough. I am currently running another training and will see what happens.
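A minimal sketch of an exponential learning rate schedule of the kind described here. The initial learning rate and decay factor are hypothetical placeholders, since the comment does not give the values that were actually used.

```python
def exponential_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` epochs of exponential decay."""
    return initial_lr * gamma ** epoch


initial_lr = 1e-4  # hypothetical small starting value
gamma = 0.95       # hypothetical per-epoch decay factor

schedule = [exponential_lr(initial_lr, gamma, e) for e in range(5)]
for epoch, lr in enumerate(schedule):
    print(epoch, lr)
```

In PyTorch the same schedule is available as `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=...)`, stepped once per epoch.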

@SsnL Thanks for the solution. Has anyone successfully implemented it in generators that monotonically upsample the input? In this example, if you add padding, the output will be exactly twice the size of the input:

    nn.Upsample(scale_factor=2, mode='bilinear'),
    nn.ReflectionPad2d(1),
    nn.Conv2d(ngf * mult, int(ngf * mult / 2),
              kernel_size=3, stride=1, padding=1),
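To sanity-check the output size of these layers without running a network, here is a small shape calculator based on the standard convolution size formula. The function name and defaults are assumptions for illustration. By this arithmetic, reflection padding of 1 with `padding=0` already gives exactly twice the input size, while using both reflection padding and `padding=1` gives two extra pixels per side length.

```python
def resize_conv_output_size(size, scale_factor=2, reflection_pad=1,
                            kernel_size=3, stride=1, conv_padding=0):
    """Spatial size after Upsample -> ReflectionPad2d -> Conv2d."""
    upsampled = size * scale_factor
    padded = upsampled + 2 * reflection_pad
    return (padded + 2 * conv_padding - kernel_size) // stride + 1


print(resize_conv_output_size(64))                    # 128
print(resize_conv_output_size(64, conv_padding=1))    # 130
print(resize_conv_output_size(64, reflection_pad=0,
                              conv_padding=1))        # 128
```

So for an exact 2x upsample, one padding source of width 1 is enough for a 3x3 kernel: either the reflection pad or the conv's own zero padding, but not both.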

I like how resize-conv totally gets rid of the checkerboard artefacts, but it comes at the cost of actually learning useful features. I'm trying to balance these effects.
