Pytorch-cyclegan-and-pix2pix: Question: batch size

Created on 15 May 2017 · 3 Comments · Source: junyanz/pytorch-CycleGAN-and-pix2pix

The paper indicates that training was done with batch size = 1.

Is there a reason not to use a slightly larger batch size to more fully occupy the GPUs? For example, are the results better with batch size = 1 than with batch sizes larger than 1?

All 3 comments

We haven't compared the quality of results with different batch sizes. It would be great if someone could look into it. We use batchSize=1 mainly because we would like to train a model on images with higher resolution.

Pix2pix training convergence curves on the Facades dataset using batch sizes of 1, 16, and 32, respectively.

pix2pix_facades_batch_size_1
pix2pix_facades_batch_size_16
pix2pix_facades_batch_size_32

Thanks! Also note that with batchsize=1, batchnorm is equivalent to instance norm (aka contrast normalization), which has qualitatively different properties from batchnorm over larger batches. Batchnorm achieves invariance to the mean and variance of features across a batch of images. Instance norm achieves invariance to the mean and variance of features in a single image. As a result, instance norm will be (nearly) invariant to image-level operations like changing the exposure or contrast of a photo, whereas batchnorm will not. Batchnorm is only invariant to batch-level operations.

*Caveat: these statements are only strictly true if the momentum parameter is set to zero, which we don't do in practice.
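
For anyone who wants to check the invariance claim, here is a minimal sketch in plain Python. It is a toy, not the repository's code: each "image" is a flat list of feature values for one channel, the helper names (`normalize`, `instance_norm`, `batch_norm`, `change_exposure`) are made up for illustration, and only current-batch statistics are used (no running averages, no learnable scale or shift). An exposure/contrast edit is modeled as `gain * v + bias` applied to one image.

```python
import math

def normalize(values, eps=1e-5):
    """Subtract the mean and divide by the standard deviation of `values`."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def instance_norm(batch):
    """Normalize each image using only its own statistics."""
    return [normalize(img) for img in batch]

def batch_norm(batch):
    """Normalize every image using statistics pooled over the whole batch."""
    flat = [v for img in batch for v in img]
    normed = normalize(flat)
    n = len(batch[0])
    return [normed[i * n:(i + 1) * n] for i in range(len(batch))]

def change_exposure(img, gain, bias):
    """Hypothetical image-level edit: scale contrast by `gain`, shift by `bias`."""
    return [gain * v + bias for v in img]

def max_abs_diff(a, b):
    """Largest elementwise difference between two batches of the same shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

img_a = [0.1, 0.4, 0.5, 0.9]
img_b = [0.2, 0.3, 0.7, 0.8]
batch = [img_a, img_b]
# Same batch, but the first image has had its exposure and contrast changed.
edited = [change_exposure(img_a, gain=2.0, bias=0.3), img_b]

# How much do the normalized outputs move when one image is edited?
in_shift = max_abs_diff(instance_norm(batch), instance_norm(edited))
bn_shift = max_abs_diff(batch_norm(batch), batch_norm(edited))

# With a single image in the batch, the two normalizations coincide.
single_diff = max_abs_diff(instance_norm([img_a]), batch_norm([img_a]))

print(f"instance norm shift: {in_shift:.6f}")
print(f"batch norm shift:    {bn_shift:.6f}")
print(f"batch size 1 diff:   {single_diff:.6f}")
```

The instance norm shift is tiny but not exactly zero, and that residue comes from `eps`, which is why "(nearly)" is the right word. The batchnorm shift is large because the edited image changes the pooled statistics, which also moves the output for the untouched image.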
