Hi, CycleGAN is often applied to unpaired data, so some performance metrics cannot be used.
For my dataset, I would like to use CycleGAN to map images from the winter season to the spring season, and there is no paired data for any image. Could you tell me how I can evaluate CycleGAN's performance (i.e., how to know whether the output is close to a realistic image)?
A few choices: (1) We often evaluate CycleGAN on paired datasets (e.g., Cityscapes, as used in the paper), while the model itself is trained without using the pairs. (2) Some folks have used standard GAN metrics such as FID. (3) As no metric is perfect, a user study might be helpful. You can check out the details of the user study in the CycleGAN paper.
Thanks for suggesting FID. Suppose the two datasets are unpaired: dataset 1 is synthetic spring images and dataset 2 is real spring images. Could I still use FID as a performance metric?
Yes. People just calculate FID between the generated results (e.g., fake spring) and the target real images (e.g., real spring), so no pairing is needed. FID is not perfect: it doesn't capture the conditioning (i.e., the alignment between output and input), but it does capture the marginal distribution.
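For reference, here is a minimal sketch of the distance that FID is based on, namely the Fréchet distance between two Gaussians fitted to feature vectors: ||mu_r - mu_f||^2 + Tr(Sigma_r + Sigma_f - 2 (Sigma_r Sigma_f)^(1/2)). It assumes you have already extracted features for both image sets (standard FID uses Inception-v3 pool features, which this snippet does not compute). The function name `frechet_distance` and the array names are hypothetical, and this is not the code of any particular FID package, so its numbers will not match published FID values exactly.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are arrays of shape (num_images, feature_dim) with
    feature_dim >= 2. The two sets do not need to be paired or to
    contain the same number of images.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_f = np.cov(feats_fake, rowvar=False)

    diff = mu_r - mu_f

    # Tr((Sigma_r Sigma_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of Sigma_r @ Sigma_f. Those eigenvalues are real and
    # non-negative in exact arithmetic; clip small numerical noise.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_f) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 8))            # stand-in for real-spring features
    fake = rng.normal(loc=0.5, size=(400, 8))   # stand-in for fake-spring features
    print(frechet_distance(real, fake))
```

Note that only the statistics of each set are compared, which is why the score says nothing about whether each output matches its own input.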
May we know the FID on those datasets for pix2pix?
We don't have FID numbers for pix2pix. FID was published after pix2pix and became popular much later. If you want to compare to pix2pix in terms of FID, I recommend that you download our pre-trained models and run your own FID evaluation code on their outputs. There are multiple versions of FID code, which produce slightly different results. Just make sure that you use the same version for both your method and pix2pix.
Thanks for your quick response! I got it.