I have a question about the training and test data for CycleGAN.
Do you think the test data must belong to the same domain as the training data (for a synthetic dataset)? By domain, I mean the backgrounds the two sets share. When I give it a test image of the drone with a different background, it does not produce the same quality of results as it does for a test image of a drone with a background similar to those seen in the training dataset.
Is this result expected? Should we make sure that the backgrounds remain as similar to each other as possible?
Please let me know.
The result obtained when the background is the same:

The result obtained when the background is different:

Input synthetic image (trainA dataset):

You are correct. The test data should be similar to the training data. I recommend that you collect additional training data or apply additional data augmentation (e.g., different kinds of cropping/scaling).
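To make the cropping/scaling suggestion concrete, here is a minimal sketch of how a random resize and a random crop window could be sampled per image. The function name `sample_scale_and_crop`, the 256-pixel crop size, and the scale range are assumptions chosen for illustration; they are not taken from the CycleGAN code, and the returned numbers would still have to be applied with an image library of your choice.

```python
import random

def sample_scale_and_crop(width, height, crop_size=256, scale_range=(1.0, 1.5), rng=random):
    """Pick a random resize and a random crop window for one image.

    Returns (new_width, new_height, left, top). The crop window is
    crop_size x crop_size and always lies inside the resized image.
    All defaults are illustrative assumptions.
    """
    # Scale so the shorter side reaches crop_size, then apply a random extra factor.
    base = crop_size / min(width, height)
    factor = base * rng.uniform(*scale_range)
    new_w = max(crop_size, round(width * factor))
    new_h = max(crop_size, round(height * factor))
    # Choose the top-left corner of the crop uniformly at random.
    left = rng.randint(0, new_w - crop_size)
    top = rng.randint(0, new_h - crop_size)
    return new_w, new_h, left, top

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    for _ in range(3):
        print(sample_scale_and_crop(640, 480, rng=rng))
```

Because the scale and the crop position change every time an image is drawn, the drone appears at different sizes and against different parts of the background, which is one way to reduce the model's dependence on a particular background.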
Thank you so much for the prompt response, Professor. I have a follow-up question: when do you think I can stop the training process? Is human involvement/observation required, or are there other methods, such as tracking a particular loss function until it reaches its minimum, or using some other metric?
Please let me know. Thank you.
If you have a downstream task, you can evaluate your model's performance on that task. Otherwise, it requires either (1) manual inspection to choose the best model, or (2) standard GAN metrics (e.g., FID).
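As a rough illustration of option (2), the sketch below picks a checkpoint by a Fréchet-style distance. It uses the univariate special case of the FID formula, (mu_r - mu_f)^2 + (sigma_r - sigma_f)^2, on plain lists of numbers. Real FID fits multivariate Gaussians to Inception features of real and generated images, so this is a simplified stand-in, and the function names and toy values are assumptions for illustration only.

```python
import statistics

def frechet_distance_1d(real, fake):
    """Squared Fréchet distance between two 1-D Gaussians fitted to the samples.

    Univariate special case of the FID formula:
    (mu_r - mu_f)^2 + (sigma_r - sigma_f)^2.
    """
    mu_r, mu_f = statistics.fmean(real), statistics.fmean(fake)
    sd_r, sd_f = statistics.pstdev(real), statistics.pstdev(fake)
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2

def pick_best_checkpoint(real, fakes_by_epoch):
    """Return the epoch whose generated features are closest to the real ones."""
    return min(fakes_by_epoch,
               key=lambda epoch: frechet_distance_1d(real, fakes_by_epoch[epoch]))

if __name__ == "__main__":
    # Toy placeholder numbers, not measurements from any real model.
    real = [0.0, 1.0, 2.0, 3.0]
    fakes = {5: [4.0, 5.0, 6.0, 7.0], 10: [0.5, 1.5, 2.5, 3.5]}
    print(pick_best_checkpoint(real, fakes))
```

The same pattern applies with a proper FID implementation: compute the score for each saved checkpoint against a held-out set of real images and keep the checkpoint with the lowest score.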
Thanks a lot, Professor. This clarifies a lot of the questions I had.
Hi, I just started playing with the horse2zebra dataset. I am a little new to GANs. In usual ML models, I am used to training data having a one-to-one correspondence. That is, if horse pictures imageA001.jpg to imageA009.jpg are in folder trainA, then imageB001.jpg to imageB009.jpg should be the corresponding zebra images with the same background and so on, with just the horse body replaced by a matching zebra body. But I don't see this kind of one-to-one labeling in the training data folders. Is such one-to-one labeling unimportant for GANs?
It depends on the type of GAN model. CycleGAN does not need one-to-one correspondence; pix2pix does.
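The difference can be sketched as two ways of drawing a training example. This is a simplified illustration with hypothetical helper names and file names, not the actual data loaders: a paired loader needs image i in A to match image i in B, while an unpaired loader can draw any B image for a given A image.

```python
import random

def paired_sample(a_paths, b_paths, index):
    """pix2pix-style: item i of A goes with item i of B, so the lists must align."""
    if len(a_paths) != len(b_paths):
        raise ValueError("paired data needs the same number of A and B images")
    return a_paths[index], b_paths[index]

def unpaired_sample(a_paths, b_paths, index, rng=random):
    """CycleGAN-style: take A in order and draw B at random, with no matching."""
    a = a_paths[index % len(a_paths)]
    b = b_paths[rng.randrange(len(b_paths))]
    return a, b

if __name__ == "__main__":
    # Hypothetical file names, used only to show the two sampling schemes.
    horses = ["horse_01.jpg", "horse_02.jpg", "horse_03.jpg"]
    zebras = ["zebra_01.jpg", "zebra_02.jpg"]
    print(unpaired_sample(horses, zebras, 2, rng=random.Random(0)))
```

Note that the unpaired case works even when the two folders hold different numbers of images, which a paired setup cannot accommodate.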
Thanks!
In your tutorial ipynb, I ran the following command to train my model (I know n_epochs is too small, but this is more of a sanity check of whether things work, as each epoch takes 400 s on my single Google Colab GPU).
!python train.py --dataroot ./datasets/horse2zebra --name horse2zebra --model cycle_gan --gpu_ids 0 --n_epochs 1 --n_epochs_decay 1 --display_id 0
This is your horse2zebra dataset, nothing new from my side.
However, ./checkpoints/horse2zebra/latest_net_G_A.pth is not generated as expected. In fact, no .pth file is generated at all, whereas the pretrained folder ./checkpoints/horse2zebra_pretrained does contain this .pth file.
By default, your models are saved every --save_latest_freq iterations (default 5000) or every --save_epoch_freq epochs (default 5). In your case, since you trained your model for only 2 epochs, no model was saved.
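The saving rule described above can be sketched as a small simulation. This is a simplified model rather than the training script itself: it assumes one image per iteration and that a save fires whenever the running iteration count or the epoch number is a multiple of the corresponding frequency. The function name and the 1334 iterations per epoch are assumptions for illustration.

```python
def saved_checkpoints(n_epochs, n_epochs_decay, iters_per_epoch,
                      save_latest_freq=5000, save_epoch_freq=5):
    """List the (kind, epoch, total_iters) save events a run would trigger.

    Assumes one image per iteration and that saves fire when the running
    iteration count or the epoch number is a multiple of the frequency.
    """
    events = []
    total_iters = 0
    for epoch in range(1, n_epochs + n_epochs_decay + 1):
        for _ in range(iters_per_epoch):
            total_iters += 1
            if total_iters % save_latest_freq == 0:
                events.append(("latest", epoch, total_iters))
        if epoch % save_epoch_freq == 0:
            events.append(("epoch", epoch, total_iters))
    return events

if __name__ == "__main__":
    # 1334 iterations per epoch is an assumed dataset size for illustration.
    print(saved_checkpoints(1, 1, 1334))                     # no save events
    print(saved_checkpoints(1, 1, 1334, save_epoch_freq=1))  # one save per epoch
```

Under these assumptions, a two-epoch run never reaches 5000 iterations or epoch 5, so nothing is written. For a quick sanity check, lowering --save_epoch_freq to 1 (or lowering --save_latest_freq) should be enough to produce a .pth file.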