When I interrupt training (e.g. at epoch 3) and add "--continue_train" to the command, it fails with "no such file named latest ...". So I just added "--epoch_count 3". It seems to work and starts from epoch 3, but the curve is disconnected from the previous curve shown for epochs 1 and 2. The continued run seems isolated, with no relation to the previous work.
You should set --continue_train --epoch 3 --epoch_count 4 (3 is the current final epoch).
The --epoch option specifies which saved weights are loaded to initialize the networks.
(The default value of this option is latest.)
The --epoch_count option specifies the number at which the epoch count starts.
In your case, when only --continue_train was set, the latest weights were supposed to be loaded because you didn't specify the --epoch option.
The "no such file" error then occurred because the latest weights had not been saved yet.
And when only --epoch_count 3 was set, the program started training from scratch with the epoch count starting at 3, because you didn't set the --continue_train option.
That's why your curve after epoch 3 had no relationship with epoch 1 and 2.
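To make the flag interaction described above easier to see, here is a minimal Python sketch. It is a simplified model of the behaviour explained in this reply, not the repository's actual code: the function name `plan_training` and the `saved` parameter (the set of checkpoint labels assumed to exist on disk) are hypothetical and used only for illustration.

```python
def plan_training(continue_train, epoch="latest", epoch_count=1, saved=()):
    """Return (weights_loaded, first_epoch_number) for a flag combination.

    `saved` lists the checkpoint labels assumed to exist on disk,
    e.g. {"3"} or {"latest", "5"}. Hypothetical helper for illustration.
    """
    if not continue_train:
        # Without --continue_train nothing is loaded: training starts from
        # scratch and --epoch_count only changes the epoch numbering.
        return None, epoch_count
    label = str(epoch)
    if label not in saved:
        # Models the "No such file" error when the requested weights are missing.
        raise FileNotFoundError(f"{label}_net_G_A.pth")
    # Weights for `label` are loaded and counting resumes at epoch_count.
    return label, epoch_count


# Only --epoch_count 3: no weights loaded, numbering starts at 3.
print(plan_training(False, epoch_count=3))
# --continue_train --epoch 3 --epoch_count 4 with epoch 3 saved.
print(plan_training(True, epoch=3, epoch_count=4, saved={"3"}))
```

Calling `plan_training(True, epoch_count=3, saved={"3"})` raises `FileNotFoundError` in this model, because `--epoch` falls back to `latest` and no `latest` checkpoint is listed.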
Thanks for the update.
I wish to express my appreciation for your help.
My last training run was in epoch 3 but had not finished. Then I continued with "python train.py --dataroot ./datasets/maps --name maps_cyclegan --model cycle_gan --gpu_ids -1 --epoch_count 3".
I also tried "python train.py --dataroot ./datasets/maps --name maps_cyclegan --model cycle_gan --gpu_ids -1 --continue_train --epoch_count 3", but it shows "FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/maps_cyclegan/latest_net_G_A.pth'".
Should I restart from epoch 1?
I interrupted training with Ctrl+C. Does this save the training state?
Thanks.
I think I forgot to add "--save_latest_freq" and "--save_epoch_freq", which I need in order to continue from the latest training state.
I am glad that you figured it out.
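Before resuming, it can help to check which checkpoints were actually written. The sketch below is a hypothetical helper (`suggest_resume_flags` is not part of the repository) that takes a list of file names, for example from `os.listdir("./checkpoints/maps_cyclegan")`, and suggests flags. It assumes checkpoint files follow the `<label>_net_G_A.pth` pattern seen in the error message above, where the label is either `latest` or an epoch number.

```python
import re


def suggest_resume_flags(filenames, net="G_A"):
    """Suggest train.py flags from checkpoint file names (illustrative only)."""
    # Assumed naming pattern: "<label>_net_<net>.pth", label = "latest" or a number.
    pattern = re.compile(r"^(latest|\d+)_net_" + re.escape(net) + r"\.pth$")
    labels = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            labels.add(match.group(1))

    numbered = sorted(int(label) for label in labels if label.isdigit())
    if numbered:
        # Prefer a numbered checkpoint, since the next epoch number is then known.
        last = numbered[-1]
        return ["--continue_train", "--epoch", str(last),
                "--epoch_count", str(last + 1)]
    if "latest" in labels:
        # Only "latest" exists; --epoch defaults to latest, but the right
        # --epoch_count cannot be inferred from the file name.
        return ["--continue_train"]
    # Nothing to resume from: training has to start from scratch.
    return []


print(suggest_resume_flags(["3_net_G_A.pth", "3_net_G_B.pth"]))
```

An empty result means no matching checkpoint was found, which is the situation that produces the "No such file or directory" error when `--continue_train` is passed.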