Hello guys
I'm wondering how to fix the seed to get reproducible results in my experiments.
Right now I'm using this function before the start of training:
import os
import random
import numpy as np
import torch

def seed_everything(seed=42):
    # Seed Python, NumPy, and PyTorch (CPU and all GPUs)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Make cuDNN deterministic (may slow training down)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
But it doesn't work: runs with the same seed still produce different results.
I run training in DDP mode, in case that's relevant.
Thanks in advance!
I also have the same problem without DDP mode.
What's your environment?
Could you set num_workers to 0 to see if it is related to the data loading? I had this problem before with regular PyTorch, and I think I solved it by also setting the seed in the data loading, because each worker subprocess has its own seed.
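For reference, here is a minimal sketch of that idea using the standard PyTorch recipe: a worker_init_fn that derives each worker's seed from the main process, plus an explicit generator for the DataLoader. The dataset variable and the seed value are placeholders for your own setup.

import random
import numpy as np
import torch
from torch.utils.data import DataLoader

def seed_worker(worker_id):
    # Derive this worker's seed from the main process seed
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

g = torch.Generator()
g.manual_seed(42)

loader = DataLoader(
    dataset,                  # placeholder: your Dataset instance
    batch_size=32,
    num_workers=4,
    worker_init_fn=seed_worker,
    generator=g,
)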
@awaelchli tried and failed
Is there a chance you could share a Colab with a minimal example? If not, I will try to reproduce it with the pl_examples this weekend when I get to it.
In my case, it is caused by dropout.
Seeding everything again in the spawned process before training basically fixes the problem.
You can do this in the on_train_start hook.
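A minimal sketch of that, assuming the seed_everything function defined above (MyModel is a placeholder for your LightningModule):

import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def on_train_start(self):
        # Re-seed inside the DDP-spawned process so dropout etc. is deterministic
        seed_everything(42)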