If the model returns multiple optimizers, a closure is created for each one of them. This results in training_step being called once per optimizer, which is wasteful in most cases.
In addition to this, training_step is called with an additional parameter (the optimizer id). This is not documented and results in an exception (a function taking 2 arguments is called with 3 parameters).
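For reference, a minimal sketch of the setup being described (module, layer sizes, and optimizer choices are placeholder assumptions, not taken from the original report): when configure_optimizers returns two optimizers, the Lightning versions current at the time of this issue call training_step once per optimizer per batch and pass the optimizer index as an extra argument.

import torch
import pytorch_lightning as pl

class TwoOptimizerModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Linear(16, 8)
        self.head = torch.nn.Linear(8, 2)

    def training_step(self, batch, batch_idx, optimizer_idx):
        # with two optimizers this runs twice per batch:
        # once with optimizer_idx=0 and once with optimizer_idx=1
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.head(self.encoder(x)), y)
        return loss

    def configure_optimizers(self):
        return [
            torch.optim.Adam(self.encoder.parameters(), lr=1e-3),
            torch.optim.Adam(self.head.parameters(), lr=1e-3),
        ]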
this is pretty well documented and is the expected behavior.
Ok, I hadn't noticed the documentation of the idx. But what about calling the training step multiple times? I can see how this could be useful for GANs, but in the case of having different optimizers for different parts of the model, this might not be desirable.
I also encountered this problem, where I only want to set different optimizers for different parts of my model, but they should all update the model in each iteration.
Can pytorch_lightning provide a way to achieve this? For example, in configure_optimizers we could mark multiple optimizers as a group in which they all update parameters (so they behave like a single optimizer does in the current pytorch_lightning implementation).
this is pretty well documented and is the expected behavior.
It is giving not found, can you please link the updated docs?
Hi @L1AN0 ,
To do that you can wrap everything in a single optimizer object. Two approaches:
1. Multiple parameter groups in a single optimizer of the same type:

from torch import optim

optimizer = optim.SGD([{'params': model.base.parameters()},
                       {'params': model.classifier.parameters(), 'lr': 1e-3}],
                      lr=1e-2, momentum=0.9)
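From Lightning's side, such a single optimizer with parameter groups can then be returned as usual. A sketch (assuming torch is imported and self.base / self.classifier are placeholder submodules of the LightningModule):

def configure_optimizers(self):
    # one optimizer, two parameter groups with different learning rates
    return torch.optim.SGD(
        [
            {'params': self.base.parameters()},
            {'params': self.classifier.parameters(), 'lr': 1e-3},
        ],
        lr=1e-2, momentum=0.9,
    )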
2. Multiple optimizers of different types, combined in a small wrapper:
class MultipleOptimizer(object):
    # wrapper that zeroes and steps several optimizers together
    def __init__(self, *op):
        self.optimizers = op

    def zero_grad(self):
        for op in self.optimizers:
            op.zero_grad()

    def step(self):
        for op in self.optimizers:
            op.step()

opt = MultipleOptimizer(optimizer1(params1, lr=lr1),
                        optimizer2(params2, lr=lr2))

opt.zero_grad()
loss.backward()
opt.step()
Thanks. But that won't be sufficient, because the MultipleOptimizer class is not an Optimizer. For example, when PyTorch Lightning creates a checkpoint, it will fail because MultipleOptimizer has no state_dict.
Sadly, making MultipleOptimizer an appropriate Optimizer is not trivial.
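One rough direction (a sketch only, assuming Lightning mainly needs the checkpoint hooks mentioned above; it is not a full torch.optim.Optimizer replacement) is to add delegating state_dict / load_state_dict methods to the wrapper:

class MultipleOptimizer:
    def __init__(self, *ops):
        self.optimizers = list(ops)

    def zero_grad(self):
        for op in self.optimizers:
            op.zero_grad()

    def step(self):
        for op in self.optimizers:
            op.step()

    def state_dict(self):
        # one state dict per wrapped optimizer
        return {'optimizers': [op.state_dict() for op in self.optimizers]}

    def load_state_dict(self, state_dict):
        for op, sd in zip(self.optimizers, state_dict['optimizers']):
            op.load_state_dict(sd)

Other things Lightning may touch (param_groups, learning-rate schedulers, isinstance checks) would still be missing, which is why this wrapper is not a complete substitute for a real Optimizer.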
@L1AN0 mind opening a new issue if needed?
I also need this feature, like @L1AN0. It is the same need as in https://github.com/PyTorchLightning/pytorch-lightning/issues/29#issuecomment-611006872.
Basically, I need to change the optimizer after 5 epochs. If I define two optimizers, training_step will be executed twice with two different optimizer_idx values, but that is not needed: I only need training_step to be called once per step, since the two stages do not overlap.
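A possible workaround sketch for the "switch after 5 epochs" case, assuming the training_step(batch, batch_idx, optimizer_idx) signature above and a module that defines forward; whether returning None actually skips the backward/step for that optimizer depends on the Lightning version:

def training_step(self, batch, batch_idx, optimizer_idx):
    # only one of the two optimizers is meant to be active in any given epoch
    use_second = self.current_epoch >= 5
    if (optimizer_idx == 1) != use_second:
        return None  # skip the inactive optimizer (version-dependent behavior)
    x, y = batch
    loss = torch.nn.functional.cross_entropy(self(x), y)
    return loss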