Pytorch-lightning: training_step is called once for each optimizer

Created on 16 Oct 2019 · 8 comments · Source: PyTorchLightning/pytorch-lightning

If a model returns multiple optimizers, a closure is created for each of them. This results in training_step being called once per optimizer, which is wasteful in most cases.
In addition, training_step is called with an extra parameter (the optimizer index). This is not documented and results in an exception (a 2-argument function being called with 3 parameters).

To Reproduce
Steps to reproduce the behavior:

  1. Create a model with two or more optimizers (see the sketch after this list).
  2. Define training_step to take (batch, batch_nb), as per the documentation, and add a print statement printing batch_nb.
  3. Run trainer.fit.
  4. An exception is thrown because training_step is called with 3 parameters.
  5. Change training_step to take (batch, batch_nb, optimizer_idx).
  6. Run trainer.fit again.
  7. Observe that batch_nb is printed as many times as there are optimizers.
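
As a reference for the steps above, here is a minimal sketch of such a module. The class name, layers, and loss are illustrative (not from the original report), and the dict return and the tuple of optimizers follow the Lightning API of that era:

import torch
import pytorch_lightning as pl


class TwoOptimizerModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.base = torch.nn.Linear(8, 8)
        self.head = torch.nn.Linear(8, 1)

    def forward(self, x):
        return self.head(self.base(x))

    # Step 5: the third argument becomes required once two optimizers are returned.
    def training_step(self, batch, batch_nb, optimizer_idx):
        print(batch_nb, optimizer_idx)  # batch_nb repeats once per optimizer (step 7)
        x, y = batch
        loss = torch.nn.functional.mse_loss(self(x), y)
        return {'loss': loss}

    def configure_optimizers(self):
        # Step 1: two optimizers for two parts of the model.
        opt_base = torch.optim.SGD(self.base.parameters(), lr=1e-2)
        opt_head = torch.optim.Adam(self.head.parameters(), lr=1e-3)
        return opt_base, opt_head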

Expected behavior

  • In most cases training_step needs to be called only once per batch; that would be an efficiency improvement.
  • The documentation should specify the extra parameter passed when multiple optimizers are used, to prevent the exception.
bug / fix

Most helpful comment

I also need this feature, like @L1AN0. Same need as this one (https://github.com/PyTorchLightning/pytorch-lightning/issues/29#issuecomment-611006872).

Basically, I need to change the optimizer after 5 epochs. If I define two optimizers, training_step is executed twice with two different optimizer_idx values, but that is not needed: I only need training_step to be called once per step, since the two stages do not overlap.

All 8 comments

Ok, I had not noticed the documentation of the idx. But what about calling the training step multiple times? I can see how this could be useful for GANs, but in the case of having different optimizers for different parts of the model, it might not be desirable.

I also encountered this problem: I only want to set different optimizers for different parts of my model, but they should all update the model in each iteration.

Can PyTorch Lightning provide a way to achieve this? For example, in configure_optimizers we could mark multiple optimizers as a group that should all update parameters (so they behave like a single optimizer in the current PyTorch Lightning implementation).

this is pretty well documented and is the expected behavior.

https://williamfalcon.github.io/pytorch-lightning/LightningModule/RequiredTrainerInterface/#training_step

That link returns "not found"; can you please link the updated docs?

Hi @L1AN0,
To do that you can make a single optimizer object that drives several parameter sets. Two options:

1. Multiple optimizers of the same type: use a single optimizer with parameter groups.

from torch import optim

# One optimizer, two parameter groups with different learning rates.
opt = optim.SGD([{'params': model.base.parameters()},
                 {'params': model.classifier.parameters(), 'lr': 1e-3}],
                lr=1e-2, momentum=0.9)

2. Multiple optimizers of different types: wrap them in a single object.

class MultipleOptimizer(object):
    """Wraps several optimizers so they can be driven as one."""

    def __init__(self, *op):
        self.optimizers = op

    def zero_grad(self):
        for op in self.optimizers:
            op.zero_grad()

    def step(self):
        for op in self.optimizers:
            op.step()


opt = MultipleOptimizer(optimizer1(params1, lr=lr1),
                        optimizer2(params2, lr=lr2))

# In the training loop: clear gradients, backprop, then step all optimizers.
opt.zero_grad()
loss.backward()
opt.step()
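
Presumably the wrapper would then be returned as the single optimizer from configure_optimizers, along these lines (a sketch only; self.base and self.classifier are placeholder submodules):

import torch

# Inside the LightningModule: one wrapped optimizer, so training_step
# runs once per batch instead of once per optimizer.
def configure_optimizers(self):
    return MultipleOptimizer(
        torch.optim.SGD(self.base.parameters(), lr=1e-2),
        torch.optim.Adam(self.classifier.parameters(), lr=1e-3),
    )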


Thanks. But that won't be sufficient, because the MultipleOptimizer class is not an Optimizer. For example, when PyTorch Lightning creates a checkpoint, it will fail because MultipleOptimizer has no state_dict.

Sadly, making MultipleOptimizer a proper Optimizer is not trivial.
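
For reference, a minimal sketch of the forwarding that checkpointing would need, written as a hypothetical subclass of the MultipleOptimizer above (even with this, it is still not a real torch.optim.Optimizer, so other hooks may break):

class CheckpointableMultipleOptimizer(MultipleOptimizer):
    # Forward (de)serialization to the wrapped optimizers so checkpoint
    # saving and loading have something to call.
    def state_dict(self):
        return [op.state_dict() for op in self.optimizers]

    def load_state_dict(self, state_dicts):
        for op, sd in zip(self.optimizers, state_dicts):
            op.load_state_dict(sd)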

@L1AN0 mind opening a new issue if needed?

I also need this feature, like @L1AN0. Same need as this one (https://github.com/PyTorchLightning/pytorch-lightning/issues/29#issuecomment-611006872).

Basically, I need to change the optimizer after 5 epochs. If I define two optimizers, training_step is executed twice with two different optimizer_idx values, but that is not needed: I only need training_step to be called once per step, since the two stages do not overlap.
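
One possible sketch of that epoch-based switch, assuming the per-optimizer training_step behaviour discussed in this issue. Whether returning None here really skips the inactive optimizer's step depends on the Lightning version, so treat this as a sketch of the intent rather than a verified recipe:

# Inside the LightningModule; self.current_epoch is a LightningModule attribute.
def training_step(self, batch, batch_nb, optimizer_idx):
    # First 5 epochs use optimizer 0, afterwards optimizer 1.
    active_idx = 0 if self.current_epoch < 5 else 1
    if optimizer_idx != active_idx:
        return None  # assumption: a None return skips this optimizer's step
    x, y = batch
    loss = torch.nn.functional.mse_loss(self(x), y)
    return {'loss': loss}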

