Pytorch-lightning: training_step is called once for each optimizer

Created on 16 Oct 2019 · 8 comments · Source: PyTorchLightning/pytorch-lightning

If a model returns multiple optimizers, a closure is created for each of them. This results in training_step being called once per optimizer, which is wasteful in most cases.
In addition, training_step is called with an extra parameter (the optimizer index). This is not documented and results in an exception (a 2-argument function being called with 3 parameters).

To Reproduce
Steps to reproduce the behavior:

  1. Create a model with two or more optimizers (see the sketch after this list).
  2. Define training_step to take (batch, batch_nb), as per the documentation, and add a print statement printing batch_nb.
  3. Run trainer.fit.
  4. An exception is thrown because training_step is called with 3 parameters.
  5. Change training_step to take (batch, batch_nb, optimizer_idx).
  6. Run trainer.fit again.
  7. Observe that batch_nb is printed as many times as there are optimizers.
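
As a reference for the steps above, here is a minimal sketch of such a module. The class name, layers, and loss are illustrative (not from the original report), and the dict return and the tuple of optimizers follow the Lightning API of that era:

import torch
import pytorch_lightning as pl


class TwoOptimizerModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.base = torch.nn.Linear(8, 8)
        self.head = torch.nn.Linear(8, 1)

    def forward(self, x):
        return self.head(self.base(x))

    # Step 5: the third argument becomes required once two optimizers are returned.
    def training_step(self, batch, batch_nb, optimizer_idx):
        print(batch_nb, optimizer_idx)  # batch_nb repeats once per optimizer (step 7)
        x, y = batch
        loss = torch.nn.functional.mse_loss(self(x), y)
        return {'loss': loss}

    def configure_optimizers(self):
        # Step 1: two optimizers for two parts of the model.
        opt_base = torch.optim.SGD(self.base.parameters(), lr=1e-2)
        opt_head = torch.optim.Adam(self.head.parameters(), lr=1e-3)
        return opt_base, opt_head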

Expected behavior

  • In most cases training_step needs to be called only once per batch; that would be an efficiency improvement.
  • The documentation should specify the extra parameter passed when multiple optimizers are used, to prevent the exception.
bug / fix

Most helpful comment

I also need this feature, like @L1AN0. Same need as this one (https://github.com/PyTorchLightning/pytorch-lightning/issues/29#issuecomment-611006872).

Basically, I need to change the optimizer after 5 epochs. If I define two optimizers, training_step is executed twice with two different optimizer_idx values, but that is not needed: I only need training_step to be called once per step, since the two stages do not overlap.

All 8 comments

Ok, I had not noticed the documentation of the idx. But what about calling the training step multiple times? I can see how this could be useful for GANs, but in the case of having different optimizers for different parts of the model, it might not be desirable.

I also encountered this problem: I only want to set different optimizers for different parts of my model, but they should all update the model in each iteration.

Can PyTorch Lightning provide a way to achieve this? For example, in configure_optimizers we could mark multiple optimizers as a group that should all update parameters (so they behave like a single optimizer in the current PyTorch Lightning implementation).

this is pretty well documented and is the expected behavior.

https://williamfalcon.github.io/pytorch-lightning/LightningModule/RequiredTrainerInterface/#training_step

That link returns "not found"; can you please link the updated docs?

Hi @L1AN0,
To do that you can make a single optimizer object that drives several parameter sets. Two options:

1. Multiple optimizers of the same type: use a single optimizer with parameter groups.

from torch import optim

# One optimizer, two parameter groups with different learning rates.
opt = optim.SGD([{'params': model.base.parameters()},
                 {'params': model.classifier.parameters(), 'lr': 1e-3}],
                lr=1e-2, momentum=0.9)

2. Multiple optimizers of different types: wrap them in a single object.

class MultipleOptimizer(object):
    """Wraps several optimizers so they can be driven as one."""

    def __init__(self, *op):
        self.optimizers = op

    def zero_grad(self):
        for op in self.optimizers:
            op.zero_grad()

    def step(self):
        for op in self.optimizers:
            op.step()


opt = MultipleOptimizer(optimizer1(params1, lr=lr1),
                        optimizer2(params2, lr=lr2))

# In the training loop: clear gradients, backprop, then step all optimizers.
opt.zero_grad()
loss.backward()
opt.step()
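
Presumably the wrapper would then be returned as the single optimizer from configure_optimizers, along these lines (a sketch only; self.base and self.classifier are placeholder submodules):

import torch

# Inside the LightningModule: one wrapped optimizer, so training_step
# runs once per batch instead of once per optimizer.
def configure_optimizers(self):
    return MultipleOptimizer(
        torch.optim.SGD(self.base.parameters(), lr=1e-2),
        torch.optim.Adam(self.classifier.parameters(), lr=1e-3),
    )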


Thanks. But that won't be sufficient, because the MultipleOptimizer class is not an Optimizer. For example, when PyTorch Lightning creates a checkpoint, it will fail because MultipleOptimizer has no state_dict.

Sadly, making MultipleOptimizer a proper Optimizer is not trivial.
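
For reference, a minimal sketch of the forwarding that checkpointing would need, written as a hypothetical subclass of the MultipleOptimizer above (even with this, it is still not a real torch.optim.Optimizer, so other hooks may break):

class CheckpointableMultipleOptimizer(MultipleOptimizer):
    # Forward (de)serialization to the wrapped optimizers so checkpoint
    # saving and loading have something to call.
    def state_dict(self):
        return [op.state_dict() for op in self.optimizers]

    def load_state_dict(self, state_dicts):
        for op, sd in zip(self.optimizers, state_dicts):
            op.load_state_dict(sd)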

@L1AN0 mind opening a new issue if needed?

I also need this feature, like @L1AN0. Same need as this one (https://github.com/PyTorchLightning/pytorch-lightning/issues/29#issuecomment-611006872).

Basically, I need to change the optimizer after 5 epochs. If I define two optimizers, training_step is executed twice with two different optimizer_idx values, but that is not needed: I only need training_step to be called once per step, since the two stages do not overlap.
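
One possible sketch of that epoch-based switch, assuming the per-optimizer training_step behaviour discussed in this issue. Whether returning None here really skips the inactive optimizer's step depends on the Lightning version, so treat this as a sketch of the intent rather than a verified recipe:

# Inside the LightningModule; self.current_epoch is a LightningModule attribute.
def training_step(self, batch, batch_nb, optimizer_idx):
    # First 5 epochs use optimizer 0, afterwards optimizer 1.
    active_idx = 0 if self.current_epoch < 5 else 1
    if optimizer_idx != active_idx:
        return None  # assumption: a None return skips this optimizer's step
    x, y = batch
    loss = torch.nn.functional.mse_loss(self(x), y)
    return {'loss': loss}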

