Apex: Learning Scheduler

Created on 8 Nov 2018 · 3 Comments · Source: NVIDIA/apex

Essentially, I want to use a learning rate scheduler. Typically the syntax for that is:

scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda_rule)

Here I am using LambdaLR. However, when optimizer is an FP16_Optimizer, this raises an error:

TypeError: FP16_Optimizer is not an Optimizer

This makes total sense. If you look at the base class for schedulers, it contains this check:

        if not isinstance(optimizer, Optimizer):
            raise TypeError('{} is not an Optimizer'.format(
                type(optimizer).__name__))

Now, my questions are:

1) Is there already a way of dealing with this? I am probably not the first one to run into this problem.
2) If not, what would be the best way to implement schedulers for FP16_Optimizer? Copy the code from torch.optim and change it to work with FP16_Optimizer?

TheRevanchist

All 3 comments

You can call the scheduler on the base non-wrapped optimizer. This works for model stuff as well.

yaysummeriscoming on 10 Nov 2018

^ Yes, that's the recommended approach: apply the LR scheduler to the PyTorch optimizer that's being wrapped.

You can do this in one of two ways.
Apply scheduler to bare optimizer, then wrap it:

scheduler = lr_scheduler.LambdaLR(pytorch_optimizer, lr_lambda=lambda_rule)
optimizer = FP16_Optimizer(pytorch_optimizer, other_options...)

or wrap it, then apply the scheduler to the wrapped instance:

optimizer = FP16_Optimizer(pytorch_optimizer, other_options...)
scheduler = lr_scheduler.LambdaLR(optimizer.optimizer, lr_lambda=lambda_rule)

Note that in the second case, you need to pass optimizer.optimizer to extract the "bare" (wrapped) PyTorch optimizer.
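
As a rough illustration of the second option, here is a minimal sketch that runs without Apex installed. `WrappedOptimizer` is a hypothetical stand-in for FP16_Optimizer, not the real class: it only assumes the two properties this thread relies on, namely that the wrapper is not a subclass of `torch.optim.Optimizer` and that it exposes the inner optimizer as `.optimizer`. It also forwards `param_groups` so that the failure is the TypeError from the question. The model, the SGD optimizer, the learning rate, and the body of `lambda_rule` are placeholders chosen for the example.

```python
import torch
from torch import nn
from torch.optim import SGD, lr_scheduler


class WrappedOptimizer:
    """Hypothetical stand-in for FP16_Optimizer (assumption, not the Apex class).

    It is deliberately NOT a subclass of torch.optim.Optimizer, and it exposes
    the wrapped optimizer as `.optimizer`.
    """

    def __init__(self, optimizer):
        self.optimizer = optimizer

    @property
    def param_groups(self):
        # Forwarded so that LambdaLR reaches the isinstance check.
        return self.optimizer.param_groups

    def step(self):
        self.optimizer.step()

    def zero_grad(self):
        self.optimizer.zero_grad()


def lambda_rule(epoch):
    # Illustrative schedule: halve the base learning rate every epoch.
    return 0.5 ** epoch


model = nn.Linear(4, 1)
pytorch_optimizer = SGD(model.parameters(), lr=0.1)
optimizer = WrappedOptimizer(pytorch_optimizer)

# Passing the wrapper itself fails the isinstance check quoted in the question.
try:
    lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda_rule)
    wrapper_rejected = False
except TypeError:
    wrapper_rejected = True

# Passing the wrapped ("bare") optimizer works.
scheduler = lr_scheduler.LambdaLR(optimizer.optimizer, lr_lambda=lambda_rule)

lrs = []
for epoch in range(3):
    # Record the learning rate the inner optimizer uses this epoch.
    lrs.append(pytorch_optimizer.param_groups[0]["lr"])
    loss = model(torch.randn(8, 4)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()   # step through the wrapper
    scheduler.step()   # then advance the schedule once per epoch
```

Because the scheduler holds a reference to the inner optimizer and the wrapper steps that same object, learning rate changes made by the scheduler apply to the updates made through the wrapper. In this sketch the recorded rates halve from one epoch to the next.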

mcarilli on 10 Nov 2018

Indeed, used the second one and it worked perfectly.

Thank you!

TheRevanchist on 15 Nov 2018
