Essentially, I want to use a learning rate scheduler. Typically the syntax for that is:
scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda_rule)
Here I am using the LambdaLR scheduler. However, when optimizer is an FP16_Optimizer, this throws an error:
TypeError: FP16_Optimizer is not an Optimizer
This makes total sense. If you look at the source of the scheduler base class, there is this piece of code:
if not isinstance(optimizer, Optimizer):
    raise TypeError('{} is not an Optimizer'.format(
        type(optimizer).__name__))
Now, my questions are:
1) Is there already a way of dealing with this? I am probably not the first one to run into this problem.
2) If not, what would be the best way to implement schedulers for FP16_Optimizer? Copy the code from torch.optim and change it to work with FP16_Optimizer?
You can call the scheduler on the base non-wrapped optimizer. This works for model stuff as well.
^ Yes, that's the recommended approach: apply the LR scheduler to the PyTorch optimizer that's being wrapped.
You can do this in one of two ways.
Apply the scheduler to the bare optimizer, then wrap it:
scheduler = lr_scheduler.LambdaLR(pytorch_optimizer, lr_lambda=lambda_rule)
optimizer = FP16_Optimizer(pytorch_optimizer, other_options...)
or wrap it first, then apply the scheduler to the inner optimizer held by the wrapper:
optimizer = FP16_Optimizer(pytorch_optimizer, other_options...)
scheduler = lr_scheduler.LambdaLR(optimizer.optimizer, lr_lambda=lambda_rule)
Note that in the second case, you need to pass optimizer.optimizer to extract the "bare" (wrapped) PyTorch optimizer.
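For anyone who wants to try the second variant without a mixed-precision setup, here is a minimal sketch. WrappedOptimizer is a hypothetical stand-in for FP16_Optimizer, not the real class: it is assumed only to hold the real optimizer in .optimizer, to mirror its param_groups, and not to subclass torch.optim.Optimizer. The body of lambda_rule and the SGD settings are also illustrative assumptions.

```python
import torch
from torch.optim import SGD, lr_scheduler


class WrappedOptimizer:
    """Hypothetical stand-in for FP16_Optimizer (NOT the real class).

    It keeps the real optimizer in `.optimizer`, mirrors its param_groups,
    and does not subclass torch.optim.Optimizer.
    """

    def __init__(self, optimizer):
        self.optimizer = optimizer
        self.param_groups = optimizer.param_groups

    def step(self):
        self.optimizer.step()


def lambda_rule(epoch):
    # Illustrative rule: halve the learning rate every epoch.
    return 0.5 ** epoch


params = [torch.nn.Parameter(torch.zeros(1))]
pytorch_optimizer = SGD(params, lr=0.1)
optimizer = WrappedOptimizer(pytorch_optimizer)

# Passing the wrapper itself fails the isinstance check in the scheduler.
try:
    lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda_rule)
    wrapper_rejected = False
except TypeError:
    wrapper_rejected = True

# Passing the inner optimizer works.
scheduler = lr_scheduler.LambdaLR(optimizer.optimizer, lr_lambda=lambda_rule)

# Step through the wrapper, then advance the scheduler by one epoch.
optimizer.step()
scheduler.step()
current_lr = pytorch_optimizer.param_groups[0]["lr"]
```

Because the wrapper and the scheduler share the same underlying optimizer object, the learning rate set by the scheduler is the one used on the next wrapped step.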
Indeed, I used the second one and it worked perfectly.
Thank you!