Pytorch-lightning: Gpus=1 + precision not working when using only certain layers

Created on 4 Nov 2020  ·  3 Comments  ·  Source: PyTorchLightning/pytorch-lightning

🐛 Bug

Making a finetuning model where the backbone isn't training breaks 16-bit precision.

Labels: 3rd-party · Priority P0 · bug / fix · help wanted

All 3 comments

Hi! thanks for your contribution!, great first issue!

Just to keep all the details here: this seems to be a side effect of AMP. When we call self.trainer.scaler.step(optimizer) internally, the scaler runs an inf/NaN check on the gradients of the optimizer's parameters, which is where the assertion is thrown. This check needs to account for whether those parameters are even being updated in this step.
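A common workaround sketch for this situation (not the upstream fix, just an illustration of the issue described above): if the frozen backbone's parameters are never handed to the optimizer, the scaler's inf/NaN check only inspects parameters that actually received gradients. The model layout and names here are hypothetical.

```python
import torch
from torch import nn

# Hypothetical finetuning setup: a frozen backbone plus a trainable head.
backbone = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
head = nn.Linear(8, 2)

# Freeze the backbone so it receives no gradients during training.
for p in backbone.parameters():
    p.requires_grad = False

# Workaround sketch: pass only the trainable parameters to the optimizer,
# so scaler.step(optimizer) never inspects params with no gradients.
trainable = [
    p
    for p in list(backbone.parameters()) + list(head.parameters())
    if p.requires_grad
]
optimizer = torch.optim.SGD(trainable, lr=0.1)

x = torch.randn(4, 8)
loss = head(backbone(x)).sum()
loss.backward()
optimizer.step()
```

With this setup, `trainable` holds only the head's weight and bias, and the backbone's parameters stay untouched by the optimizer step.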

@SeanNaren follow up with pytorch team
