Pytorch-lightning: Gradient accumulation fails with fp16 precision

Created on 29 Oct 2020 · 3 Comments · Source: PyTorchLightning/pytorch-lightning

🐛 Bug

Setting accumulate_grad_batches > 1 and precision = 16 causes the following error:

RuntimeError: unscale_() has already been called on this optimizer since the last update().

Please reproduce using the BoringModel and post here

https://colab.research.google.com/drive/1_7pxqPlpc79k0VYlRdtRXE0JQbhSBWHy?usp=sharing
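A self-contained script along the same lines as the linked notebook (the model and data below are illustrative stand-ins, not the exact notebook contents) also reproduces it:

```python
import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl


class RandomDataset(Dataset):
    """Random tensors standing in for real data."""

    def __init__(self, size=64, length=256):
        self.data = torch.randn(length, size)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


class BoringModel(pl.LightningModule):
    """Minimal LightningModule: one linear layer, sum of outputs as the loss."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(64, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        return self(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


if __name__ == "__main__":
    trainer = pl.Trainer(
        gpus=1,
        precision=16,               # native AMP
        accumulate_grad_batches=2,  # any value > 1 triggers the error on 1.0.4
        max_epochs=1,
    )
    trainer.fit(BoringModel(), DataLoader(RandomDataset(), batch_size=8))
```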

Environment

  • CUDA:
    • GPU: Tesla T4
    • available: True
    • version: 10.1
  • Packages:
    • numpy: 1.18.5
    • pyTorch_debug: False
    • pyTorch_version: 1.6.0+cu101
    • pytorch-lightning: 1.0.4
    • tqdm: 4.41.1
  • System:
    • OS: Linux
    • architecture: 64bit
    • processor: x86_64
    • python: 3.6.9
    • version: #1 SMP Thu Jul 23 08:00:38 PDT 2020

Labels: bug / fix, help wanted

All 3 comments

Thanks for taking this @ydcjeff!

I'm only getting this error on 1.0.4 (after I upgraded a few hours ago). I downgraded to 1.0.3 and the error is not there.
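For context (not part of the thread's own diagnosis): the message is raised by torch.cuda.amp.GradScaler, which allows unscale_() to be called at most once per optimizer between update() calls. The sketch below shows the pattern the scaler expects when accumulating gradients in plain PyTorch, with a toy model and random data as assumptions; it is not Lightning's internal implementation.

```python
import torch

# Toy setup (illustrative only); requires a CUDA device, matching the report.
model = torch.nn.Linear(64, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()
accumulate = 2

for i in range(8):
    batch = torch.randn(8, 64, device="cuda")
    with torch.cuda.amp.autocast():
        loss = model(batch).sum() / accumulate
    scaler.scale(loss).backward()

    if (i + 1) % accumulate == 0:
        # unscale_() may run at most once per optimizer between update() calls,
        # so it belongs here, just before step(), not on every accumulation batch.
        scaler.unscale_(optimizer)
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
```

Calling scaler.unscale_(optimizer) on every micro-batch instead of once per optimizer update is exactly what produces the RuntimeError quoted above.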
