Ignite: No optimizer.zero_grad() in supervised trainer?

Created on 21 Jul 2020 · 2 Comments · Source: pytorch/ignite

❓ Questions/Help/Support

Hi,

I was wondering about something. In almost every explanation I find, writing training code for a PyTorch network involves:

a prediction step: y_pred = model(x)
calculation of the loss: loss = loss_fn(y_pred, y)
calculation of the gradients: loss.backward()
taking an optimization step using the calculated gradients: optimizer.step()
zeroing the accumulated gradients: optimizer.zero_grad()
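
Put together, the five steps listed above might look like the following minimal sketch. The toy model, data, learning rate, and the train_step name are assumptions made for illustration; they are not taken from Ignite.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model and data, for illustration only
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)


def train_step(x, y):
    model.train()
    y_pred = model(x)          # prediction step
    loss = loss_fn(y_pred, y)  # calculation of the loss
    loss.backward()            # calculation of the gradients
    optimizer.step()           # parameter update from the gradients
    optimizer.zero_grad()      # reset gradients before the next call
    return loss.item()


# Repeat the step a few times on the same batch
losses = [train_step(x, y) for _ in range(5)]
```

Calling zero_grad() at the start of the step instead of the end gives the same result here, as long as the gradients are cleared once between one backward pass and the next.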

In the create_supervised_trainer() function, the last step, optimizer.zero_grad(), does not seem to be included. Is this on purpose, or is it handled elsewhere in the code?

I want to build my own trainer and want to be sure whether zero_grad is needed.

question


nwschurink

Most helpful comment

Ah, my mistake, I overlooked it!

It's like you read my mind, as I was indeed intending to use gradient accumulation. Since zero_grad is written there after the step function, I was expecting to find it after the step in create_supervised_trainer() as well.

Thanks for your help! 👍

nwschurink on 21 Jul 2020


All 2 comments

@nwschurink thanks for asking. optimizer.zero_grad() is called earlier in the update function, just above the lines you were looking at:
https://github.com/pytorch/ignite/blob/68d3ba1baa70d16d7bc35771538a3213300177c4/ignite/engine/__init__.py#L97

> I want to build my own trainer and want to be sure whether zero_grad is needed.

Yes, zero_grad is needed :)
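
As a small illustration of why it is needed: PyTorch adds each new gradient to whatever is already stored in .grad, so a backward pass without zeroing mixes in the gradients of earlier passes. The tensor w and the function 3 * w below are made up for this sketch.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

# First backward pass: d(3*w)/dw = 3
(3 * w).sum().backward()
first = w.grad.clone()

# Second backward pass without zeroing: the new gradient is added to the old one
(3 * w).sum().backward()
accumulated = w.grad.clone()

# Zero the stored gradient, then run backward again
w.grad.zero_()
(3 * w).sum().backward()
after_reset = w.grad.clone()

print(first.item(), accumulated.item(), after_reset.item())
```

optimizer.zero_grad() does this reset for every parameter the optimizer manages.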

If you would like to perform gradient accumulation, it is possible to zero the grads just after the step, as explained here:
https://pytorch.org/ignite/faq.html#gradients-accumulation
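
A rough sketch of that idea in plain PyTorch, not copied from the FAQ: gradients are summed over several small batches, and the optimizer steps and zeroes them only every accumulation_steps iterations. The model, data, and the value 4 are assumptions for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model and data, for illustration only
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

accumulation_steps = 4  # assumed value
batches = [(torch.randn(2, 4), torch.randn(2, 1)) for _ in range(8)]

num_updates = 0
optimizer.zero_grad()
for iteration, (x, y) in enumerate(batches, start=1):
    # Scale the loss so the accumulated gradient is an average over the group
    loss = loss_fn(model(x), y) / accumulation_steps
    loss.backward()  # adds this batch's gradient to .grad
    if iteration % accumulation_steps == 0:
        optimizer.step()       # update using the accumulated gradients
        optimizer.zero_grad()  # zero the grads just after the step
        num_updates += 1
```

In a custom Ignite update function, the loop body would presumably become the function body, with engine.state.iteration taking the place of the iteration counter.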