Pytorch-lightning: Log epoch as step when on_epoch=True and on_step=False

Created on 27 Aug 2020 · 5 Comments · Source: PyTorchLightning/pytorch-lightning

🚀 Feature

When using the new structured Result API, it is no longer possible to force PL to report the epoch to loggers as the current step, instead of the global step (i.e. the elapsed number of training steps).

Motivation


This leads to confusing plots when viewing results in a tool that uses the step count as the x-axis by default (e.g. TensorBoard's scalars view). Intuitively, one would expect .log(..., on_epoch=True, on_step=False) to count epochs, not steps.

It is possible to obtain this behaviour by overriding both training_epoch_end _and_ validation_epoch_end and returning an EvalResult with a "step" metric. Unfortunately, this adds back a fair amount of boilerplate and forfeits many of the nice metric aggregation features PL offers when *_epoch_end is not implemented.

Pitch


Either a) allow overriding the step for a given result, or b) default that step to current_epoch when on_epoch=True and on_step=False.
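
For concreteness, here is a minimal sketch of the kind of call this pitch is about, written against the 0.9.x structured Result API referenced above; the `train_loss` metric name and the `_compute_loss` helper are illustrative, not from the issue. Today the logged value is plotted against the global step; under option (b) it would be plotted against `self.current_epoch`.

```python
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        loss = self._compute_loss(batch)  # hypothetical helper

        # The value is aggregated once per epoch; the question in this
        # issue is which x-axis value the logger receives for that point.
        result = pl.TrainResult(minimize=loss)
        result.log("train_loss", loss, on_epoch=True, on_step=False)
        return result
```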

Alternatives

  1. Use a "step" key with the old dict-based logging system (not documented, but worked as of 0.8.3)
  2. Override train/validation_epoch_end and return an EvalResult with .log('step', self.current_epoch) (see the sketch below).
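
A rough sketch of alternative 2, assuming the 0.9.x Result API (EvalResult); the metric names and the `_compute_loss` helper are illustrative, and exact signatures may differ between releases. As noted above, overriding the hook means the automatic epoch aggregation is lost for anything not handled there by hand.

```python
import torch
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    def validation_step(self, batch, batch_idx):
        loss = self._compute_loss(batch)  # hypothetical helper
        result = pl.EvalResult(checkpoint_on=loss)
        result.log("val_loss", loss, on_epoch=True, on_step=False)
        return result

    def validation_epoch_end(self, outputs):
        # Return a fresh EvalResult whose "step" metric is the epoch
        # index, so loggers plot epoch-level metrics against the epoch
        # rather than the global step. Any other epoch-level metrics
        # would now have to be aggregated manually here.
        result = pl.EvalResult()
        result.log("step", torch.tensor(float(self.current_epoch)))
        return result
```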

Additional context


Original discussion: https://forums.pytorchlightning.ai/t/is-there-a-way-to-only-log-on-epoch-end-using-the-new-result-apis/74/3

Labels: ResultObj, enhancement, help wanted

All 5 comments

Hi! Thanks for your contribution, great first issue!

Any ideas for the case on_epoch=True and on_step=True?

Edit:
Looking through the code, it would be simple enough to make on_epoch=True log using epochs, but that wouldn't let it keep using steps when on_step is also True. Perhaps the two should be made mutually exclusive? I can't think of a situation where I would log both on the same graph even if they both use steps for x.

Edit 2:
Never mind, they are logged under two different keys, so there is no conflict; it's a simple one-line change.

Is this behavior really desired? When graphing, don't you want everything to be on the same scale?
If this change happens, the metrics logged per epoch won't be visually comparable with the ones logged per step...

That makes it really hard to compare apples to apples.

I don't think we should make this change.
@PyTorchLightning/core-contributors ?

As per the Discourse discussion, logs could be epoch-only xor step-only, so everything would still be on the same scale (modulo whatever ephemeral logs show up on the progress bar). However, it is currently not possible to log _everything_ on epoch without overriding *_epoch_end.

This changed with the new logging API. Please upgrade to 1.0.2. Feel free to reopen if needed.
