Ignite: Flexible access to state

Created on 19 Mar 2018  路  16Comments  路  Source: pytorch/ignite

The introduction of State (#80, #83) which is passed to event handlers is very welcome as it is a critical requirement. The flexibility with which state is passed around, however, could still be improved. Currently:

  1. Event handlers can only access the state of the engine issuing the event.
  2. There is no access to the state in the training and validation functions.

An argument against making state a member of the engine was that in so not doing, the state is explicitly ephemeral. However, this enforces the assumption that an engines state should only every be accessed by that engine's event handlers.

One example of where (1) is an obstacle is on attempting to identify current epoch during training. It is useful to read the current epoch on completion of validation when saving a checkpoint, in order to record the epoch. However, the current epoch is a state attribute of the training engine; currently, there is no way to access that state, except when handling a signal emitted by the training engine.

It would be convenient to have state exposed in Trainer and Evaluator as self.state so that a user may have direct access to it, outside of callbacks that are specific to the calling engine.

If state is thus exposed, it may be cleaner to also not pass engine state by default to handlers; the user could pass any state (eg. trainer.state) through the args or *kwargs of add_event_handler, instead -- if some state is needed by the event handler.

0.1.0 help wanted

All 16 comments

On this point, I was quickly trying to hack up a version of engine.save() and engine.load and found that having a state on the engine was desirable for a clean API.

For example, if we want the API engine.save() and engine.load(file_path) the engine will need to store and reload the state, which in it's current form can't be done as the state is ephemeral.

I think we should nail down where state belongs and how it's passed around before 0.1.0

cc @jasonkriss who I know has thoughts on this.

@veugene - even if state was a member of the engine, how would this give you access inside the update function, and what is the use case for this?

If the state were a member of the engine, then the update function would be the one place where it would not be easy to access unless the function is passed the state as an argument.

Elsewhere, one could do something like the following to pass state to an event handler, if the handler requires a state:


# Some arbitrary event handler that requires engine state.
def handler_using_state(state):
    print("Epoch {}".format(state.epoch))

trainer.add_event_handler(Events.EPOCH_COMPLETED, handler_using_state, trainer.state)
evaluator.add_event_handler(Events.STARTED, handler_using_state, trainer.state)

# A function, handling an event, that does not need engine state.
class some_custom_class(object):
    ...
    def reset(self):
        self._some_attribute = 0
obj = some_custom_class()

evaluator.add_event_handler(Events.COMPLETED, obj.reset)

If we move state to be on the engine, you wouldn't even need to pass it to the handler as the handler already gets a reference to the engine so can just call engine.state.

I believe this would solve point 1 above, but point 2 is still an open issue for you? Do you have a use-case for when you'd want to access the state in the update function?

Additionally (unrelated), but IMO, if saving and loading of engines is something we want (which I think would be very useful), then state cannot be ephemeral anymore and must be a property of the engine.
(of course as @veugene points out there are other issues with serializing the engine, such as the update functions etc.).

Let's wait for @jasonkriss to weigh in, otherwise would you be able to send a PR?

Do you have a use-case for when you'd want to access the state in the update function?

That really has to do with the discussion in #118 which it appears you've seen now.

would you be able to send a PR?

Yes, it shouldn't take long once I get around to it.

@alykhantejani @veugene I'm now quite convinced of attaching state to the engine and passing only the engine to event handlers. The inconvenience this causes for metrics will be obviated if we add map_fn (or something like that) as I proposed in #118.

Just to summarize, my vote is for the following changes:

  1. attach state to engine
  2. pass engine as the only argument to all event handlers
  3. pass engine, batch to update functions

What do you all think?

@jasonkriss sounds good to me

Just to summarize, my vote is for the following changes:

  1. attach state to engine
  2. pass engine as the only argument to all event handlers
  3. pass engine, batch to update functions

What do you all think?

That looks like a clean way to do this.

Could we enclose the purpose of what is State and what it is not ? Currently, it is

class State(object):
    def __init__(self, **kwargs):
        self.iteration = 0
        self.output = None
        self.batch = None
        for k, v in kwargs.items():
            setattr(self, k, v)

and we can pass everything to it, like dataloader, metrics etc.
PS: My appologizes, if I missed some discussions before

@vfdev-5 Do you mean, could we define what is expected in State?

Since State is always associated with an Engine and an Engine always has a data loader and an update function, it makes sense to expect that State should always have an output and a batch attribute. It is also intended as an object for passing around arbitrary data, as determined by the user. So allowing it to store anything seems fine.

Might there be any reason to restrict its contents? I think the definition is a good one right now -- simple and flexible. The only issue I currently expect is with saving and loading state; then, not all types of contents may be handled. That would probably be a discussion in #62. Perhaps some separation between "savable" and "volatile" contents would be useful, once we get to engine/state saving.

I agree with everything @veugene said. We may need to set some restrictions on what can go in state for the purposes of serialization but let's defer that discussion to #62.

+1 to @veugene's summary

I have a little impression that State is a serializable version of Engine :)

OK I'll go ahead and prepare a PR.

Fixed in #123

Was this page helpful?
0 / 5 - 0 ratings