Ignite: Does process_function in Engine have a strict signature?

Created on 21 Nov 2019 · 7 Comments · Source: pytorch/ignite

I checked all the docs, but I can't tell whether there are any restrictions on the process_function passed to Engine.

For example, I want to resolve the following questions:

  • Must my process_function have engine as its first argument? If so, it would be great to mention this in the docs.
  • Can I use another signature instead of process_function(engine, labelled_batch)? For instance, process_function(engine, labelled_batch, device)? In this great code, an additional function, _prepare_batch, is used because of the strict signature of process_function (otherwise, I would pass device to process_function as a third argument).

All 7 comments

@Oktai15 well, probably it is not very clearly stated, but it is said here:
https://pytorch.org/ignite/engine.html#ignite.engine.Engine

process_function (callable) – A function receiving a handle to the engine and the current batch in each iteration, and returns data to be stored in the engine’s state.

But true, we should probably improve this part and clearly state the only accepted signature:

def process_function(engine, batch):
    pass

Can I use another signature instead of process_function(engine, labelled_batch)? For instance, process_function(engine, labelled_batch, device)

Actually, the restricted signature is not a real problem, IMO. You can bind your other arguments with functools.partial:

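from functools import partial

from ignite.engine import Engine

def process_function(engine, batch, device):
    # ...
    pass

# Bind device up front so the engine can still call the function as (engine, batch).
device = "cuda"
trainer = Engine(partial(process_function, device=device))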

What do you think?
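
For anyone who wants to check the functools.partial approach without installing ignite, here is a minimal sketch. It only assumes that the engine calls the function as process_function(engine, batch); the returned dict and the use of None in place of a real engine are illustrative choices, not part of ignite.

```python
from functools import partial


def process_function(engine, batch, device):
    # Illustrative body: report which device the batch would be moved to.
    return {"batch": batch, "device": device}


# Bind the extra argument up front; the result is callable as step(engine, batch).
step = partial(process_function, device="cuda")

# None stands in for a real Engine instance here (assumption for this sketch).
output = step(None, [1, 2, 3])
print(output)
```

The bound value stays visible on the partial object as step.keywords, which can help when debugging which device a trainer was built with.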

PS. If you would like to help improve the documentation, feel free to send some PRs :)

@vfdev-5, I am not highly skilled in Python, but I have the following ideas:

  • We need to update docs using typing. Something like this:

process_function (Callable[[Engine, Any], Any]) – A function receiving a handle to the engine and the current batch in each iteration, and returns data to be stored in the engine’s state.

What do you think?

  • We need to change signature of process_function. My suggestion:
def process_function(engine, batch, *args, **kwargs)

I want to note that we already use this idea implicitly for handlers. But with the current design, process_function and handlers take their extra arguments in different ways.

We need to update docs using typing

Actually, there is an issue about improving the code with typing: https://github.com/pytorch/ignite/issues/651. So, yes, this improvement could help.

However, updating only the documentation with typing, without updating the code, is not sufficient, IMO...
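
As a rough illustration of what a typed signature could look like, here is a sketch using the standard typing module. The Engine class below is a hypothetical stand-in used only for annotations, and the ProcessFunction alias name is an assumption for this example, not something ignite defines.

```python
from typing import Any, Callable


class Engine:
    """Hypothetical stand-in for ignite.engine.Engine, used only for annotations."""


# A process function takes (engine, batch) and returns the output to store in state.
ProcessFunction = Callable[[Engine, Any], Any]


def train_step(engine: Engine, batch: Any) -> Any:
    # Illustrative body: echo the batch back as the step output.
    return batch


# A static type checker should flag a function with a different arity on this line.
step: ProcessFunction = train_step
result = step(Engine(), [1, 2, 3])
```

Note that the parameter types go in an inner list: Callable[[Engine, Any], Any], with the return type as the last item.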

We need to change signature of process_function

For now, I see no real need to add other arguments. We can keep this as an idea to think about. IMO, some practical use cases should drive such a change...

The process_function already depends on "global" variables such as the model, which is always used without being an argument. From this point of view, I don't see the use of adding new arguments.
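
One way to avoid true globals while keeping the (engine, batch) signature is a factory function that captures the model and device in a closure. The sketch below is only an illustration: make_train_step, the returned dict, and the plain function used as a "model" are all assumptions, not ignite APIs.

```python
def make_train_step(model, device):
    """Build a step function with the (engine, batch) signature.

    model and device are captured by the closure instead of being
    passed on every call.
    """
    def train_step(engine, batch):
        # Illustrative body: apply the captured model to every item of the batch.
        return {"device": device, "predictions": [model(x) for x in batch]}

    return train_step


# A plain function stands in for a real model (assumption for this sketch).
def double(x):
    return 2 * x


train_step = make_train_step(double, device="cpu")
output = train_step(None, [1, 2, 3])  # None stands in for the engine
```

The resulting train_step could then be handed to an engine in the same way as any other two-argument step function.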

@Oktai15 I am closing the issue as solved. If you think we need to discuss it further, please feel free to reopen it.

Hi @vfdev-5, I am trying to use this great package to accelerate my development cycle. However, I cannot quite understand why process_function() must take engine as an input argument. Its only use is in the _run_once_on_dataset() method, where it computes self.state.output, and the engine argument (self, when called inside the class) is not used for anything or to store anything. Are there any deeper design thoughts behind it? Or is it only there to show deliberately that the process is associated with a specific engine? Thanks in advance for your comments.

@ZhiliangWu having engine as an argument of process_function makes it easy to fetch the current state: iteration, epoch, custom variables, etc. For example, suppose the total loss during training is composed of two parts, with an alpha parameter that is also updated during training:


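from ignite.engine import Engine, Events

# num_epochs (total number of training epochs) is assumed to be defined elsewhere.

def train_step(engine, batch):
    # ...
    loss_1 = ...
    loss_2 = ...
    # Add the weighted second loss term only in the second half of training.
    if engine.state.epoch > num_epochs / 2:
        total_loss = loss_1 + engine.state.alpha * loss_2
    else:
        total_loss = loss_1
    # ...


trainer = Engine(train_step)
# Custom variable stored on the engine's state.
trainer.state.alpha = 0.1

@trainer.on(Events.ITERATION_COMPLETED)
def update_alpha(engine):
    # Increase alpha slightly after every iteration.
    engine.state.alpha += 0.001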

This is just an example, but you can imagine other use cases where access to the trainer's state is useful.
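
To see the same idea run end to end without ignite, here is a toy sketch. MiniEngine is a deliberately simplified stand-in written for this example and is not ignite's implementation; the tuple batches, the alpha increment of 0.5, and num_epochs = 4 are arbitrary illustrative values.

```python
from types import SimpleNamespace


class MiniEngine:
    """Toy stand-in for an engine; NOT ignite's implementation."""

    def __init__(self, process_function):
        self.process_function = process_function
        self.state = SimpleNamespace(epoch=0, iteration=0, output=None)
        self.iteration_handlers = []

    def run(self, data, max_epochs):
        for epoch in range(1, max_epochs + 1):
            self.state.epoch = epoch
            for batch in data:
                self.state.iteration += 1
                # The engine passes itself so the step can read self.state.
                self.state.output = self.process_function(self, batch)
                for handler in self.iteration_handlers:
                    handler(self)
        return self.state


num_epochs = 4


def train_step(engine, batch):
    loss_1, loss_2 = batch  # placeholder "losses" for the sketch
    # Use the weighted second term only in the second half of training.
    if engine.state.epoch > num_epochs / 2:
        return loss_1 + engine.state.alpha * loss_2
    return loss_1


def update_alpha(engine):
    # Runs after every iteration and mutates the shared state.
    engine.state.alpha += 0.5


trainer = MiniEngine(train_step)
trainer.state.alpha = 0.0
trainer.iteration_handlers.append(update_alpha)
state = trainer.run([(1.0, 2.0)], max_epochs=num_epochs)
```

Because both train_step and update_alpha receive the engine, they can share epoch and alpha through its state without any module-level variables.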
