Ignite: Are there any ways to filter out or ignore a batch in engine?

Created on 30 Apr 2020  ·  7 Comments  ·  Source: pytorch/ignite

❓ Questions/Help/Support

I'm really new to the ML area, and I'm trying to train a network on a dataset where sampling (I just use torch's weighted sampling) can produce really large batches. The batch size is fixed, but the amount of data in a batch is sometimes very large.
Defining a customized sampler seems pretty complicated, so what I originally did was simply ignore a batch when the data in it was too large. Now I'm trying to use ignite as the framework to train the network, and I don't know how to do this there.
I think it might relate to event_filter? But that seems to be triggered by events, so is there any way to pass in a batch instead?

Thanks so much for your time!

enhancement help wanted

All 7 comments

@CDWJustin Thank you for your question!

I didn't catch this part:

"Though the batch size is fixed the data size in the batch will be really large sometimes."

So, I understand you want to ignore some batches during the loop. I don't think we have this feature in the engine. @vfdev-5, can you confirm?

@CDWJustin currently, there is no easy way to skip an iteration in ignite.

With the current code base, the process function has to return an output anyway, so skipping a batch can be done as follows:


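from ignite.engine import Engine

def train_step(engine, batch):
    # is_correct_batch is a user-defined check, not part of ignite
    if not is_correct_batch(batch):
        # Return the previous output so that other handlers using state.output won't break,
        # but note that metrics computation will still be impacted.
        return engine.state.output

    # Otherwise, continue as if everything is OK: run the usual training step and return its output

trainer = Engine(train_step)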
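The is_correct_batch check is not defined anywhere in this thread. As a minimal sketch, assuming a batch is a list of variable-length samples and that "too large" means the total number of elements exceeds a budget, it could look like the following. The name MAX_TOTAL_ELEMENTS and its value are hypothetical.

```python
from typing import Sequence

# Hypothetical budget: maximum total number of elements allowed in one batch.
MAX_TOTAL_ELEMENTS = 10_000


def is_correct_batch(batch: Sequence[Sequence], max_total: int = MAX_TOTAL_ELEMENTS) -> bool:
    """Return True if the summed length of all samples fits within the budget."""
    total = sum(len(sample) for sample in batch)
    return total <= max_total


small_batch = [[0] * 100, [0] * 200]      # 300 elements in total
large_batch = [[0] * 9_000, [0] * 2_000]  # 11,000 elements in total
print(is_correct_batch(small_batch))  # True
print(is_correct_batch(large_batch))  # False
```

With real tensors the length-based sum would probably be replaced by an element count of each tensor; that detail depends on how the dataset collates its batches.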

In general, when skipping a batch you can output anything you like: the previous output (as in the example), zeros, etc.
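A different way to avoid returning a placeholder output is to drop unwanted batches before they ever reach the process function. The sketch below is plain Python and is not an ignite API: FilteredBatches and the keep predicate are hypothetical names, and the wrapper is assumed to sit around any iterable of batches, such as a data loader.

```python
from typing import Callable, Iterable, Iterator


class FilteredBatches:
    """Wrap an iterable of batches and yield only those accepted by `keep`."""

    def __init__(self, batches: Iterable, keep: Callable[[object], bool]):
        self.batches = batches
        self.keep = keep

    def __iter__(self) -> Iterator:
        # Each call starts a fresh pass over the underlying iterable,
        # so the wrapper can be iterated once per epoch.
        return (batch for batch in self.batches if self.keep(batch))


batches = [[1, 2], [1, 2, 3, 4, 5], [3]]
filtered = FilteredBatches(batches, keep=lambda b: len(b) <= 2)
print(list(filtered))  # [[1, 2], [3]]
```

Because the wrapper has no fixed length, the number of iterations per epoch is not known in advance. Recent ignite versions accept an epoch_length argument in Engine.run, which may be needed here; check this against the installed version.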

There is a similar request here: https://discuss.pytorch.org/t/how-to-make-ignite-trainer-handle-error-batch-instead-of-stop-the-program/75886

So, let's improve this part in ignite. I would suggest adding another method, terminate_iteration(), similar to the existing terminate_epoch() and terminate().

### Engine.__init__
self.should_terminate = False
self.should_terminate_single_epoch = False
self.should_terminate_single_iteration = False

### Create new method in Engine
def terminate_iteration(self):
    self.should_terminate_single_iteration = True

### Engine._run_once_on_dataset
# https://github.com/pytorch/ignite/blob/master/ignite/engine/engine.py#L734
self._fire_event(Events.ITERATION_STARTED)
self.state.output = self._process_function(self, self.state.batch)
if self.should_terminate_single_iteration:
    # Reset the flag and skip the rest of this iteration only;
    # a break here would leave the data loop and end the whole epoch.
    self.should_terminate_single_iteration = False
    continue
self._fire_event(Events.ITERATION_COMPLETED)

cc @sdesrozis

Excellent and easy to do 👍🏻

In the previous example, we still need to output something, e.g. None. The difference is that the engine won't go into the other handlers that are attached to ITERATION_COMPLETED and use state.output.

Update: maybe it can also be caught after Events.GET_BATCH_COMPLETED:
https://github.com/pytorch/ignite/blob/b4f1035b6ba575ca32bafa0b4a083ff39983495e/ignite/engine/engine.py#L696
in case we would like to stop even before going into _process_function.
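To make the proposed behaviour concrete, here is a toy loop that mimics the flag mechanism. It is a sketch only and not ignite's Engine: ToyEngine, its handler lists, and skip_large are hypothetical names, and the loop is assumed to skip the remainder of an iteration whenever the flag is set. The flag is checked both after a batch is fetched and after the process function runs.

```python
class ToyEngine:
    """Toy stand-in for an engine loop; NOT ignite's actual Engine."""

    def __init__(self, process_function):
        self.process_function = process_function
        self.should_terminate_single_iteration = False
        self.on_batch = []      # handlers run after a batch is fetched
        self.on_completed = []  # handlers run after an iteration completes
        self.output = None

    def terminate_iteration(self):
        self.should_terminate_single_iteration = True

    def _skip_requested(self):
        # Read and reset the flag so that it affects one iteration only.
        requested = self.should_terminate_single_iteration
        self.should_terminate_single_iteration = False
        return requested

    def run(self, data):
        for batch in data:
            for handler in self.on_batch:
                handler(self, batch)
            if self._skip_requested():
                continue  # skipped before the process function runs
            self.output = self.process_function(self, batch)
            if self._skip_requested():
                continue  # skipped before the "completed" handlers run
            for handler in self.on_completed:
                handler(self)


def skip_large(engine, batch):
    # Request a skip for batches with more than 3 items.
    if len(batch) > 3:
        engine.terminate_iteration()


processed, completed = [], []


def step(engine, batch):
    processed.append(batch)
    return sum(batch)


engine = ToyEngine(step)
engine.on_batch.append(skip_large)
engine.on_completed.append(lambda e: completed.append(e.output))
engine.run([[1, 2], [1, 2, 3, 4], [5]])
print(processed)  # [[1, 2], [5]]
print(completed)  # [3, 5]
```

The oversized batch never reaches the process function, and the handlers that read the output run only for the two batches that were processed.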

Thanks!!
BTW, if I installed ignite with conda, is there a way to get the changes you make to the master branch? Or do I need to build it from source?

@CDWJustin you can install the nightly release with conda (https://anaconda.org/pytorch-nightly/ignite):

conda install ignite -c pytorch-nightly

Currently, it corresponds to https://travis-ci.org/github/pytorch/ignite/builds/681588654

Cool, thanks!
