Ignite: [FR] LR Finder

Created on 26 Aug 2019 · 14 Comments · Source: pytorch/ignite

There is a repository that already has an ignite version of LRFinder. It would be helpful to add this class to ignite/contrib/engines.

cc @ItamarWilf

enhancement help wanted

All 14 comments

Hi, thanks for finding the code useful. It's mostly based on the great work of @davidtvs.

Right now it's not really implemented as an ignite Engine. I can reimplement it as an Engine, but it would probably have less functionality (no matplotlib-based plotting, no loss measurement on the val loader, etc.).

What do you prefer?

cc @vfdev-5

@ItamarWilf that's true, I didn't analyse your code correctly. At first I was thinking of a handler that would be attached to an existing trainer. But seeing the example in your code, lr_finder.range_test(dataloader, end_lr=100, num_iter=100), I thought of a class inheriting from Engine.

Thinking more about LRFinder as an Engine, this probably limits its usage to supervised single-model training. If we could extend it to any trainer, with LRFinder as a handler that can be attached to a trainer (and, optionally, the user can pass an evaluator), it would be more interesting, IMO. Such a handler can keep the matplotlib plotting methods...

What do you think?

Even in the simplest case, the lr_finder needs both a dataloader and an optimizer to run the trainer for some iterations while increasing the lr, and that trainer must return some loss value, which already limits the trainer somewhat. Then, in order to have any effect on the trainer, it must reset the optimizer's lr using some guess for the correct lr (like the suggestion in fast.ai), which takes control away from the user.
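To make the "increase the lr, record the loss, then guess an lr" idea concrete, here is a minimal sketch in plain Python. The function names exponential_lr_schedule and suggest_lr are hypothetical, and the steepest-drop rule is only one simple heuristic in the spirit of the fast.ai suggestion mentioned above, not necessarily the rule fast.ai or the linked repository uses.

```python
def exponential_lr_schedule(start_lr, end_lr, num_iter):
    """Return num_iter learning rates growing geometrically from start_lr to end_lr."""
    if num_iter < 2:
        raise ValueError("num_iter must be at least 2")
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_iter - 1)) for i in range(num_iter)]


def suggest_lr(lrs, losses):
    """Assumed heuristic: return the lr at the start of the steepest loss drop."""
    if len(lrs) != len(losses) or len(lrs) < 2:
        raise ValueError("need at least two (lr, loss) pairs of equal length")
    # Loss change between consecutive iterations; the most negative is the steepest drop.
    diffs = [losses[i + 1] - losses[i] for i in range(len(losses) - 1)]
    steepest = min(range(len(diffs)), key=diffs.__getitem__)
    return lrs[steepest]
```

In practice the recorded losses are usually smoothed before a rule like this is applied, since raw per-batch losses are noisy.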

Looking at more general uses like the DCGAN example (which has 2 optimizers, with potentially 2 unique learning rates), I'm not sure it can _easily_ work for any trainer.

Here's a link to a supervised single-model implementation. The example in the docstring might help explain the idea. @vfdev-5 Let me know what you think.

@ItamarWilf okay, I see, thanks for the clarifications. I understand that LRFinder needs, in any case, an optimizer to vary the learning rate and a trainer to get the training loss, and that we need to save and restore the initial state after running the LRFinder.
My only point is to somehow replace this line: https://github.com/ItamarWilf/ignite/blob/7812d507e3a35d1dbc6f6e8db441d81e6234b9df/ignite/contrib/engines/lr_finder.py#L59

Even if LRFinder aims to support supervised single-model training, it would be more interesting to keep the trainer's processing function independent and use the user's trainer just to provide the loss in response to the input learning rate.

An API as a handler:

trainer = Engine(processing_fn)

lrfinder = LRFinder(model, optimizer, output_transform=lambda output: output['loss'], **other_options)

lrfinder.attach(trainer)
trainer.run(train_loader, max_epochs=1)

# plot results
# detach lrfinder

Anyway, let me think more about this and its API.
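For illustration, the handler idea above could look roughly like the following. This is a sketch under assumptions, not the code from the linked repository: LossVsLRRecorder is a hypothetical name, it relies only on the facts that an ignite handler is a callable receiving the engine, that the last processing output is available as engine.state.output, and that optimizers expose param_groups. The SimpleNamespace objects are stand-ins so the example runs without ignite or PyTorch.

```python
from types import SimpleNamespace


class LossVsLRRecorder:
    """Hypothetical handler: records (lr, loss) pairs and steps through a list of lrs."""

    def __init__(self, optimizer, lrs, output_transform=lambda output: output):
        self.optimizer = optimizer
        self.lrs = list(lrs)
        self.output_transform = output_transform
        self.history = []
        self._step = 0
        self._set_lr(self.lrs[0])

    def _set_lr(self, lr):
        for group in self.optimizer.param_groups:
            group["lr"] = lr

    def __call__(self, engine):
        # The trainer only has to expose its loss; its processing function stays untouched.
        loss = float(self.output_transform(engine.state.output))
        self.history.append((self.lrs[self._step], loss))
        self._step += 1
        if self._step < len(self.lrs):
            self._set_lr(self.lrs[self._step])
        else:
            engine.terminate()  # all lrs tried: ask the engine to stop


# Stand-ins (assumptions) for an optimizer and an engine.
optimizer = SimpleNamespace(param_groups=[{"lr": 0.5}])
terminate_calls = []
engine = SimpleNamespace(
    state=SimpleNamespace(output={"loss": 2.0}),
    terminate=lambda: terminate_calls.append(True),
)

recorder = LossVsLRRecorder(optimizer, [0.1, 1.0], output_transform=lambda o: o["loss"])
recorder(engine)                    # simulates the end of iteration 1
engine.state.output = {"loss": 3.0}
recorder(engine)                    # simulates the end of iteration 2
```

With a real trainer, such a callable would be registered with trainer.add_event_handler(Events.ITERATION_COMPLETED, recorder) and removed again once the range test is done.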

@vfdev-5 hmm, now I think I got what you mean. I will try to make something more along these lines. Thanks for the feedback :)

@vfdev-5 okay, I reimplemented the LRFinder as a handler that can be attached to any engine, as long as some loss value is returned from the processing_fn, much like the API you suggested.

Could you take a look at the code and the example notebook and let me know what you think?

@ItamarWilf

Could you take a look at the code and the example notebook and let me know what you think?

A small problem with your code: after running the trainer with LRFinder attached, the trainer's should_terminate is set to True, hence next time the trainer is run, it will terminate right at the beginning.

Another problem is that running the trainer for multiple epochs doesn't work unless I do trainer.run(max_epochs=large_enough_number). See https://github.com/ItamarWilf/ignite/blob/62d71127a2fec2f9e9503146600f29e2ac75f3b4/ignite/engine/engine.py#L430. It compares the current epoch with max_epochs, not with state.max_epochs.

Also, I suggest giving this lr finder a more specific name (e.g. FastAILRFinder, or GuggerLRFinder, since Gugger seems to be the author's family name, or something else). It is obviously a very dirty and unpolished way to choose the lr; there will be better ways in the future, and we don't want the simplest name, LRFinder, to belong to this type of lr finder.

@ItamarWilf sorry for the delay, I'll test your code in detail when I have more time. I agree with @philip-bl that we need to check properly that attach and detach both work as intended and that the trainer can do its job after using the LR finder.
Concerning the class name, FastaiLRFinder can be an option.

@philip-bl thanks for the review!
I made some changes to the code to address your concerns:

  1. I believe that self.should_terminate is reset on every run (link). Either way I added a test to make sure that after detach the engine runs normally without terminating.

  2. Regarding the max_epochs issue: by default the engine runs for len(dataloader) * max_epochs iterations. If num_iter is given explicitly, then one of two things happens.
    If num_iter <= len(dataloader) * max_epochs, the engine stops when it reaches num_iter. Otherwise it raises a warning telling the user how many epochs the engine should run for to reach num_iter.

  3. FastaiLRFinder it is :)
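The num_iter rule from point 2 above can be written out as a small function. This is a sketch of the described behaviour only, not the code from the PR; resolve_num_iter and its parameter names are hypothetical, with epoch_length standing for len(dataloader).

```python
import math
import warnings


def resolve_num_iter(num_iter, epoch_length, max_epochs):
    """Return how many iterations the range test will actually run."""
    available = epoch_length * max_epochs
    if num_iter is None:
        # Default: use every iteration the engine is going to run.
        return available
    if num_iter <= available:
        # The engine stops early, once num_iter iterations are done.
        return num_iter
    # Not enough iterations: tell the user how many epochs would be needed.
    needed_epochs = math.ceil(num_iter / epoch_length)
    warnings.warn(
        f"num_iter={num_iter} needs max_epochs={needed_epochs}, "
        f"but the engine runs for only {max_epochs} epoch(s)."
    )
    return available
```

For example, with 10 batches per epoch and max_epochs=2, asking for num_iter=35 triggers the warning, because 4 epochs would be needed.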

cc @vfdev-5

@ItamarWilf could you please send a PR with your code so that we can comment on the code and follow the updates :)

@vfdev-5 No problem :)

I believe that self.should_terminate is reset on every run (link). Either way I added a test to make sure that after detach the engine runs normally without terminating.

Hmm. Perhaps I have an older version of ignite which doesn't do this.
