Use composition instead of inheritance and mixins.
A typical way of using this would go something like the following:
In such a world, the trainer can be instantiated like so:
trainer_instance = Trainer(dataset_reader_instance, model_instance, training_loop_manager, callback_instances_list, platform_specific_kwargs ...)
Here, everything non-essential, including logging, early stopping, etc., goes into the callbacks. Every callback, while registering itself with the trainer, tells the trainer which attributes it needs on the trainer in order to function. Once all the callbacks have registered themselves, the trainer runs a sanity check: warn if multiple callbacks want to access the same attribute, error out if two callbacks have asked for exclusive write access to an attribute, and so on.
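A minimal sketch of how that registration handshake could look. All the names here (`Callback`, `required_attributes`, `exclusive_write_attributes`, `_sanity_check`) are hypothetical, not part of any existing API:

```python
import warnings

class Callback:
    required_attributes: set = set()          # trainer attributes this callback reads
    exclusive_write_attributes: set = set()   # trainer attributes it wants to own


class Trainer:
    def __init__(self, callbacks):
        self.callbacks = list(callbacks)
        self._sanity_check()

    def _sanity_check(self):
        readers, writers = {}, {}
        for cb in self.callbacks:
            for attr in cb.required_attributes:
                readers.setdefault(attr, []).append(type(cb).__name__)
            for attr in cb.exclusive_write_attributes:
                if attr in writers:
                    raise ValueError(
                        f"'{attr}' requested for exclusive write by both "
                        f"{writers[attr]} and {type(cb).__name__}"
                    )
                writers[attr] = type(cb).__name__
        for attr, names in readers.items():
            if len(names) > 1:
                warnings.warn(f"'{attr}' is accessed by multiple callbacks: {names}")
```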
Additional benefit:
Now, the issue would be that the user has to instantiate these different instances in their final training script. We can automate that using a config manager like in AllenNLP and provide the user with standard CLI train, test, and predict commands. This way, if a user wants to train a model on their own dataset, they only need to define a reader class for it in a separate file, modify one line in an existing JSON or YAML config file, and run the CLI command train new_config.yaml
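Roughly, the config-driven instantiation could work like the sketch below. The registry, the `register_reader` decorator, and the config keys are illustrative assumptions, not an existing AllenNLP or Lightning API:

```python
import yaml

DATASET_READERS = {}

def register_reader(name):
    # Hypothetical registry so readers can be referred to by name in the config.
    def wrap(cls):
        DATASET_READERS[name] = cls
        return cls
    return wrap

@register_reader("my_csv_reader")   # the user's new reader, defined in its own file
class MyCsvReader:
    def __init__(self, path):
        self.path = path

def build_reader(reader_config):
    # reader_config is e.g. {"type": "my_csv_reader", "path": "data/train.csv"}
    cls = DATASET_READERS[reader_config.pop("type")]
    return cls(**reader_config)

with open("new_config.yaml") as f:
    config = yaml.safe_load(f)
reader = build_reader(config["dataset_reader"])
```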
Concern:
If the worry is that the user needs to learn about all the callbacks to even start training a model, then that can be addressed by providing a sample/default config file which includes the configuration for all important callbacks. So in essence, if the user wants to use the default training setup, they just need to copy the default training config file and change the model and dataset configuration.
Code that uses mixins is hard to read, debug, extend, and test when the mixins are not completely decoupled. Moreover, composition, coupled with an automatic way of instantiating classes, would make for a very clean and fast user interface.
Currently, the same as what is provided above in the Feature Proposal. Will update as/if things become more concrete.
Decouple the mixins, define their responsibilities clearly, and provide detailed "for developers" documentation describing these.
Hi! Thanks for your contribution! Great first issue!
> If the worry is that the user needs to learn about all the callbacks to even start training a model, then that can be addressed by providing a sample/default config file which includes the configuration for all important callbacks. So in essence, if the user wants to use the default training setup, they just need to copy the default training config file and change the model and dataset configuration.
I strongly disagree with this... having to read docs to understand something means we were lazy with the API design and made something not intuitive... this goes against the core principle in lightning.
I'm happy to entertain rewriting the internals of Lightning to make it more contributor-friendly, but not the external API. I would look at all the past frameworks as lessons learned and not as things done "correctly" -- especially anything in the TensorFlow and Keras family, which have (in my opinion) really poorly designed APIs.
My two cents is that the external API of PL has already proven it's a good API and should not be modified (and I'm not even talking about the burden of breaking backward compatibility...).
As for the internal API, I always found the mixin pattern to be super confusing when reading PL code. That being said, I don't see a cleaner way to do it. @dhruvdcoder, if you can come up with a better and cleaner pattern than the current mixin one, feel free!
I understand your concern regarding changing the external API. I believe that composition can replace mixins and inheritance by putting all the object-creation logic inside the trainer's constructor (this compromises the reusability of the smaller components, but let's ignore that for a moment). The difficult part is dividing the responsibilities among the smaller objects. However, one good thing here is that the existing mixins already give us some idea of the logical boundaries that should exist among the smaller objects. We could start by converting these mixins into independent classes that are instantiated and set as attributes inside the trainer, with the rule that the trainer's self should not be passed to these smaller objects -- they should be able to function independently.
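Here is a rough sketch of the direction I have in mind. All class and attribute names are hypothetical, and the real mixins obviously carry far more state than this:

```python
# Hypothetical sketch: each former mixin becomes a small, independent class
# that never receives the trainer's `self`.
class CheckpointManager:
    def __init__(self, dirpath):
        self.dirpath = dirpath

    def save(self, model, step):
        ...  # serialize `model` under self.dirpath for `step`


class TrainingLoop:
    def __init__(self, max_epochs):
        self.max_epochs = max_epochs

    def run(self, model, dataloader, checkpointer):
        for epoch in range(self.max_epochs):
            for batch in dataloader:
                ...  # forward / backward / optimizer step
            checkpointer.save(model, epoch)


class Trainer:
    def __init__(self, max_epochs=10, checkpoint_dir="checkpoints"):
        # Composition: the trainer owns small, independently testable objects
        # instead of inheriting their methods from mixins.
        self.loop = TrainingLoop(max_epochs)
        self.checkpointer = CheckpointManager(checkpoint_dir)

    def fit(self, model, dataloader):
        self.loop.run(model, dataloader, self.checkpointer)
```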
Thoughts?
not sure. we need to think about this carefully. many of us have thought about the current structure very hard. maybe suggest a code snippet of how you envision this working?
ie: do a real example.
maybe classes to encapsulate each set of functionality isn't a bad idea, but we need to think very hard about this because there's a chance we can also make the framework more flexible in the process
I will try to give it a shot once I have some time. Hopefully after a couple of weeks. Feel free to close the issue for now if there isn't anything else to discuss.
> As for the internal API, I always found the mixin pattern to be super confusing when reading PL code.
Mixins are a little difficult to read at times, but I'm not sure if composition would be that much better. It's not too bad at all if you use grep -nr to navigate and find where things are being declared and called. I've personally found Keras and TF to be impossible to understand, both in terms of their codebases and, in some ways, in terms of their external API.
@dhruvdcoder I've written something similar which used composition rather than mixins. One key problem was taking advantage of object-oriented programming and inheritance. Mixins actually give you a really nice way to structure the code using the inheritance of objects, whereas it becomes messier when you use composition. I think mixins also have the advantage of lower technical debt, as compositions can easily get very complex and out of hand as things get bigger (which I think is the problem Keras and TF ran into).
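For contrast with the composition sketch above, here is a generic illustration of the mixin style being discussed. This is not actual Lightning code; all names are made up:

```python
# Each concern lives in its own mixin, and the trainer inherits them all,
# so their methods share `self` (and its attributes) directly.
class LoggingMixin:
    def log_metrics(self, metrics):
        print(metrics)


class CheckpointMixin:
    def save_checkpoint(self, path):
        ...  # serialize `self.model` to `path`


class Trainer(LoggingMixin, CheckpointMixin):
    def __init__(self, model):
        self.model = model

    def fit(self, dataloader):
        for batch in dataloader:
            ...  # training step
            self.log_metrics({"loss": 0.0})
        self.save_checkpoint("last.ckpt")
```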
@dhruvdcoder how is it going, still interested in this PR?
@Borda I went back to using AllenNLP for my current projects and am unable to work on this currently. Sorry about that. Please feel free to close this issue.