This is a proposal not to load everything on import transformers, but instead to load components as they are needed.
For example, what is a realistic usage pattern for tensorflow in transformers? I know we have USE_TF=False, but perhaps it could default to False, so that tf is only loaded when it's actually needed based on usage, and not on import transformers. There was also a particular segfault with tf/cuda-11 vs pt/cuda-10 - the two couldn't be loaded together - and that issue never got resolved.
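To sketch what I mean by "only load it if it's actually needed": a minimal guard could look something like this (illustrative only - the exact USE_TF values and helpers we use may differ):

```python
import importlib.util
import os

# Illustrative guard only - not the actual transformers logic.
def tf_should_be_loaded() -> bool:
    # respect an explicit opt-out via the environment
    if os.environ.get("USE_TF", "AUTO").upper() in {"0", "FALSE", "OFF", "NO"}:
        return False
    # only consider loading tf if it's actually installed
    return importlib.util.find_spec("tensorflow") is not None
```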
The same goes for integration packages (wandb, comet_ml) and probably a bunch of other packages, some of which are quite big. The problem is that each of these packages tends to have its own issues, e.g. pulling in old libraries, impacting init, messing with sys.path, and overriding global warning settings (mlflow was imported by PL - a seq2seq issue). Last week I was hunting all of these down, and I think most have been fixed by now.
The problem with integrations specifically is that we currently don't assert if, say, comet_ml is misconfigured; we issue a warning, which gets lost in the ocean of warnings and nobody sees it. If, instead, the user had to explicitly say "use comet_ml" and it were misconfigured, their program would die loud and clear. Much cleaner and faster for the user.
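Something along these lines is what I have in mind - a hypothetical helper, not existing transformers code:

```python
import importlib

# Hypothetical sketch: when the user explicitly requests an integration,
# a broken setup should raise instead of emitting a warning nobody sees.
def require_integration(name: str):
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(
            f"The '{name}' integration was explicitly requested but could not be "
            "imported - is it installed and configured correctly?"
        ) from exc

# require_integration("comet_ml")  # dies loud and clear if comet_ml is broken
```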
Relying on "well, it's installed, so let's load it" doesn't always work, since modules often get installed as dependencies of dependencies and aren't necessarily the right versions or properly configured, especially when transformers doesn't list them as explicit dependencies and therefore can't know whether the version requirements were enforced.
A lot of these packages also emit a lot of noise, especially with more recent python and package versions - the deprecation warnings are many. tf takes first place as always, but other packages contribute too.
Loading time matters too, especially when one isn't running a 1-10h program but is debugging a program that fails to start; e.g. loading tf alone may take several seconds, depending on the hardware.
Clearly, transformers wants to be easy to use. So perhaps by default import transformers should keep its load-it-all-I-want-things-simple behavior.
And then we'd need an import transformers_lean_and_mean_and_clean which wouldn't load anything by default and would ask the user to specify which components she really wants. I haven't yet thought specifically about how this could be implemented, but wanted to see whether others feel that a more efficient way is needed.
On Slack, @thomwolf proposed looking at how Optuna implements lazy loading of packages.
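I haven't studied Optuna's internals, but I'd guess it's along the lines of the module-level __getattr__ pattern from PEP 562, roughly like this (illustrative sketch with made-up submodule names, not Optuna's actual code):

```python
# some_package/__init__.py - illustrative sketch only
import importlib

_LAZY_SUBMODULES = {"integrations", "tf_utils"}  # hypothetical submodule names

def __getattr__(name):
    # only called when `name` isn't found by the normal lookup (PEP 562)
    if name in _LAZY_SUBMODULES:
        module = importlib.import_module(f".{name}", __name__)
        globals()[name] = module  # cache so the submodule is imported only once
        return module
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```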
@LysandreJik, @sgugger, @patrickvonplaten, @thomwolf
It's hard to debate this without seeing actual code. Am I 100% happy with the current implementation? Not really. But it's simple enough that the code stays easy to read. I'm afraid something more dynamic (like importing tf only when instantiating a TFModel, for instance) would mean harder-to-read code. So I reserve my judgement until I see an actual PoC, to evaluate the benefits of a different approach vs the code complexity it introduces.
Ok, gave it a go and worked on a PoC here: https://github.com/sgugger/lazy_init
It lazily loads objects only when they are actually imported, so it won't load TF/PyTorch until you try to import your first model (which should speed up import transformers a lot and avoid unnecessary verbosity). Let me know if you have any comments on it @stas00 !
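For context, the gist of it is a lazy module type along these lines (a simplified sketch - the actual PoC has more moving parts):

```python
import importlib
import types

class _LazyModule(types.ModuleType):
    """Simplified sketch: defers importing submodules until one of the
    objects they export is actually requested."""

    def __init__(self, name, import_structure):
        super().__init__(name)
        # import_structure maps a submodule name to the objects it exports
        self._import_structure = import_structure
        self._object_to_module = {
            obj: mod for mod, objs in import_structure.items() for obj in objs
        }
        self.__all__ = list(self._object_to_module)

    def __getattr__(self, name):
        if name in self._object_to_module:
            submodule = importlib.import_module(
                "." + self._object_to_module[name], self.__name__
            )
            value = getattr(submodule, name)
            setattr(self, name, value)  # cache for subsequent lookups
            return value
        raise AttributeError(f"module {self.__name__} has no attribute {name}")
```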
Looks awesome, @sgugger! Thank you for doing it!
So how do you feel about it now that you have coded it? Will this make things unnecessarily complex and possibly introduce unexpected issues?
Perhaps start with just tf/pt, see how it feels - and then expand to other modules if the first experiment goes well?
Since it's limited to the inits, I'm fine with it. The idea is to collect feedback this week and start implementing it in Transformers next week.