Fairseq should consider using prebuilt function/class registries internally
The current iteration of fairseq's internal architecture registry is slightly confusing. It is tightly coupled to argparse, has no mechanism for building components from anything other than an argparse.Namespace, and encourages a lot of black magic with default arguments in the argument parser. It would be better if the registry logic could be delegated to a separate library that is well-tested and more idiomatic.
Ideally the general architecture wouldn't change: there would be registries for FairseqTask, Fairseq...Model, FairseqOptimizer, FairseqCriterion, etc. What would likely change is that there would be no bifurcation of registration and instantiation of these classes. For example, under the paradigm of class-registry, you'd have something like...
from class_registry import ClassRegistry

optimizer_registry = ClassRegistry()

@optimizer_registry.register("adafactor")
class FairseqAdafactor(FairseqOptimizer):
    def __init__(self, lr=1e-4, adafactor_eps=1e-3, clip_threshold=0.0, ...):
        ...

# Pass the non-defaults when looking up the class
opt = optimizer_registry.get("adafactor", lr=1e-5, clip_threshold=0.001)
This is highly desirable because it decouples completely from argument parsing, makes code more compositional by allowing setup_task to work strictly in terms of registries, and it requires less maintenance of code that isn't strictly "business logic" as far as fairseq is concerned.
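To make the pattern concrete without any third-party dependency, here is a minimal sketch of a registry that both registers a class and instantiates it with keyword overrides. `Registry` and `ToyAdafactor` are hypothetical names used only for illustration; this is neither fairseq's current code nor the class-registry implementation.

```python
from typing import Any, Callable, Dict, Type


class Registry:
    """Hypothetical minimal registry mapping a string key to a class."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type] = {}

    def register(self, key: str) -> Callable[[Type], Type]:
        """Return a class decorator that stores the class under `key`."""
        def decorator(cls: Type) -> Type:
            if key in self._classes:
                raise ValueError(f"{key!r} is already registered")
            self._classes[key] = cls
            return cls
        return decorator

    def get(self, key: str, **kwargs: Any) -> Any:
        """Look up the class and instantiate it; kwargs override constructor defaults."""
        if key not in self._classes:
            raise KeyError(f"unknown key {key!r}; known keys: {sorted(self._classes)}")
        return self._classes[key](**kwargs)


optimizer_registry = Registry()


@optimizer_registry.register("adafactor")
class ToyAdafactor:
    # Toy stand-in for an optimizer; the defaults live on the constructor.
    def __init__(self, lr: float = 1e-4, clip_threshold: float = 0.0) -> None:
        self.lr = lr
        self.clip_threshold = clip_threshold


# Only the non-default value is passed; clip_threshold keeps its default.
opt = optimizer_registry.get("adafactor", lr=1e-5)
```

Nothing here touches argparse, so the same lookup works whether the keyword arguments come from a command line, a config file, or a test.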
There are two main options: leave the code as-is or explore registry libraries. The primary registry libraries that I know of are class-registry and catalogue. There may be others, but I assume they'll all be approximately equivalent.
One question to explore before diving deeper is how this might interact with the --user-dir flag: can a user just import the appropriate registry from fairseq and register their own architectures?
Another question is about how to communicate defaults to a user -- in the above formulation, defaults get handled by the class rather than by the user-facing command-line arg mechanism, so how can those defaults percolate back to a user? Are there global defaults? Do defaults get presented to a user currently? Is it a matter of documentation? Will we need black magic? ⚡️
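One possible answer to the defaults question, sketched under assumptions: read the defaults off the constructor signature with `inspect.signature` and generate the command-line flags from them, so the class stays the single source of truth and `--help` still shows the values. `ToyAdafactor` and `add_constructor_args` are hypothetical names, not fairseq APIs.

```python
import argparse
import inspect


class ToyAdafactor:
    # Toy stand-in for an optimizer; the defaults live on the constructor.
    def __init__(self, lr: float = 1e-4, clip_threshold: float = 0.0) -> None:
        self.lr = lr
        self.clip_threshold = clip_threshold


def add_constructor_args(parser: argparse.ArgumentParser, cls: type) -> None:
    """Add one --flag for every constructor parameter of `cls` that has a default."""
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self" or param.default is inspect.Parameter.empty:
            continue
        parser.add_argument(
            "--" + name.replace("_", "-"),
            # Simplification: the flag type is inferred from the default value,
            # which is fine for int/float/str but not for bool or None defaults.
            type=type(param.default),
            default=param.default,
            help="default: %(default)s",
        )


parser = argparse.ArgumentParser()
add_constructor_args(parser, ToyAdafactor)

# The user overrides one value; the other falls back to the constructor default.
args = parser.parse_args(["--lr", "1e-5"])
opt = ToyAdafactor(**vars(args))
```

The trade-off is that help text and types have to be inferred or carried some other way (annotations, docstrings), which is where a dedicated configuration layer may be a better fit.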
Motivation makes sense, especially as you’re exploring decoupling from argparse.
I’d be reluctant to take an external dependency for something like this, since registry code is fairly small and easy to implement ourselves, once we decide what to implement :)
Regarding --user-dir, yes that’s exactly how it works. People can register their own tasks, models, etc. without needing direct access to the fairseq source (e.g., if using a pip installation). When a component is registered it injects the appropriate command-line arguments into fairseq-train so you can use your custom task/whatever. There are some docs about it at the bottom of this page: https://fairseq.readthedocs.io/en/latest/overview.html
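The --user-dir flow described above can be approximated in a few lines. This is a self-contained sketch and not fairseq's actual loader: `toy_framework`, `register_model`, `import_user_dir`, and `toy_user_plugin` are hypothetical names, and the plugin package is written to a temporary directory purely so the example can run on its own.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Stand-in for a registry module shipped with the framework (hypothetical).
toy_framework = types.ModuleType("toy_framework")
toy_framework.MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that records the class in the framework's registry."""
    def decorator(cls):
        toy_framework.MODEL_REGISTRY[name] = cls
        return cls
    return decorator


toy_framework.register_model = register_model
sys.modules["toy_framework"] = toy_framework

# What a user's plugin package might contain: it only imports the public
# registration decorator, never the framework's source tree.
PLUGIN_SOURCE = '''
from toy_framework import register_model

@register_model("my_model")
class MyModel:
    pass
'''


def import_user_dir(user_dir: str) -> None:
    """Import the package at `user_dir` so its registration decorators run."""
    path = Path(user_dir).resolve()
    parent = str(path.parent)
    sys.path.insert(0, parent)
    try:
        importlib.import_module(path.name)
    finally:
        sys.path.remove(parent)


with tempfile.TemporaryDirectory() as tmp:
    plugin_dir = Path(tmp) / "toy_user_plugin"
    plugin_dir.mkdir()
    (plugin_dir / "__init__.py").write_text(PLUGIN_SOURCE)
    import_user_dir(str(plugin_dir))

registered = sorted(toy_framework.MODEL_REGISTRY)
```

Registration happens as a side effect of the import, which is why the user code only needs the registry to be importable.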
Yep. 😄 I guess what I was really asking is: if we were to go the route of an external dependency, I am not sure how _they_ would work with --user-dir.
I can play with some exploratory refactoring of the registry mechanism now to move to something closer to what's described above and we can discuss pros/cons as the code evolves. Does that seem reasonable?
I think some of this will be solved by Hydra. Also take a look at how tensor2tensor does it, since I believe they support a similar registry but without injection of command line options.
I've been reading about Hydra and it seems very promising, but it will likely require #1672 first, because its mechanism for populating constructor arguments is similar to the one described above.
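For readers unfamiliar with the idea of populating constructor arguments from configuration, here is a stdlib-only sketch of the general technique. It is not Hydra's implementation; the `_target_` key mirrors the convention used by recent versions of `hydra.utils.instantiate`, and the `instantiate` helper below is a hypothetical stand-in written for this example.

```python
import importlib
from typing import Any, Mapping


def instantiate(config: Mapping[str, Any]) -> Any:
    """Build an object from a config: `_target_` names the class, the rest are kwargs."""
    params = dict(config)  # copy so the caller's config is not mutated
    module_name, _, attr = params.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**params)


# A stdlib class is used as the target so the example needs no extra packages.
delta = instantiate({"_target_": "datetime.timedelta", "hours": 2})
```

As with the registry lookup, the class's own defaults apply to anything the config leaves out, so the constructor has to be the place where defaults live.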
Hi Folks, happy to see Hydra is on the radar of fairseq.
@myleott, we should meet and chat about what the integration can look like.
There are several non-obvious features of Hydra that are specifically designed with frameworks like fairseq in mind.
There's some context in another issue that basically amounts to: Hydra is going to allow for programmatic hyperparameter sweeps -- currently I don't think there's a mechanism for this outside of using unix. 😆 Very excited for this!
See here.
I landed the Ax sweeper plugin on master today.
See the website docs.
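As a rough illustration of what a programmatic sweep means, here is a plain grid sweep over constructor keyword arguments using only the standard library. It is not Hydra or the Ax plugin (which does adaptive search rather than an exhaustive grid); `grid` and the parameter names are hypothetical.

```python
import itertools
from typing import Any, Dict, Iterator, Sequence


def grid(space: Dict[str, Sequence[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one keyword-argument dict per point in the Cartesian product of `space`."""
    keys = list(space)
    for values in itertools.product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


# Each job is a set of overrides that could be passed to a registry lookup
# or a constructor; unspecified parameters would keep their defaults.
jobs = list(grid({"lr": [1e-4, 1e-5], "clip_threshold": [0.0, 1.0]}))
```

Because each job is just a dict of overrides, this only works cleanly once component construction no longer depends on an argparse.Namespace.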