Mypy: Design and build a plugin architecture

Created on 24 Feb 2016  ·  26 Comments  ·  Source: python/mypy

One area where this comes up is SQLAlchemy table definitions.

needs discussion priority-0-high topic-plugins

All 26 comments

I was just gonna create a similar ticket, code using Injector[1] would be another case here.

[1] https://github.com/alecthomas/injector

I'm enumerating here a bunch of things that this might be good for. These fall roughly under two categories -- metaclasses (or more generally runtime configuration of classes) and callables with atypical signatures.

Metaclasses:

  • SQLAlchemy
  • Django models
  • (and ORMs in general)

Functions with special signatures (some of these are of marginal utility):

  • open() (and similar functions elsewhere; return type depends on mode string argument)
  • re.findall (return type depends on structure of a regular expression)
  • sys.exc_info (return value has no None items within except: block)
  • universal_newlines in subprocess (boolean toggles between str / bytes)
  • str.format and '...' % x (the latter is currently special cased in the type checker)
  • dict(key=value) and dict.update with keyword arguments (the former is currently special cased)
  • sum (works for anything that supports +)
  • Parts of itertools and functools
  • struct (could infer type from format string)
  • SQLAlchemy queries

For some of these I can also imagine more general type system support instead of a plugin.
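As a rough illustration of the "more general type system support" route for the open()-style case, overloads keyed on literal mode strings can express a return type that depends on an argument value. The function `open_file` and its two modes are hypothetical stand-ins chosen for this sketch, and it assumes a Python version that provides `typing.Literal` (3.8 or later).

```python
import io
from typing import IO, Literal, Union, overload


@overload
def open_file(data: bytes, mode: Literal["r"]) -> IO[str]: ...
@overload
def open_file(data: bytes, mode: Literal["rb"]) -> IO[bytes]: ...


def open_file(data: bytes, mode: str) -> Union[IO[str], IO[bytes]]:
    """Hypothetical stand-in for open(): the mode string picks the result type."""
    if mode == "rb":
        return io.BytesIO(data)
    if mode == "r":
        return io.StringIO(data.decode("utf-8"))
    raise ValueError(f"unsupported mode: {mode!r}")


text = open_file(b"hello", "r").read()   # a checker can infer str here
raw = open_file(b"hello", "rb").read()   # and bytes here
```

This only works when the mode is written as a literal at the call site; cases such as re.findall or struct format strings would still need something richer, which is where a plugin comes in.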

Two examples of libraries that generate modules and classes based on some form of data schema:

Other functions with tricky signatures:

  • pow (return type varies based on the sign of the exponent)
  • __pow__ (similar)

Things like numpy also fall here; some functions accept a "dtype" argument that controls the type of the output. Some functions return values or arrays depending on the arity of the "axis" argument.

zope.interface is a perfect candidate to add to this discussion as it's often used for this exact purpose in many frameworks including twisted and pyramid.

I have not followed the latest MyPy developments actively, but here's my two cents (Euro). There are some existing type systems out there that can be looked upon as an example. One is TypeScript where one can supply the type definitions as a separate declaration file: https://www.typescriptlang.org/docs/handbook/declaration-files/introduction.html This is done e.g. for jQuery that is a very popular legacy project and cannot be fitted with type declarations in the source code itself.

Based on TypeScript inspiration, one approach for a type declaration architecture for framework/metaclass use case could be

  • There is a standard for a type declaration file

  • Each project that uses its own flexible metaclass- and property-based typing system (SQLAlchemy, zope.interface) can supply a tool that walks the source hierarchy, either as a batch process or at run time, scans all the classes, and creates type definitions for them. With this approach, one doesn't need to upgrade existing framework runtimes such as SQLAlchemy with new type hinting directives, and the tool can work on old code bases as well.

  • Generated type definitions are saved in a type declaration file

  • MyPy, IDEs, and other tools can consume type declaration files and match type declarations and classes e.g. by the dotted name

  • If source code changes one simply has to run framework specific type declaration generation tool again

  • This approach does not solve the dynamic return type issues like re.findall and sqlalchemy.Query(MyModel).all() -> Iterable[MyModel] and this question is still left open
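The generator step in this proposal could look roughly like the sketch below. Everything in it is hypothetical: `Column` stands in for a framework's declarative attribute, `User` for a model class, and `generate_stub` for the framework-specific tool. It inspects the class at run time and emits declaration text that a checker or IDE could consume.

```python
from typing import Any, Dict


class Column:
    """Hypothetical ORM-style attribute that records the Python type of a field."""

    def __init__(self, python_type: type) -> None:
        self.python_type = python_type


class User:
    """Example model whose attributes are declared via Column."""

    id = Column(int)
    name = Column(str)


def generate_stub(cls: type) -> str:
    """Render declaration text for one class from its Column attributes."""
    fields: Dict[str, Any] = {
        name: value for name, value in vars(cls).items() if isinstance(value, Column)
    }
    lines = [f"class {cls.__name__}:"]
    for name, column in fields.items():
        lines.append(f"    {name}: {column.python_type.__name__}")
    if not fields:
        lines.append("    ...")
    return "\n".join(lines)


print(generate_stub(User))
```

As the last bullet notes, a generated declaration like this covers attribute types only; it says nothing about return types that depend on call arguments.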

@miohtama mypy follows a similar approach in allowing separate annotations for third-party files that cannot be annotated inline, so the idea of a "generator" can be useful in some cases. But I think caching the generated result might be a premature optimization. A first step should probably be to provide an API for plugins to walk "flexible" modules statically and programmatically return the data types. With that API you could integrate it directly into mypy (and ensure you're always type-checking an up-to-date definition) and also into a separate "stub generator" tool if you want more performance (the first option would be nicer when developing, and the second one for things like CI or IDEs).

@miohtama It seems that stubs are already (sometimes) suitable for what you are thinking about. I agree with @dmoisset that it would be better if the plugins were deeply integrated into mypy so that there would be no need to run a separate generator tool. In order to always keep the generated stubs in sync with changes in user code, we'd probably have to run the generator tool before each mypy run, and this could actually slow down type checking significantly, as a large project may want to run multiple generators. (A large project probably depends on many third-party libraries, many of which may benefit from a stub generator.) If the plugins are integrated into mypy, there doesn't have to be significant performance overhead, assuming that the plugins only work at the AST/static checking level, i.e. they don't actually try to import user code into the Python runtime.

Here are some thoughts on the user-plugin aspect of this PR.

Plugin discovery options

  • A. by file path. e.g. /path/to/myplugin.py. could also extend this with a MYPY_PLUGIN_PATH

    • pro: easier to write test cases (I discovered that placing a file on the PYTHONPATH within the tests was difficult, likely by design)

    • con: can't use pip to install plugins

  • B. by dotted path: e.g. package.module

    • pro: easy for users to create pip-installable plugins

    • con: adding plugin modules and their requirements to the PYTHONPATH could interfere with type checking?

  • C: setuptools entry points. e.g.:
    setup(
        entry_points={
            'mypy.userplugin': ['my_plugin = my_module.plugin:register_plugin'],
        },
    )

Other questions

  • shall we always look for an object within the module with a designated name, e.g. Plugin, or make this configurable as well? e.g. package.module.MyPlugin or /path/to/myplugin.py:MyPlugin (Note: I've already written some functionality for the latter in my hooks branch)
  • do user plugins need to inherit from mypy.plugin.Plugin?
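To make the file-path option (A) with a configurable object name concrete, here is a minimal sketch built on the standard importlib machinery. The function `load_plugin_from_file` and the default name "Plugin" are assumptions for illustration and are not taken from mypy or from the hooks branch mentioned above. It also assumes POSIX-style paths, since a drive-letter colon would confuse the split.

```python
import importlib.util
import os
from typing import Any


def load_plugin_from_file(spec: str, default_name: str = "Plugin") -> Any:
    """Resolve '/path/to/file.py' or '/path/to/file.py:ObjectName' to an object."""
    path, sep, name = spec.rpartition(":")
    if not sep:
        # No ':' present, so the whole string is the path.
        path, name = spec, default_name

    module_name = os.path.splitext(os.path.basename(path))[0]
    module_spec = importlib.util.spec_from_file_location(module_name, path)
    if module_spec is None or module_spec.loader is None:
        raise ImportError(f"cannot load plugin file: {path}")

    # Execute the file as a module without touching sys.path.
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)
    return getattr(module, name)
```

Loading by path this way keeps the plugin's directory off sys.path, which is one way to sidestep the interference concern raised under option B.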

Plugin chainability options

  • A. aggregate all user plugins into a single uber-plugin instance.

    • each method on this aggregate plugin would cycle through its children in order until one returns a non-None result. we could then cache the mapping from feature (e.g. 'typing.Mapping.get') to user-plugin instance to speed up future lookups.

    • this is compatible with the current design which passes a single Plugin instance around.

  • B. register a plugin per feature. this allows you to replace the search with a fast dictionary lookup, as well as detect up-front when two plugins contend for the same feature.
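The two chaining options can be compared in a small sketch. All names here (`UserPlugin`, `ChainedPlugin`, `build_registry`, and the string-returning `Hook` type) are hypothetical simplifications and do not reflect mypy's real plugin interface.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical hook: takes a description of a call, returns a type name.
Hook = Callable[[str], str]


class UserPlugin:
    """Hypothetical plugin that maps fully qualified names to hooks."""

    def __init__(self, name: str, hooks: Dict[str, Hook]) -> None:
        self.name = name
        self.hooks = hooks

    def get_hook(self, fullname: str) -> Optional[Hook]:
        return self.hooks.get(fullname)


class ChainedPlugin:
    """Option A: ask each child in order and cache which one answered."""

    def __init__(self, children: List[UserPlugin]) -> None:
        self.children = children
        self._cache: Dict[str, UserPlugin] = {}

    def get_hook(self, fullname: str) -> Optional[Hook]:
        cached = self._cache.get(fullname)
        if cached is not None:
            return cached.get_hook(fullname)
        for child in self.children:
            hook = child.get_hook(fullname)
            if hook is not None:
                self._cache[fullname] = child
                return hook
        return None


def build_registry(plugins: List[UserPlugin]) -> Dict[str, Hook]:
    """Option B: one dictionary entry per feature, rejecting conflicts up front."""
    registry: Dict[str, Hook] = {}
    owners: Dict[str, str] = {}
    for plugin in plugins:
        for fullname, hook in plugin.hooks.items():
            if fullname in registry:
                raise ValueError(
                    f"{plugin.name} and {owners[fullname]} both handle {fullname}"
                )
            registry[fullname] = hook
            owners[fullname] = plugin.name
    return registry


first = UserPlugin("first", {"typing.Mapping.get": lambda call: "Optional[int]"})
second = UserPlugin("second", {"typing.Mapping.get": lambda call: "Any"})

chained = ChainedPlugin([first, second])
hook = chained.get_hook("typing.Mapping.get")  # answered by "first", the earlier child
```

With option A the order of the plugins silently decides the winner, whereas `build_registry([first, second])` raises immediately because both plugins claim the same feature.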

How much more before we can close this?

I'll create separate issues for the various things we could use plugins for and close this issue.

I added various issues about how we could use the plugin system. Feel free to create new issues for additional things the plugin system could be useful for. Closing this issue -- future discussions will happen elsewhere.

Hi, I hope I'm not too late for the party here. After watching "Idris: Type safe printf" (https://www.youtube.com/watch?v=fVBck2Zngjo) I had some ideas for "atypical signature" kind of plugins which I think I can combine in a code sample below.

Let's say we want to model something like functools.partial (assuming it's a regular function):

def partial_type(arguments: Arguments) -> Callable:
    callable = arguments.by_name('callable')

    # Below we walk the provided arguments and keyword arguments types, make sure
    # they match the types of the callable parameters and amend the callable type
    # to return to express the fact that some arguments are already provided

    for a in arguments.args:  # arguments.args is a list of types
        corresponding_parameter = callable.parameters.by_index(0)
        assert a == corresponding_parameter.type
        callable.parameters.pop_by_index(0)

    for name, value in arguments.kwargs:  # arguments.kwargs is a list of tuples of (str, type)
        corresponding_parameter = callable.parameters.by_name(name)
        assert value == corresponding_parameter.type
        callable.parameters.pop_by_name(name)

    return callable

@dynamic_return_type  # or something
def partial(callable: Callable, *args, **kwargs) -> partial_type:
    ...  # implementation omitted

What I believe is nice about an approach like this:

  • no plugin registration needed
  • code that generates return type can be declared almost inline
  • no classes or inheritance, only plain functions

The hard part here is mypy (and anything using those) would have to actually execute part of the source code being analyzed (possibly restricted to some pure subset).

Food for thought.

@jstasiak A proposal similar to yours has been discussed before, and we decided to have the plugins live outside user code. Here are some of the primary reasons why we didn't go with 'inline plugins':

  • Mypy is a static checker that doesn't execute the code being checked, and running some parts of the checked application code would confuse the mental model users have.
  • The plugins may have complex dependencies on mypy internals, and we don't want user code to depend on mypy internals.
  • The plugin and mypy internal APIs are not going to be stable, so we will actually discourage people from writing their own plugins that live outside the mypy repo unless they have a very pressing need. Making the programming model as simple as possible is thus not very high priority -- making the plugin system easy to implement, easy to evolve and flexible are more important, and the current design seems to fit those priorities pretty well.
  • For library modules mypy uses typeshed, and we don't want to add mypy-specific plugin functionality to typeshed, since the stubs are used by other tools as well. We also don't want to standardize a cross-tool plugin API (at least right now) since it would be too hard to do and restrict our flexibility to evolve the plugin system.

@JukkaL, hmm... And what should I do if I want to provide type hinting with mypy for my custom dataclasses library? Write a plugin directly in the mypy source code? And should everyone who uses some Python magic do the same?

Right now, you can write a plugin and configure it from mypy.ini using a path relative to the directory where the mypy.ini lives.

I've heard there are plans to make plugins specifiable as modules (so anywhere on PYTHONPATH will work -- not MYPYPATH, as plugins are part of the type checker, not part of the checked code).

@JukkaL Given the team's stance on user plugins ("we will actually discourage people from writing their own plugins that live outside the mypy repo"), I wanted to ask: would it be okay if I linked my notes for developing one here? (prefixed with huge disclaimers about API instability etc.) Since the API isn't really documented¹, it took me quite a while to figure out how to get anything going, and I thought I might save someone that effort. Of course, if you'd prefer to keep a higher barrier to entry, I'll respect that.

¹ I couldn't find anything except for this autogenerated docpage, GVR's comment above, and some bits of information scattered around the issue tracker.

"we will actually discourage people from writing their own plugins that live outside the mypy repo"

Where do you read this? I don't think this is true.

@ilevkivskyi It's from the reply to @jstasiak above (third bullet point).

OK, thanks! I think this statement is outdated (and maybe was more a response to a concrete proposal). Although the plugin API is still unstable (i.e. no guarantees about backwards compatibility), we now support user-installed plugins. One can just install a mypy plugin using pip install and activate it in mypy.ini by writing plugins = plugin_a, plugin_b.

Maybe @JukkaL can add more.

@ilevkivskyi I'm glad to hear that! From reading the code, I also figured out that you can do
plugins = user_plugin_a:some_function
to use user_plugin_a.some_function(...) as the hook mypy will call to obtain the actual UserPluginA(mypy.plugin.Plugin) class object. (user_plugin_a.plugin(...) will be used by default.)
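The lookup described in this comment can be mimicked in plain Python. This is an illustrative sketch only, not mypy's actual loader: `split_plugin_spec` and `resolve_plugin_class` are hypothetical helpers, the in-memory module stands in for a pip-installed package, and the version string is made up.

```python
import importlib
import sys
import types
from typing import Tuple


def split_plugin_spec(spec: str, default_entry: str = "plugin") -> Tuple[str, str]:
    """Split 'module' or 'module:function' into (module name, entry point name)."""
    module_name, sep, entry = spec.partition(":")
    return module_name, (entry if sep else default_entry)


def resolve_plugin_class(spec: str, version: str) -> type:
    """Import the module, call its entry point, and return the plugin class."""
    module_name, entry = split_plugin_spec(spec)
    module = importlib.import_module(module_name)
    return getattr(module, entry)(version)


class UserPluginA:
    """Placeholder; a real plugin would subclass mypy.plugin.Plugin."""


# Stand-in for an installed package so the sketch runs on its own.
fake_module = types.ModuleType("user_plugin_a")
fake_module.some_function = lambda version: UserPluginA
sys.modules["user_plugin_a"] = fake_module

plugin_class = resolve_plugin_class("user_plugin_a:some_function", "0.0")
```

With a bare `plugins = user_plugin_a` entry, the same helper would look for a function named `plugin` in the module instead.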

Yeah, you can now have plugins that are installed separately from mypy. I still think that it may be worth having plugins for some 3rd party libraries in the mypy repo, but that really only makes sense if the APIs are relatively stable. There are at least three reasonable ways to maintain a plugin:

  1. The plugin lives completely separately from mypy. The main tradeoff here is that it's possible that mypy changes break your plugin, forcing you to update your plugin every once in a while. There is also a risk for users -- at some point in the future the plugin maintainers may lose interest and the plugin will stop working with recent mypy versions.
  2. The plugin lives in the mypy repository. Here the main benefit is that the mypy team will ensure that the plugin keeps working even if the plugin API changes. On the other hand, if the library API is changing rapidly or in incompatible ways, there may be incompatibility issues that are hard to work around, and the plugin may become out of date with respect to the library API. This obviously requires that the mypy team is ready to maintain the plugin.
  3. Start with a plugin that is separate from mypy and move it to the mypy repository once things are stable enough.

@JukkaL since you're here, is there a way to safely inject additional definitions into a file from a hook? If my plugin has a class decorator hook, I can access the class's MypyFile by poking around in the ClassDefContext my hook receives, but adding anything to the file's defs list sounds like a recipe for trouble since the list is being iterated over when the hook is called.

(I need to turn this:

@triggers_hook
class A:
    ...

into this:

class A:
    ...

class B(A):
    ...

for typechecking purposes. But since you can't normally do

class A:
    class B(A):
        ...
    ...

, I need to add B at the module level. Although maybe mypy can handle an inner class like that?)

@lubieowoce Adding definitions to MypyFile in a plugin is not supported at the moment.

[edit: I feel like I shouldn't spam this thread with questions, is there a good place for that?]
@JukkaL I got around it by putting the subclass into the parent class, works okay for my purposes 👍

Another question: A method_hook receives a MethodContext object with a context: Context attribute – usually the expression that triggered the hook. Is the hook allowed to modify context in any way, e.g. to transform the AST into a form that's more palatable to mypy?

(in particular, I'd like to turn certain method calls like x.is_Y() into isinstance(x, Z). I'm trying to do that at the AST level, because a literal isinstance(...) seems like the only recognized way to branch on type.)
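The situation behind this question can be shown with a small example. The classes `Circle` and `Square` and the method `is_circle` are hypothetical stand-ins for the x.is_Y() pattern; the comments describe checker behaviour as stated in the comment above, namely that only a literal isinstance(...) narrows the type.

```python
from typing import Union


class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def is_circle(self) -> bool:
        return True


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def is_circle(self) -> bool:
        return False


def size_via_method(shape: Union[Circle, Square]) -> float:
    # The checker sees only a bool here, so `shape` stays a Union in both
    # branches; the explicit isinstance asserts do the narrowing.
    if shape.is_circle():
        assert isinstance(shape, Circle)
        return shape.radius
    assert isinstance(shape, Square)
    return shape.side


def size_via_isinstance(shape: Union[Circle, Square]) -> float:
    # A literal isinstance() call narrows `shape` in each branch.
    if isinstance(shape, Circle):
        return shape.radius
    return shape.side
```

Rewriting the first form into the second at the AST level is what the hook would have to do.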
