Mypy: Test stub files against real code

Created on 13 May 2018  路  18Comments  路  Source: python/mypy

There are a few use cases when stubs may be compelling over type annotations or type comments:

  • annotate code that needs to be Python 2 compatible, ideally by using type annotations (so no type comments),
  • often times users prefer to have type information outside of the source code (this is to avoid modifying legacy code change - and in practice lowers resistance from maintainers),
  • the maintainer of the library does not want to take on the in-line type annotations, someone else trying to provide it as a separate package.

However at the moment this has some limitations: stub files are used only to check clients of the stub, not the code itself.

# a.py
def add_int(a, b):
    return a + b

add_int("a", "b") # mypy does not complain, actually ignores the entire file
# a.pyi
def add_int(a:int, b:int) -> int: ...
# client.py
from a import add_int
add_int(1, 2)
add_int(1, "2") # here mypy is going to complain

In case of the above files mypy will warn about bad usage of inside client.py, but not in a.py (that is, mypy does not validate against the source files being annotated by the stub). This has serious drawbacks:

  • maintaining the stub files is hard, now users need to manually make sure whatever is in the stub file is an accurate representation of the matching source files (both interface - aka number of arguments, names - and type wise),
  • the stub creator does not know if the source code itself is type correct or not.

Here I propose to implement a way to support testing stub files against the real code:

  • build the syntax tree for the source file,
  • build the syntax tree for the stub file,
  • merge stub tree into the source tree

    • source file only tree available: dynamically typed - noop

    • abstract syntax tree only: raise error - missing annotated sources

    • both stub and source AST exists:

    • for all source elements for what there is a matching stub element



      • copy inject over type annotations


      • complain if function interface missmatch



  • now just run the existing type checking mechanism on the enriched source code ast

Merging the AST definitely is not trivial. The plan is to start with a POC implementation, that works for the most cases, and then see the viability of this approach. Note this would help a lot both stub maintainers I think (typeshed) and projects that need Python 2 support.

feature needs discussion priority-1-normal

Most helpful comment

Well, I use stubs with python3.7 because sometimes stubs are really ugly. And I am happy that they live outside of the source code.

Example:

All 18 comments

This is not a new idea. Do you also have a plan for implementing this, or are you just requesting that the mypy core team takes this on?

@gvanrossum I'm aware of this. I've chatted earlier today with some mypy core team members. Actually the plan described here is something we mostly came up together with @JukkaL. I do plan to implement this, or at least help with this (hopefully get the ball rolling during the sprints next week). That being said created the issue to discuss the general approach before anyone starts coding. I did not found a ticket dedicated existing, so created a new one.

I've got an initial proof of concept version of this working (https://github.com/gaborbernat/mypy/commit/a56f007386daab9229af957ac252b3397146b8dc). Will give it a test run trying to use it to type hint the https://github.com/tox-dev/tox code base, which contains a lot more edge cases I'm sure it's not handling yet. And fix the issues coming up. Once manage to do that I'll create the PR for us to review/discuss it (probably will take a week or so).

Are there any updates on this? As someone using stub files exclusively, this is very unintuitive behavior (and hard to realize/easy to forget). It also makes type checking if __name__ == "__main__":" sections complicated (maybe impossible?).

@gaborbernat are you taking your proof of concept further? It would be extremely useful.

Yeah it's a work in progress. Sadly it's quite complicated to get it right and touched a few code places that were modified in the meantime ( so needed to rebase and resolve merge conflicts like three times by now), and my availability was also fluctuating. Long story short we agreed on design and implementation. Will probably be ready for merge by middle of April. Definitely commited to get it stable by PyCon Us in early May.

So just an update on this. We have the ability to merge the stub into the source code via https://github.com/ambv/retype. The retype tool will warn if the stub does not match the source file. Then on the merged source code, we can run our type checker to detect issues (see example https://github.com/ambv/retype/blob/master/tox.ini#L53-L59). Merges https://github.com/ambv/retype/tree/master/types into https://github.com/ambv/retype.

Issues with this approach are that you lose context of what content comes from the source and what content comes from the stub. mypy errors are hard to fix as errors are reported on the merged source information; mapping back to the original files can be troublesome. retype also fails at the first conflict so requires a lot of run-fix-run cycles.

The plan going ahead is to make retype also report some line mappings (allowing at this point to generate more localized errors). Also, retype should run and report all errors rather than stop at first. The later issue is of a mediocre hard. The first is hard though. The problem is that at the moment retype parses stubs via the typed_ast, the source as `lib2to3``. And performs the merge via dumping the typed AST into the lib2to3. Parsing the stub as a source is a no go though, as we lose line information needed to generate the map. In the first instance, we would need to parse the stub at least also as a lib2to3. This is a major rewrite of retype sadly. Will continue iterating on this.

imho with python 2.7 and 3.5 EOL around the corner the use-case for external stubs seems to diminish.

Already people are choosing to put their stubs separate from their code because they don't like the "aesthetic" of type annotations. In my opinion that makes this ticket describe an anti-feature, and just provides these people with ammunition to avoid applying types directly to their code.

Well, I use stubs with python3.7 because sometimes stubs are really ugly. And I am happy that they live outside of the source code.

Example:

@graingert I don't quite understand your comment. At first it sounds like, with the introductions of annotations we don't need stubs anymore (I disagree), then it sounds like you actively want to prevent people from using type stubs by making it harder, thus this would be an anti-feature.

Could you clarify?

Type stubs have fallen out of favor with me, but especially for legacy code and where I currently work there is a lot of push back for in-code annotations and a solid type stub option would surely help getting their feet wet.

It is my opinion that mypy should focus on fostering community support for inline type annotations as the default and preferred option

Fair enough, but I don't think treating a feature, which has non-equivalent uses as a red headed step child is a good way to achieve that. It might just split the adoption of type hints in the community in general.

Unless type stubs were always just meant as a holdover. But they seem way too useful and in some cases unavoidable for that.

Stub files are necessary in cases where the libraries you are using do not have annotations and the stub library for them makes a type generic. Since the subscript operator isn't defined normally you can't inline the annotations all the time. Encountered this with the django stubs.

@ruler501 In annotations you can string literal escaping to write generic types even when indexing doesn't work at runtime. Example: def f() -> 'LibraryType[int]': ... You can also use from __future__ import annotationsand skip the string literal escapes in Python 3.7+.

Stub files are necessary in cases where the libraries you are using do not have annotations and the stub library for them makes a type generic. Since the subscript operator isn't defined normally you can't inline the annotations all the time. Encountered this with the django stubs.

To me this seems like a problem caused by stubs rather than solved by them

@ruler501 In annotations you can string literal escaping to write generic types even when indexing doesn't work at runtime. Example: def f() -> 'LibraryType[int]': ... You can also use from __future__ import annotationsand skip the string literal escapes in Python 3.7+.

@JukkaL that doesn't work for inheriting though. The specific case here is roughly you have a Field class that holds a value of some type so the stubs make it explicit as Field[T], but now if you want to make a custom Field[str] I was under the impression you can't inherit from the stub version to supply the type argument to the super class so lose the ability to refer to it safely by the generic supertype without losing information.

@ruler501 Yeah, inheritance is tricky. You can use this ugly workaround, however:

from typing import TYPE_CHECKING

from foo import Field

if TYPE_CHECKING:
    Base = Field[str]
else:
    Base = Field

class Derived(Base):
    ...

we would need to parse the stub at least also as a lib2to3
(re: https://github.com/python/mypy/issues/5028#issuecomment-495942769)

I'm thinking about augmenting stdlib ast module to have some lib2to3 features (because llibib2to3 is going away). If this is of interest to you:
https://mail.python.org/archives/list/[email protected]/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/
(No promises that I'll do the work of course, but I have high hopes)

Was this page helpful?
0 / 5 - 0 ratings