I'd love to be able to hand mypy a whole library, or a program where it's not totally obvious what the entry point is (or if there is a single entry point), and have it typecheck the whole thing.
For example, in mypy itself, to typecheck mypy/ with all the Python files that make it up. We used to have tests.py as a stand-in for this, but runtests.py is more complicated and mypy runtests.py doesn't actually check everything. I can use mypy -m mypy.main, but that's getting into details about how the code is laid out.
One interface for this would be to just list a bunch of files as positional arguments to mypy. More convenient in many cases would be to say "here's a directory and typecheck everything underneath it".
Given the mypy example, actually, where mypy/test/data/ contains Python files that aren't part of the program, maybe the right option to have there is "here's a _Python package_ and typecheck everything that's part of it". So e.g.
$ mypy -p mypy
would look up mypy for import, see that it's a package module, go find all the other files next to its __init__.py and the directories that themselves contain __init__.py, and recurse. Because there's no mypy/test/data/__init__.py, or equivalently because you can't import mypy.test.data.fixtures.list, this wouldn't pull in mypy/test/data/fixtures/list.py, which is just as desired.
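The discovery rule described above -- collect the *.py files next to the package's __init__.py and recurse only into subdirectories that themselves contain an __init__.py -- can be sketched as follows (a hypothetical helper, not mypy's actual code):

```python
from pathlib import Path


def find_package_sources(pkg_dir: Path) -> list[Path]:
    """Collect the .py files that are importable as part of a package.

    Recurses only into subdirectories that contain an __init__.py, so a
    directory like mypy/test/data/fixtures/ (no __init__.py) is skipped,
    just as desired.
    """
    sources = sorted(pkg_dir.glob("*.py"))
    for sub in sorted(pkg_dir.iterdir()):
        if sub.is_dir() and (sub / "__init__.py").is_file():
            sources.extend(find_package_sources(sub))
    return sources
```

Under this rule, mypy -p mypy would reduce to resolving mypy to its directory and handing every path in find_package_sources(...) to the build as a separate source.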
I think in my ideal version of this interface, you could pass multiple -m arguments, multiple -p arguments, and multiple positional filename arguments, and they'd all just go on the queue together.
(In principle I could simulate this by invoking mypy once for each file, but that's much slower because it repeats the work of typechecking dependencies.)
We currently do our argument parsing by hand; it would probably be good to refactor the whole process after deciding on the UI we want.
I _think_ that doing more than one -m at once is pretty much just adding more than one initial state... -p would be similar but generate a whole bunch of -m states by traversing trees.
I have a prototype for allowing BuildManager to support multiple sources.
No idea if it actually _works_ for multiple sources, but it accepts them, and doesn't fail on the one-source case.
This would be really nice! I'm not totally sure about -p yet -- we could define mypy &lt;dir&gt; to only look for subdirectories with __init__.py files (but that might be confusing). Multiple -m options totally make sense, and so do multiple files.
Yeah, I thought about having a rule like that and figured it would be surprising to people -- generally if I pass a directory to a command, I expect the command to operate on the whole tree, like git grep, rsync, etc. So if I say mypy foo/ and it comes out clean, I'd be pretty surprised if I later find a type error in some file under foo/ and it wasn't checked because of some rule involving the presence of __init__.py files.
You could imagine also supporting passing a directory as a positional argument and meaning to check all the files in that tree. But this is likely to bite you in a different way: if you really have Python files under the tree that don't have a chain of __init__.py leading to them, then they can't be imported without an appropriate sys.path entry, and they probably depend on neighboring files in the same situation. So the check is likely to fail complaining it can't find some modules. Maybe that's OK, though, and certainly the small-project common case is that -p foo and a positional foo/ would be synonymous.
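For contrast, the permissive positional-directory semantics ("check everything under foo/") would ignore __init__.py entirely. A minimal sketch, with a hypothetical function name:

```python
from pathlib import Path


def find_all_py(root: Path) -> list[Path]:
    """Positional-directory semantics: every *.py anywhere under root,
    with no regard for __init__.py markers -- so files that can't
    actually be imported are picked up too, and may fail to check."""
    return sorted(root.rglob("*.py"))
```

In the small-project common case, where every directory under foo/ has an __init__.py, this returns the same set of files that the package-based crawl would.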
Suppose we accepted positional foo/, and defined it to mean "all *.py files in the subtree foo/". Would that work for mypy itself? Currently several files in mypy/test/data/fixtures/ fail to typecheck, but I don't really know the context of those files and whether it might be easy to fix them, or to move them out of mypy/.
Some additional ideas:
- .mypyignore files. If a directory has such a file, mypy would ignore it unless you explicitly point mypy to a file within it. That way a positional mypy/ could be made to do the right thing.
- Rename -p to --recursive-package or something, to make it more explicit about what it does. We might want to reserve -p for things like -p /usr/bin/python3.4 (for example, to look up the Python module search path and Python version dynamically).

Sure, renaming -p to a long option would be fine -- I think the main use of it would be in scripts for CI anyway (so long as we also have positional foo/). I'd call it just --package, as "recursive package" is redundant: Python "packages" can have packages inside them, and when module foo.bar.baz exists it is part of the package called foo. See the use of the term in https://docs.python.org/2/reference/simple_stmts.html#the-import-statement and https://docs.python.org/2/tutorial/modules.html#packages .
I'd rather have a solution that didn't involve people adding one more kind of special-purpose marker file to their trees. I think we can do:

- foo/ means "all *.py files in the subtree foo/", which works for casual cases so people don't need to learn anything;
- --package foo means the files that are actually in the importable package foo, which takes care of subtler cases when necessary;
- -m can be combined any number of times;
- count on __init__.py files to already describe what people are likely to want with .mypyignore.

When people really need something more custom and ad-hoc, they can always just construct the list themselves and put it on the command line.

I like that!
+1
We've had mypy --package dotted.package.name for a while now (since 11/7 it seems; the usage message just doesn't mention it yet). Maybe that's sufficient? It also looks fairly simple to add support for processing multiple files, but it's a little tricky to figure out what the module names should be -- currently when a single file is given the module name is always taken to be __main__, but you can't have multiple modules with the same name, and there are some different cases:
- mypy /path/to/folder/*.py, where folder is just a folder full of top-level files: the module names could just be the basename of each file with .py stripped
- mypy package/subpackage/{module1,module2,module3}.py, where we really want the module names to be package.subpackage.module1 etc.

We could distinguish these with some other flag, or with a heuristic based on the shape of the filenames or their relationship to the current directory or even to $PYTHONPATH, but it would be nice if both cases just worked without any help. Maybe it would work to crawl up the path until a folder without an __init__.py is found?
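That crawl-up heuristic could look roughly like this (a sketch of the idea in the last sentence; the name crawl_up and the exact rules here are mine, not necessarily what mypy implements):

```python
import os


def crawl_up(path: str) -> str:
    """Guess a dotted module name for a .py file: strip the extension,
    then walk up the directory tree, prepending each directory name for
    as long as that directory contains an __init__.py."""
    dirname, base = os.path.split(path)
    module = os.path.splitext(base)[0]
    while dirname and os.path.isfile(os.path.join(dirname, "__init__.py")):
        dirname, pkg = os.path.split(dirname)
        module = f"{pkg}.{module}"
    return module
```

This makes both cases above work without help: a file in a plain folder gets its basename as the module name, while package/subpackage/module1.py becomes package.subpackage.module1.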