While working on protocols, I noticed that stubs often use Any where object will be more appropriate and vice versa. Because of how protocol inference works and how current mypy solver works, this leads to problems since sometimes mypy infers Any instead of giving an error (maybe we could improve this by "tightening" solver?)
I think the root of the problem is that wee don't have a good story about Any. I believe the rules for interaction between Any and other things should be like this (where C is a normal class like int or str):
(1) join(Any, object) == object
(2) meet(Any, object) == Any
(3) join(Any, C) == C
(4) meet(Any, C) == C
(5) join(Any, NoReturn) == Any
(6) meet(Any, NoReturn) == NoReturn
I use NoReturn for uninhabited type, since it is how it is currently called in PEP 484. The above rules could be summarized in a simple diagram:
|
Any (if it appears in a meet)
/ | \
... all other types ...
\ | /
Any (if it appear in a join)
|
NoReturn
Currently, mypy behaves in a completely different way, and I believe this is not right, let me explain why. We could focus on rules (3) and (4). Currently, mypy returns Any in both these cases. Here is why it is not safe:
from typing import Any
x = 'a'
y: Any = 1
lst = [x, y]
lst[0] >> 1 # not detected by mypy, fails at runtime, will be fixed by rule (3)
def f(x: int) -> None:
x >> 1
def g(x: Any) -> None:
pass
funs = [f, g]
funs[0]('a') # not detected by mypy, fails at runtime, will be fixed by rule (4)
@JukkaL I will be grateful for your opinion.
Hm, I have to think about this. I'm not sure I agree with everything in the first diagram either. I thought joins with Any always gave Any? (IOW why is rule (2) not the same as rule (4)?)
Off-topic there's also something with y: Any = 1 where I'd like mypy to behave differently: IMO it should infer that y is temporarily an int but allow assigning anything to it later.
@gvanrossum
I thought joins with Any always gave Any?
The reasoning why I don't like this is similar to why we recently prohibited simplifying Union[int, Any] to Any, this is not safe (see example).
(IOW why is rule (2) not the same as rule (4)?)
Rules (2) and (5) are different form others because I want to preserve the fact that object is an "absolute top" and NoReturn is an "absolute bottom". This is natural, consider this simplified reasoning: Any is something that has all methods and all values, but object is something that has all values but no methods, therefore object is wider that Any. Similarly we find that Any is wider than NoReturn.
OK, that explains (1), (2), (5) and (6) to me. That leaves (3) and (4). I think you're saying (3) join(Any, C) == C because C has fewer methods than Any, while (4) meet(Any, C) == C because C has fewer values.
It seems the right way to think about the lattice is that up (join) is the direction of fewer methods and more values (with fewer methods being tested first), while down (meet) is the direction of fewer values and more methods (with fewer values being tested first).
PS. Please have pity on my feeble understanding. I have never had any theory classes about this at all, since except for Simula, OO was invented after I graduated!
The join with Any is Any to avoid false positives. For example, consider this code:
class A: pass
class B(A): pass
def f(x: B, y: Any) -> None:
a = [x, y]
g1(a) # should this be fine?
g2(a) # what about this?
def g1(a: List[A]) -> None: pass
def g2(a: List[B]) -> None: pass
I think that both calls should be accepted, since each could be okay, given a suitable precise type for y. The philosophy is that replacing a type with an Any type should not generate additional errors if we can avoid that. That's why the inferred type of a is List[Any], not List[B].
So you disagree with (1) and (3) from the OP?
(1) join(Any, object) == object ... (3) join(Any, C) == C ...
@JukkaL
The philosophy is that replacing a type with an
Anytype should not generate additional errors if we can avoid that.
g2(a) in your example passes with both versions of join, so let us focus on g1(a). Indeed, for y: A it will pass, but will start failing if replaced with y: Any in the case of proposed join. However, there is something that I don't like here, namely:
Any, let me explain my intuition: If I see a list of B and "something that always work", then I would expect that a user wanted to create a list of B, not a list of things that will work everywhere.Of course, a user could always overrule the inference with an explicit annotation, but it seems to me it is better to silence an error with a: List[Any] (or even better with a: List[A] for the case of g1) rather than have a (possibly subtle and nasty) uncaught error. What do you think?
@JukkaL
Also, note that in your example invariance is important, everything will still work with proposed join for
def g1(a: Sequence[A]) -> None:
...
You could argue that user-defined classes are invariant by default, we already have a section in "Common issues" about this, so that users will be prepared that sometimes explicit annotations are needed.
I may well disagree with the rules for meet as well, but this needs some thought as the issue is subtle so I'm focusing on join first. By the way, I'm not convinced that this is analogous to the union simplification fix and this needs to argued for based on its merits.
One of the reasons why Any types are 'contagious' is that that it can highlight code that is unsafe. If something has type Any, using the value is potentially unsafe. If using something is potentially unsafe, inferring Any is generally the right thing to do, unless the user overrides this with an explicit annotation. The proposed rules for join would make mypy less safe in that sense. Example:
class A:
def f(self) -> None: pass
class B: pass
def f(a: A, b: Any) -> None:
x = [a, b]
x[1].f() # unsafe if b is an instance of B, for example
With the current semantics, the receiver object on the final line has type Any, so it's clear that the method call is unsafe, which is correct. With the proposed semantics, the type would be A, and the call would look safe, even though it's not. We can argue that inferring List[Any] for x is safer than List[A], since the actual runtime type could well be a supertype of A.
Another issue with the rules for joins is the behavior with isinstance:
def f(a: str, b: Any) -> None:
x = [a, b][0]
if isinstance(x, str):
...
else:
x(1 + '') # should be an error
If we infer str as the type of x, we get a false positive on the final line since mypy would think that the line unreachable because of the isinstance test.
Actually, it looks like inferring Union[str, Any] for x would actually conform to principles, but here the argument is different -- we try to avoid inferring union types since they are very unintuitive for users.
One of the reasons why
Anytypes are 'contagious' is that that it can highlight code that is unsafe
But the only way for a user to see this is to use reveal_type, or am I missing something? To me it looks quite opposite, with current semantics Any rather "hides" errors (leaves them unreported). For example, extending your example:
def f(a: A, b: Any) -> None:
x = [a, b]
x[1].f()
x[1].never_had_this()
if function_that_typically_returns_zero():
x[0].this_will_fail_late()
Another issue with the rules for joins is the behavior with
isinstance:
This example is interesting. Inferring Union[X, Any] for particular join(X, Any) is also interesting. Maybe we can just do this internally but simplify this in error messages?
I don't think this is an issue that should be resolved in isolation. We have a more thorough review of the type system on our roadmap for the next couple months -- we should make sure to consider the role of Any as part of that, but doing it beforehand would be duplicating work and could potentially lead us in the wrong direction.
We have a more thorough review of the type system on our roadmap for the next couple months
Ah, thanks, I didn't know this. Then I propose to postpone this discussion, but leave the issue open just to be sure we will not forget about this.
I wonder if the review should be done in public rather than as Dropbox meetings.
I wonder if the review should be done in public rather than as Dropbox meetings.
I can't say for others, but I would very much like it if I had the chance to _join_ :-)
our roadmap
By the way, is this roadmap a secret document, or is it available somewhere?
Our current roadmap for mypy is not public, but at least I wouldn't mind making it public -- with the caveat that it's continuously evolving.
Jukka, I think it's a great idea to publish the internal road map (I can't even recall what's on it). Maybe we can put it in a file named ROADMAP.md at the top level of the tree? Alternatively it could be a chapter in the docs. Another approach would be to create issues for each of the items and tie them together with a custom label (or, dare I say, a milestone?). We should recall that road maps are often vague and subject to change -- an item on the road map doesn't promise it will be delivered by a certain date.
How about putting it in the wiki? Just restrict editing to collaborators only.
We're actually trying to kill the wiki, it's full of unmaintained stuff.
We're actually trying to kill the wiki, it's full of unmaintained stuff.
FWIW I agree that the Developer Guides section is probably mostly crap at this point, but I don't really see why it wouldn't be an ideal place for a roadmap. It would have dead-easy editing from either the command line or the GUI, too.
Also:
http://www.mypy-lang.org/roadmap.html
This really needs to be updated:
Milestone 5: Faster, incremental type checking
Support more efficient, incremental type checking. After a code edit, mypy should only check modules affected by that change instead of the whole program. This will be useful especially for large projects.
Schedule: 2016
Milestone 6: Beta release
After this release, work harder to avoid changes that break backward compatibility in significant ways. > Most critical open issues resolved.
Schedule: 2016?
@JukkaL -- thoughts about the wiki and the outdated road map?
@kirbyfan64 -- that website is in GitHub, you can send PRs here: https://github.com/JukkaL/mypy-website
To get the discussion moving on, here is how the type system behaves currently:
(1) join(Any, object) == Any
(2) meet(Any, object) == object
(3) join(Any, C) == Any
(4) meet(Any, C) == C
(5) join(Any, NoReturn) == Any
(6) meet(Any, NoReturn) == NoReturn
If I were to change anything, I'd consider changing meets (this is just an idea, I haven't carefully considered all the implications):
(1) join(Any, object) == Any
(2) meet(Any, object) == Any <-- changed
(3) join(Any, C) == Any
(4) meet(Any, C) == Any <-- changed
(5) join(Any, NoReturn) == Any
(6) meet(Any, NoReturn) == NoReturn
And here is how I think about Any types and these can be used to argue for the above rules:
Any type is a type that mypy doesn't know. Usually it's a type the programmer hasn't specified (yet). Sometimes it's a type that can't be represented in the type system.Any types should rarely generate false positives. It's totally okay to generate false negatives. The main purpose of Any is to turn type errors we don't want to see into false negatives.Any is designed to be used while migrating code to static typing. In fully migrated code it should be used rarely. (In practice, fully migrating code is often impractical, but we are a bit idealistic here and assume that this gets much easier in the future.)Any type with Any should not generate additional errors (there are some exceptions but they should be very rare). This allows using libraries without stubs without resulting in many false positives. This also allows type checking only a subset of a codebase during migration to type checking -- imports from missing modules will result in Any types.I think we should add the is_subtype() to the puzzle (in mypy functions terminology, this is what is called compatible in the more common gradual typing terminology). The current rules are:
(1) is_subtype(Any, C) == True
(2) is_subtype(C, Any) == True
(3) is_subtype(NoReturn, C) == True
(4) is_subtype(C, object) == True
The rules for join/meet should satisfy their definitions i.e. is_subtype(meet(A, B), B) must be True etc. I think this is currently OK. The proposed change also seems fine.
I would propose to add a fifth rule to how to think about Any:
join(Any, C) == Any then this also holds ifC is a corner case like object or NoReturn.This rule already holds. I would propose to keep it, so that meet(Any, C) == Any for all C including NoReturn. Note however this conflicts with the idea proposed in #5818 to return False for is_subtype(Any, NoReturn).
Most helpful comment
I may well disagree with the rules for meet as well, but this needs some thought as the issue is subtle so I'm focusing on join first. By the way, I'm not convinced that this is analogous to the union simplification fix and this needs to argued for based on its merits.
One of the reasons why
Anytypes are 'contagious' is that that it can highlight code that is unsafe. If something has typeAny, using the value is potentially unsafe. If using something is potentially unsafe, inferringAnyis generally the right thing to do, unless the user overrides this with an explicit annotation. The proposed rules for join would make mypy less safe in that sense. Example:With the current semantics, the receiver object on the final line has type
Any, so it's clear that the method call is unsafe, which is correct. With the proposed semantics, the type would beA, and the call would look safe, even though it's not. We can argue that inferringList[Any]forxis safer thanList[A], since the actual runtime type could well be a supertype ofA.