Mypy: Mypy errors with variable reuse with different types

Created on 27 Jan 2016 · 23 Comments · Source: python/mypy

from typing import *

class Test(object):
    x = 1
    y = 2

for sub in [(1, 2)]:
    pass

subs = {}  # type: Dict[Tuple[int, int], Test]
for sub in [Test()]:
    subs[(sub.x, sub.y)] = sub

gives

/home/tabbott/foo.py:11: error: Incompatible types in assignment (expression has type "Test", variable has type "Tuple[int, int]")
/home/tabbott/foo.py:12: error: Tuple[int, ...] has no attribute "x"
/home/tabbott/foo.py:12: error: Tuple[int, ...] has no attribute "y"
/home/tabbott/foo.py:12: error: Incompatible types in assignment

Arguably this is not perfect code, but probably this shouldn't be a mypy error?

false-positive needs discussion priority-0-high topic-usability

timabbott

👍7 😕1

All 23 comments

I think this is a duplicate of a common theme: variable reuse within one function body. Mypy really prefers that you don't do that. (I can't find a specific issue to point to though.)

gvanrossum on 27 Jan 2016

Yeah, this one has been discussed many times though there doesn't seem to be a github issue for this.

I'm not really sure about what's the best way to approach this. Perhaps mypy should allow redefining a variable with a different type by default and provide a strict mode or option which disallows this.

A general solution would be to logically define a fresh variable each time an assignment redefines a variable on all possible code paths. This would define a new variable, since the old definition can't reach beyond the second assignment:

if c:
    x = 1
x = ''  # new variable x

This would not define a new variable:

x = 1
if c:
    x = ''  # does not define a new variable (error)
print(x)

Not sure about this:

x = 1
print(x)
if c:
    x = ''  # ??
    print(x)
# x not read here any more

A less general approach would be to allow redefining with a new type only if neither the original definition nor the new definition has a type annotation. So this would be okay:

x = 1
...
x = ''

But this would not be okay because the annotation is lying:

def f(x: int) -> None:
    x = str(x)  # error?

And this would not be okay:

x = []  # type: List[int]
...
x = 3  # error?

JukkaL on 4 Feb 2016
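
To make the "fresh variable" idea above concrete, here is a minimal sketch that spells out by hand the renaming the proposal would do implicitly. The function names and the x_int / x_str names are hypothetical and only serve the illustration; this is not how mypy implements anything.

```python
def redefined_on_all_paths(c: bool) -> str:
    # Original: "if c: x = 1" followed by "x = ''".
    if c:
        x_int = 1  # first definition; nothing can read it after the next line
    x_str = ''  # the "new variable x" from the comment, spelled out
    return x_str


def redefined_on_some_paths(c: bool) -> object:
    # Original: "x = 1", then "if c: x = ''", then a read of x.
    # Both definitions can reach the final read, so a single variable has to
    # carry both types and renaming is not possible.
    x: object = 1
    if c:
        x = ''
    return x
```

In the first function the two definitions never meet, so giving them separate names changes nothing at runtime. In the second, the read after the if statement sees either value, which is why the proposal treats it as an error rather than a new variable.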

For what it's worth, pytype always allows you to redefine a variable with a different type. We're essentially doing SSA.

On the other hand, that also means that pytype considers code like this correct:

def f(x: int):
  x = "foo"

I know it looks odd, but allowing this pattern is quite useful when adding function annotations to existing code.

matthiaskramm on 13 Apr 2016
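
For readers unfamiliar with SSA (static single assignment), the following toy sketch shows the core renaming step for straight-line code only. The function ssa_rename and its input format are made up for this illustration; it is not pytype's implementation and it ignores branches and loops entirely.

```python
from typing import Dict, List, Tuple


def ssa_rename(assignments: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Give every assignment target a fresh versioned name.

    Each input item is (variable name, inferred type name). Because every
    assignment creates a new version, no version ever has two types.
    """
    versions: Dict[str, int] = {}
    renamed: List[Tuple[str, str]] = []
    for name, type_name in assignments:
        versions[name] = versions.get(name, 0) + 1
        renamed.append((f"{name}_{versions[name]}", type_name))
    return renamed
```

For example, ssa_rename([("x", "int"), ("x", "str")]) yields x_1 with type int and x_2 with type str, so the reassignment is not a conflict.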

I propose moving this to an earlier milestone such as 0.4.0, at least tentatively, since issues like these are a little painful to refactor, and they can generate a lot of noise. This is more of a style issue than a correctness issue, and I'd rather not make mypy very opinionated about style issues. An optional flag that causes mypy to complain about questionable redefinitions would be better in my opinion.

JukkaL on 14 Apr 2016

Have we run into this issue a lot internally? It looks to me like this'll be a lot of work -- we'd have to start doing a significant amount of control flow analysis, and I expect doing that properly will be somewhat difficult.

ddfisher on 14 Apr 2016

I've hit this in some open source code I've experimented with. If we run mypy against a lot of code without annotations using --implicit-any we should get an estimate of the prevalence via type errors.

JukkaL on 14 Apr 2016

When you're checking this out, I'd recommend at least also checking the tests directories for the projects; my experience was that our test code had a much higher ratio of these than the rest of the codebase.

timabbott on 14 Apr 2016

Also just FYI --implicit-any is now --check-untyped-defs.

ddfisher on 14 Apr 2016

It's --check-untyped-defs now.

I ran this over a small corpus (under 150K LOC) and found, among a total of over 1200 errors, about 30 occurrences of this message (though 9 of these didn't have the (expression has type "xxx", variable has type "yyy") suffix -- those were from assignments into container items like x[a] = b).

A bunch (indeed most common in tests) were unrelated types, but in many cases arguably the error was due to a too narrowly inferred initial type. E.g.

x = randint()
x = randint() / 1000.0

A bunch were similar but the two types involved were different subclasses of the same superclass. Common was also assignment of differently sized tuples to a variable, e.g.

x = ()
x = (1,)
x = (1, 2)

I've also seen things like

if ...:
    iter_ = [some list]
else:
    iter_ = <some iterable>
for i in iter_: ...

Other complicating factors were that a few times the assignments occurred in except clauses or other "blocks" where the variable wasn't used after exiting the blocks.

All in all I do think this is a popular idiom that we ought to support better.

gvanrossum on 14 Apr 2016
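
For the "too narrowly inferred initial type" cases above, a common workaround is to annotate the first assignment with the wider type. This is a sketch under assumptions: random.randint stands in for the unspecified randint(), and the function names and the concrete lists are invented for the example.

```python
import random
from typing import Iterable, Tuple


def scaled() -> float:
    # Declaring float up front; an int value is accepted where float is declared.
    x: float = random.randint(0, 1000)
    x = x / 1000.0
    return x


def grow() -> Tuple[int, ...]:
    # A variable-length tuple type covers (), (1,) and (1, 2).
    x: Tuple[int, ...] = ()
    x = (1,)
    x = (1, 2)
    return x


def total(use_list: bool) -> int:
    # Declare the common supertype once, then assign in either branch.
    iter_: Iterable[int]
    if use_list:
        iter_ = [1, 2, 3]
    else:
        iter_ = range(1, 4)
    return sum(iter_)
```

This does not help when the two types are genuinely unrelated, which is the case the rest of the thread is about.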

Perhaps we could start with special casing some common idioms instead of trying to support all of the possible idioms. Here my concern is that the rules that mypy follows should be easy to describe and predictable, and approaches like SSA wouldn't really fit this description.

Here are some idioms that would be easy enough to support and that I think are pretty common.

1) Redefined in the same block

Examples:

def f(x: int) -> None:
    x = str(x)
    print(x)

y = 1
print(y)
y = ''
print(y)

Variables within for, while and try (what about with?) statements would be harder as an intermediate value may escape.

Here nested blocks would be considered separate blocks, and we consider function arguments to be defined in the same block as the function body.

2) Variable never read outside statement that defines it

Examples:

for x in (1, 2): ...
for x in ('a', 'b'): ...
# x not read outside 'for' statements

_ = 1  # variable defined but never read
_ = ''

Also 'with'. For try/except this might already work. For module-level variables we'd take the last definition as the type of the variable, but if there is a conditional definition things get harder.

3) Conditional definitions don't agree on type

This is actually a separate issue as here we have to infer a single type for a variable. Example (from above):

if ...:
    x = [...]
else:
    x = <iterable>
y = x

JukkaL on 15 Apr 2016

👍1

Sounds like a common theme is that if variables are defined and redefined at the same indentation level ("in the same block") and there's no local flow control in between we should check the code between the definitions using the first definition and after that use the second (last) definition. E.g.

if cond():
    x = 1
    x+1
    x = ''
else:
    x = [1, 2, 3]
    x.append(4)
    x = ''

Should work because the ultimate type in each block is the same.

However this is questionable:

while cond():
    x = f()
    if x < 0:
        break
    x = ''
# The type of x here could be int or str.

In a try block everything is questionable:

try:
    x = 0
    foo(x)
    x = ''
except Exception:
    pass
# Here x could be int or str.

A future improvement could ignore the ambiguity when the variable isn't used afterwards.

For branches that don't agree we could have some kind of unification (like I want for conditional expressions e.g. #1094).

gvanrossum on 15 Apr 2016
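
Where the type after a loop or try block really can be either of two types, as in the while and try examples above, one option available today is to declare the union explicitly and narrow with isinstance at the point of use. A minimal sketch, with scan and describe as hypothetical names and a for loop over a list standing in for the while cond() loop:

```python
from typing import List, Union


def scan(values: List[int]) -> Union[int, str]:
    # x can leave the loop as an int (via break) or as a str.
    x: Union[int, str] = ''
    for value in values:
        x = value
        if value < 0:
            break
        x = ''
    return x


def describe(x: Union[int, str]) -> str:
    # isinstance narrows the union back to a single type in each branch.
    if isinstance(x, int):
        return f"stopped at {x}"
    return "ran to completion"
```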

Moved to 0.5 (in some generality).

JukkaL on 16 Apr 2016

@rhettinger showed me an example which redefines a set as a frozenset, and apparently he sees cases similar to this pretty often. Here is a sketch (I may have forgotten about some specifics, though):

def f() -> None:
    items: Set[int] = set()
    # populate items with values
    items: FrozenSet[int] = frozenset(items)
    # do stuff with items; we want to make sure it doesn't get mutated here

JukkaL on 30 May 2017

👍6
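
Until redefinition is supported, the set-to-frozenset pattern above can be written with two names so that each one keeps a single type. This is only a sketch of the workaround; the name mutable_items and the loop that fills the set are assumptions made for the example.

```python
from typing import FrozenSet, Set


def build() -> FrozenSet[int]:
    mutable_items: Set[int] = set()
    for i in range(3):
        mutable_items.add(i)  # populate while the set is still mutable
    # From here on only the frozen view is used, so accidental mutation
    # such as items.add(...) would be rejected by the type checker.
    items: FrozenSet[int] = frozenset(mutable_items)
    return items
```

The cost is the extra name, which is exactly what the comment above would like to avoid.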

Increasing priority since this is such a common issue.

JukkaL on 20 Feb 2018

🎉14 👍6

Hi all - is there any good way to proceed resolving this issue? Thanks

dsully on 1 Aug 2018

You could submit a PR.

gvanrossum on 1 Aug 2018

❤1

I would like to suggest a way to allow explicit reassignment, without the hassle of specifying the whole type each time. Basically you'd be allowed to reassign a variable with a different type when you explicitly specify that type like

x: int = 5
x: List[int] = [x]  # allowed
x = "Hello world!"  # not allowed

but there would also be a special type (a bit like there is a special type for type variables) that allows you to explicitly infer the type:

from typing import Infer as _

x: int = 5
x: _ = [x]

That would also allow you to assert only part of a type:

from typing import Infer as _

x: int = 5
x: List[_] = [x]

zroug on 20 Aug 2018

👍2

If this issue is "fixed", can restoring the old behavior be made available via a strictness flag in mypy.ini? I find that overriding a variable's type with a new one is nearly always something I don't want to do intentionally (it's easy to do accidentally, e.g. by replacing a set with a list and then getting terrible performance for 'in' checks for the rest of the function) and, even when necessary, it just makes the code harder to read because I'm less sure of the type of a particular name at any given point. (For the same reason I never use variable shadowing in other languages.)

jhance on 21 Sep 2018

@jhance A strictness flag to enable the old behavior is possible, but it might not be included initially. I'm currently planning to only support redefinition in the same block, as this seems like the most common case and it's the easiest one to support.

JukkaL on 24 Sep 2018

Basic implementation was added in #6197, behind a flag. At the moment only simple cases where the redefinition happens in the same block as the original definition, and at the same nesting level, are supported. I'll create follow-up issues to make this more general. I'd still like to enable this by default at some point.

JukkaL on 21 Jan 2019

🎉3
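
As an illustration of the "same block, same nesting level" case described above, the sketch below redefines a local variable directly in the function body. The function parse is hypothetical, and the typing comments describe what one would expect when mypy is run with the --allow-redefinition flag mentioned later in the thread; without the flag the second assignment is reported as an incompatible assignment.

```python
from typing import List


def parse(raw: str) -> List[int]:
    fields = raw.split(",")  # first definition: List[str]
    # Redefinition in the same block and at the same nesting level,
    # after the first definition has been read on the right-hand side.
    fields = [int(field) for field in fields]  # second definition: List[int]
    return fields
```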

Is there any solution? I am using mypy 0.770 and porting Python 2 code where I have hundreds of cases like the one below. I cannot afford to give a new name to every variable: it would introduce a lot of changes that would be really hard to verify.

I am OK with adding code or a "# type: " directive at "Point X". I tried using 'del val', but it did not work. All I want is to forget the type of this variable at Point X (I want to keep the type information in between, though). And 'val' could be a class.

def func_5() -> int:
    return 5

def func_x() -> str:
    return "x"


def example(t1: bool, t2: bool) -> bool:
    if t1:
        val = func_5()
        if val != 5:
            return False

    # Point X

    if t2:
        val = func_x()
        if val != "x":
            return False

    return True

alex4747-pub on 14 Nov 2020

@alex4747-pub try --allow-redefinition.

ethanhs on 14 Nov 2020

Did not help:
mypy --allow-redefinition example.py
example.py:18: error: Incompatible types in assignment (expression has type "str", variable has type "int")
Found 1 error in 1 file (checked 1 source file)

alex4747-pub on 14 Nov 2020
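
A likely explanation, going by the description of the flag earlier in the thread: the two assignments to val sit in different if blocks, and only redefinition within the same block and nesting level is supported. One workaround that avoids renaming is to declare val once with a union type before either block. This is a sketch of that workaround, not a confirmed recommendation from the mypy maintainers; it assumes the values only ever come from func_5 and func_x.

```python
from typing import Union


def func_5() -> int:
    return 5


def func_x() -> str:
    return "x"


def example(t1: bool, t2: bool) -> bool:
    # Declared once, without a value, before either block assigns it.
    val: Union[int, str]

    if t1:
        val = func_5()  # treated as int inside this block
        if val != 5:
            return False

    if t2:
        val = func_x()  # treated as str inside this block
        if val != "x":
            return False

    return True
```

The declaration is a single added line per function, so the existing variable names and control flow stay untouched.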
