Pylance-release: Support changing a variable's type

Created on 18 Aug 2020 · 10 Comments · Source: microsoft/pylance-release

If I have a variable of known type and want that name to now reference an object of another type, pylance complains. E.g.:

def parse_int(x: str):
    x = int(x)
    return x

gives

Expression of type "int" cannot be assigned to declared type "str"
  "int" is incompatible with "str" Pylance (reportGeneralTypeIssues)

I can see how this would sometimes be useful to catch, but I would personally prefer for this not to be seen as an issue. I did try to explicitly declare the type the second time, to show that I'm intentionally changing it, but that just gives a different error:

def parse_int(x: str):
    x: int = int(x)
    return x
Parameter declaration "x" is obscured by a declaration of the same name Pylance (reportGeneralTypeIssues)

I believe mypy supports an option to allow such type changes; could support for this be added to pylance?

(I'm not sure if this request should be on pylance or pyright; let me know if I'm in the wrong place or if there's already a configuration option I can change somewhere to achieve this)

enhancement

All 10 comments

The current behavior is correct. You are declaring that the symbol x is type str. Any attempt to assign it a value that is not compatible with this type must be considered a type violation.

The correct way to code this is to choose a new symbol name. Even without type checking, this is good coding practice.
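
For illustration, a minimal sketch of the new-name approach applied to the parse_int example from the original report. The names `text` and `value` are arbitrary choices for this sketch, not something prescribed in this thread.

```python
def parse_int(text: str) -> int:
    # The parameter keeps its declared type (str); the converted
    # result is bound to a separate name with its own type (int).
    value = int(text)
    return value
```

Because each name has exactly one type, a checker should have nothing to complain about here.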

Closing old issue, please reopen if this is still an issue.

I have just hit this issue as well.

I understand @erictraut's reasoning. It makes sense, and many languages like C# and TypeScript are designed with this reasoning in mind: redeclaring a variable in the same scope is forbidden.

On the other hand, languages like F# and Rust not only allow shadowing local declarations but encourage this design (unsure about Rust, but totally sure about F#). Naming variables is hard, and coming up with new names in cases where different types represent the same concept can also be error-prone.

I have the following use case in mind (it's trickier in reality; I have simplified it for brevity):
I have types like X to represent untrusted input and, e.g., ProcessedX to represent values that have passed validation and been transformed to a canonical representation. I'd like to name all such values x because they represent the same concept. Shadowing the previous declaration in this case removes the possibility of using the untrusted value after validation has occurred.

To be clear, if you provide no type declaration for a variable, pyright allows it to take on any value. For example, this is fine:

def parse_int(x):
    x = int(x)
    return x

But once you specify a variable's type, it's the job of a type checker to verify that any assignment to that value matches that type.

def parse_int(x: str): # x is defined as str
    x = int(x) # Error: not a str!
    return x

In the use case you described (validating untrusted input), you can rely on type narrowing:

def process_input(input: object): # Input can be anything
    if isinstance(input, str):
        reveal_type(input) # Type is narrowed to str

You may also be interested in user-defined type guards, which are defined in draft PEP 647 and are already implemented in pylance/pyright.
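
As a rough sketch of what a user-defined type guard can look like, assuming Python 3.10+ (where `TypeGuard` lives in `typing`) or the `typing_extensions` backport. The function names `is_str_list` and `join_words` are hypothetical and only serve this example.

```python
from typing import List

try:
    from typing import TypeGuard  # Python 3.10+
except ImportError:
    from typing_extensions import TypeGuard  # backport for older versions


def is_str_list(values: List[object]) -> TypeGuard[List[str]]:
    # Returning True tells the type checker that `values` is a List[str].
    return all(isinstance(v, str) for v in values)


def join_words(values: List[object]) -> str:
    if is_str_list(values):
        # Inside this branch the checker treats `values` as List[str],
        # so passing it to str.join is accepted.
        return " ".join(values)
    return ""
```

The untrusted value keeps a single name and a single declared type; only the narrowed view of it changes inside the guarded branch.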

I have this issue as well. I don't want the type checker to complain when I am explicitly reassigning a new value to the variable.
I want it to complain only if it happens implicitly like when passing arguments to functions.

As @vlaci mentioned above, both F# and Rust allow us to do this:

let x = 5;
let x = "Some string";

And the type checker is smart enough to treat x as int before but String later.
I guess one problem with Python is that it doesn't have a "let" keyword to declare that we are explicitly reassigning, so complaining by default, as pyright/pylance does, is a good choice.
But it would also be good to have a setting that allows changing the type on explicit reassignment while still checking implicit reassignment.

If a variable or parameter does not have an explicit type annotation, pyright will allow you to assign (and reassign) any type to it. So this is allowed by pyright:

x = 5
x = "Some String"

I'll note that mypy (another popular Python type checker) does not allow this. It always "fixes" the variable's type based on the first assignment. There's no good reason to be this restrictive (that is, it doesn't provide any additional type safety), so pyright allows more flexibility here.

However, when a variable or parameter is explicitly annotated, it's the job of a type checker to enforce type consistency based on that annotation.

x: int = 5
x = "Some String" # Type consistency error
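
If a single name really needs to hold values of more than one type, one option that stays within a single declaration is to annotate it with a union up front. This is only a sketch of that idea, not something proposed in this thread; the helper `describe` is hypothetical.

```python
from typing import Union

# One declaration that covers every type the name will hold.
x: Union[int, str] = 5
x = "Some String"  # Accepted: str is part of the declared type


def describe(value: Union[int, str]) -> str:
    # isinstance narrows the union within each branch.
    if isinstance(value, int):
        return f"int:{value}"
    return f"str:{value}"
```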

The problem with the first way (no annotations) is that when defining functions, not giving type annotations is equivalent to just not doing type checking at all (and pylance will complain about Unknown types).
It is good only for local variables that do not depend on the function parameters.

I still think that the default choice of giving type errors in these cases is fine. But there should be a way to disable that by a setting or maybe by explicit type annotation like this:

x: int = 5
x = "Some string"  # Type consistency error
x: str = "Some string"  # Not a type consistency error because I have explicitly annotated

But the third line also gives an error, saying that the previous declaration of x is obscured by a new declaration of the same name. This is reported under the diagnostic type "reportGeneralTypeIssues". Disabling that would disable almost all type checking, such as detecting a wrong parameter type passed to a function. Maybe this should be a separate type of issue that we can disable in the settings.

It doesn't make sense to declare a variable as one type and then declare it as a different type within the same scope. Code flow is not always linear. The presence of loops and conditional statements would make the meaning ambiguous.

Consider the following:

x: int = 3

if some_condition:
    x: str = "Some String"

# What type should be enforced at this point? Should this
# be considered legal or illegal? It's ambiguous.
x = 5

while some_other_condition:
    x: bytes = b""

# Same thing here. This is ambiguous.
x = "hi"

For these reasons, it doesn't make sense to allow a variable to be declared with different types within the same scope.

Oh, these are good cases that I hadn't thought about. Thanks.
So the problem is that Python conditionals and loops don't create a new scope.
This is a problem Rust and F# don't have, because they create a nested scope for conditionals and loops.

However, I just found that mypy has a flag, --allow-redefinition:
https://github.com/python/mypy/pull/6197
https://github.com/python/mypy/pull/6871
It gets around the problem of loops and conditionals by allowing redefinition only at the same indentation level.
It also adds the extra restriction that the variable must be read once before it is redefined.
Do you think it would be good to add something like that to pylance?
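
A small sketch of the kind of code that mypy's --allow-redefinition flag is meant to accept. The function name `process` is illustrative. Without the flag, mypy would be expected to reject the reassignment because of the parameter's declared type, and pyright reports it regardless.

```python
from typing import List


def process(items: List[str]) -> List[List[str]]:
    # `items` starts out with the declared type List[str].
    # It is read on the right-hand side before being redefined, and the
    # redefinition is in the same block and at the same nesting level.
    items = [item.split() for item in items]
    # Under --allow-redefinition, mypy now treats `items` as List[List[str]].
    return items
```

At runtime the code behaves the same either way; the flag only changes what mypy accepts.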

Yes, you're correct that most other languages have support for nested scopes. Python's scoping rules are very baroque and internally inconsistent (e.g. most loops don't introduce a new scope but list comprehensions do), but they are what they are.

The "--allow-redefinition" mode was added to mypy as a way to work around the fact that it normally "fixes" the variable's type after the first assignment. Pyright doesn't do that, so there's no reason to work around that issue.

As for allowing a variable to be annotated multiple times with conflicting types, we don't have any intention of supporting that in pyright. It would violate type safety, go against the rules of PEP 484, and undermine a bunch of other assumptions within pyright's core type checking engine.
