Pylance-release: Issue about type annotation

Created on 11 Jan 2021 · 24Comments · Source: microsoft/pylance-release

I think we should allow type modification after function returns. This is essential for Union and subclass.
See 2 cases below:

from typing import Union

def func(x: Union[int, float]) -> Union[int, float]:
    return x

x: float = func(13.0)
# However, x is still Union[int, float] according to Pylance.

class A:
    pass

class B(A):
    utils: int = 13
    pass

def func2(x: A) -> A:
    return x

y: B = func2(B())
y.utils = 15  # However, y is still regarded as A type and doesn't have attribute utils according to Pylance.

fixed in next version

Source

ain-soph

All 24 comments

You have a misunderstanding of how type systems and type checking work. It can be confusing at first if you haven't used a typed language. For details of type checking within Python, please refer to PEP 483 and 484.

Let me explain some concepts based on the two examples you provided.

In your first example, the declared type of x is float. If it is assigned a subtype of float, it can be temporarily "narrowed" to that subtype. The type Union[int, float] is a narrowed version of float, so if you hover over x at that point in the code flow graph, it will show that narrowed type. If you were to add the statement x = 1 to the bottom of your first example and then hover over the x, you would see that it now has a narrowed type of Literal[1] at that point in the code flow graph. The type checker generally attempts to produce the narrowest type possible based on assignments and conditional checks. For more details about type narrowing, refer to this documentation.

In your second example, you have declared that y must be of type B, but you're attempting to assign it an instance of A, which is not a subtype of B. The type checker correctly reports this type violation.

If your intent is for func2 to accept any subclass of A and return the same subclass, then you would need to make it a generic function that uses a bound TypeVar, like this:

from typing import TypeVar

class A:
    pass

class B(A):
    utils: int = 13
    pass

_T = TypeVar("_T", bound=A)

def func2(x: _T) -> _T:
    return x

y: B = func2(B())
y.utils = 15

Hope that helps.

erictraut on 11 Jan 2021

@erictraut Thanks for your detailed explanation! It makes me clear about the type narrowing. And I realize that the declaration goes first, and then narrow to the method return type.

However, there still some cases confusing.

import numpy as np

c: np.ndarray = np.bincount([1, 1, 3, 3, 4])

If I run the program, np.bincount will return a np.ndarray for sure. So it is written in the docs.
However, the Pylance claims the return type will be a tuple because numpy doesn't write explicit return type annotation, and Pylance makes predictions from https://github.com/numpy/numpy/blob/e745a19cb0ea9af0fd252a4625ec793130f87414/numpy/core/multiarray.py#L952 but ignores the decorator function that further transforms tuple to np.ndarray.

This is certainly not correct and will cause violation, and should be fixed (by some stub pyi files perhaps?).
I just wonder from user side, can I manually fix it by something like c: np.ndarray = np.bincount([1, 1, 3, 3, 4]). (I know this is certainly not working now.) Currently, this declaration won't work and the type of c is still tuple for Pylance.

I still have questions about the Union and subclass cases. But I think I need to first go over the listed PEPs and come back to this issue later.

ain-soph on 11 Jan 2021

If you want to enable type checking, annotated libraries are very important. Pylance will do its best to infer types in the absence of annotations, but inference is prone to errors, and you'll run into many issues. Parts of numpy are even delivered as native libraries, so there's no Python code for Pylance to use as a source for inference. So you really need type stubs for type checking to work. Try data-science-types and see if that meets your needs.

erictraut on 12 Jan 2021

👍1

Yes, I admit that using stub could solve this problem for numpy, and that should be a standard that every library should have correct type annotation.
But the issue here is whether the user could handle that in their codes. I know the function return type annotation in some other libraries is actually not correct or missing, could I fix it in my code by a declaration that c is actually a np.ndarray?

Well, it seems not. And Pylance will keep the wrong prediction forever no matter what I do in the code (unless I write a stub elsewhere.).

I think it might need some feature improvement to support this?

ain-soph on 12 Jan 2021

Numpy support is tracked in #150. Numpy itself may ship with stubs soon (depending on their quality).

I'd think you'd be able to manually override the type and stick a # type: ignore on it, but if type checking is off you won't get any errors. I'd be surprised if it didn't respect your annotation.

jakebailey on 12 Jan 2021

@jakebailey
Thanks for your involvement. As you can see that the first tolist is white color and no linting because c is still regarded as a tuple, and I can't go to definition.
As an expected example, you can see e.tolist() where tolist is shown in yellow color and I can go to definition. This is because np.ones returns an Any.

ain-soph on 12 Jan 2021

This shows that annotating the variable works (what you described as a fix), it's just that ndarray doesn't have a typed tolist, which is a different problem.

jakebailey on 12 Jan 2021

This shows that annotating the variable works (what you described as a fix), it's just that ndarray doesn't have a typed tolist, which is a different problem.

Nope? Because e has tolist and I can go to definition and see the method implementation. Even though it actually goes into a pyi stub in vscode pylance plugin. But that's another issue I've seen in the issue list.

ain-soph on 12 Jan 2021

You can certainly declare the types of variables within your own code. However, you can't declare the return types of library functions in your own code. You would need to write a stub file to do that.

You can call the typing.cast function to override the type of any expression.

from typing import cast

# Cast is required because np.bincount function is not annotated.
c = cast(np.ndarray, np.bincount([1, 1, 3, 3, 4]))

erictraut on 12 Jan 2021

My mistake, I missed the yellow.

jakebailey on 12 Jan 2021

I guess I'm confused as to why annotating it doesn't work; I would have expected c to be a variable of type ndarray, but then an error be emitted at the assignment that says "this isn't the right type", then later in the code c is still the type it was declared as.

jakebailey on 12 Jan 2021

@erictraut
Yes, that's what I exactly mean. I think it's a little inconvenient for most users if they have to write stub or typing.cast.

To clarify, the expected behavior should be: If the function return type is conflict with the variable declaration, raise Error (or warning?), but the variable type should still be the declaration type.

So it is for the Union or subclass.

If the function return type is a wider range Union[float, int] and the declaration is float, I think the expected variable type should be float rather than Union[float, int], which is a more narrow type.
If the function return type is a wider range A and the declaration is B (B subclass A), I think the expected variable type should be B rather than A, which is a more narrow type.

But I think I might understand incorrectly for the word narrow, since it seems Union[float, int] is more narrow than float.

ain-soph on 12 Jan 2021

Most users don't care about static type checking. For those that do, Python is going to be inconvenient until library authors provide complete and accurate type annotations. We're doing our best to accelerate that.

Yes, you are correct about the expected behavior.

x: int = 3
reveal_type(x) # Literal[3]
x = 5.4 # This line generates an error because 5.4 is a float
reveal_type(x) # int

In the above example, the type of x is narrowed from int to Literal[3] because of the first assignment. The second assignment is an error because 5.4 is a float, not an int. The type of x reverts back to its declared type int after the illegal assignment.

erictraut on 12 Jan 2021

@erictraut
Thanks for your instance. But I don't quite agree with it? In python, after x = 5.4, it becomes a float, and pylance identifies that correctly. I think it's expected.
In C++, that will raise an error for sure.

ain-soph on 12 Jan 2021

You have typeCheckingMode set to "off". If you enable type checking (i.e. set python.analysis.typeCheckingMode to "basic"), you will see the following:

Screen Shot 2021-01-11 at 4 56 13 PM

If you're not interested in type checking and simply want completion suggestions and other language features, you should leave type checking off.

erictraut on 12 Jan 2021

Thanks, you are correct. I've tested with python.analysis.typeCheckingMode="basic" and it works. (But I get thousands of error reports in my project 😂 I thought I was a careful python programmer, so naive.)

So the current issue is to get the expected behavior for python.analysis.typeCheckingMode="off"?
(Personally, I think the type of x is float is very good when mode='off' in the previous example and we should just keep that.)

We just need to solve the conflict between variable type declaration and method return type as np.bincount example shows.
At least allow users to fix the wrong prediction by Pylance.

ain-soph on 12 Jan 2021

If type checking is off, you will not see any type-checking errors. You will receive completion suggestions and other language features but no type checking. Why do you care what types Pylance predicts as long as you're getting the best possible completion suggestions in most cases?

Pylance intentionally changes its behavior when type checking is disabled to favor better completion suggestions based on the types of the values assigned, versus the declared types. The assumption is that if type checking is disabled, you don't care about the declared types as much as the actual assigned types.

What you're proposing is that we don't change the behavior between the two modes. That may address the specific examples you've cited, but I think it will create even problems in more situations. In other words, I think such a change would be a net negative for most Pylance users. We'll discuss this internally and get back to you, but right now I'm not inclined to change the behavior.

erictraut on 12 Jan 2021

❤1

Yes. I don't care about the type-checking errors. I just want completion suggestions.
But now I have no auto-completion for c.tolist. That's the problem.

The problem is caused by Pylance getting wrong type predictions and users can't fix it manually without using stub or typing.cast.

I know it might be difficult to handle. Just hope we can finally get a good solution.

import numpy as np

c: np.ndarray = np.bincount([1, 1, 3, 3, 4])
c.tolist()

ain-soph on 12 Jan 2021

It's actually trivial to change the behavior and make it consistent. It simply involves deleting an if statement in the code here: https://github.com/microsoft/pyright/blob/f1da893256092f55549c1b38710bf4aec9a9f075/packages/pyright-internal/src/analyzer/typeEvaluator.ts#L2497. I'm still not convinced it's a good tradeoff though. I think it will fix some cases but break even more of them.

erictraut on 12 Jan 2021

Thanks a lot for your patient discussion. Hope it can get solved!
Sorry for not being able to get so deep into Pyright, but I'll keep watching on this issue for sure.

ain-soph on 12 Jan 2021

We had a good internal debate about the pros and cons of making this change. We decided to make the change and get feedback. If many Pylance users dislike it, we could revert it or look for some other solution.

erictraut on 12 Jan 2021

We had a good internal debate about the pros and cons of making this change. We decided to make the change and get feedback. If many Pylance users dislike it, we could revert it or look for some other solution.

You guys are really working efficiently! So damn quick, which violates my previous impression on Microsoft😂

ain-soph on 13 Jan 2021

This issue has been fixed in version 2021.1.1, which we've just released. You can find the changelog here: https://github.com/microsoft/pylance-release/blob/main/CHANGELOG.md#202111-13-january-2021

jakebailey on 14 Jan 2021

I personally support this change, I have type-checking off, and I prefer to have a way to force a type without using cast (by using the type annotation as in case 1 here).
I don't fully understand the cons of this change, but if I find any I'll report them here.