Vision: Type annotations

Created on 28 Mar 2020  路  15Comments  路  Source: pytorch/vision

馃殌 Feature

Type annotations.

Motivation

Right now, if a project depends on torchvision and you want to static type check it, you have no choice but either ignore it or write your own stubs. torch already has partial support for type annotations.

Pitch

We could add type annotations for torchvision as well.

Additional context

If we want this, I would take that up.

enhancement help wanted

Most helpful comment

@pmeier
1 - let's do the annotations inline. I think the reason PyTorch used the .pyi file for the annotations was due to Python2 compatibility (but I might be wrong). @t-vi do you have thoughts on this?
2 - let's split it in smaller chunks which are easier to review. It doesn't even need to be the full module at once, for example transforms is very big, can we can send a few functions at a time.

All 15 comments

Yes, please! Python-3 type annotations to torchvision would be great!

Two questions:

  1. Should I stick to stubs, i.e. create an .pyi file for every .py file, or do we want the annotations inline?
  2. Do you want me to do this in one gigantic PR or should I split it by module or package?

@pmeier
1 - let's do the annotations inline. I think the reason PyTorch used the .pyi file for the annotations was due to Python2 compatibility (but I might be wrong). @t-vi do you have thoughts on this?
2 - let's split it in smaller chunks which are easier to review. It doesn't even need to be the full module at once, for example transforms is very big, can we can send a few functions at a time.

Type annotations ToDo

  • [x] utils.py (#2034)
  • [ ] datasets
  • [ ] transforms

    • [ ] functional.py

    • [ ] transforms.py

  • [ ] models
  • [ ] io
  • [ ] ops

I don't know if there is already an internal discussion about this, but in the light of inline type annotations I think we really should talk about the code format. With type annotations the signatures get quite big and with in our current style this not only looks ugly but also hinders legibility:

def save_image(tensor: Union[torch.Tensor, Sequence[torch.Tensor]], fp: Union[str, io.FileIO, io.BytesIO],
               nrow: int = 8, padding: int = 2, normalize: bool = False, range: Optional[Tuple[int, int]] = None,
               scale_each: bool = False, pad_value: int = 0, format: Optional[str, io.FileIO] = None) -> None:

The same signature formatted with black looks like this

def save_image(
    tensor: Union[torch.Tensor, Sequence[torch.Tensor]],
    fp: Union[str, io.FileIO, io.BytesIO],
    nrow: int = 8,
    padding: int = 2,
    normalize: bool = False,
    range: Optional[Tuple[int, int]] = None,
    scale_each: bool = False,
    pad_value: int = 0,
    format: Optional[str, io.FileIO] = None,
) -> None:

which (subjectively) looks better and is more legible.


Note that I'm not advocating specifically for black, but rather for a code formatter or more general a unified code format.

I'm probably not the best person to ask because I never liked the pyi files for Python-implemented bits in the first place. That said, my understanding is that think it's fine to do them inline now.

Regarding the formatting: I do agree that cramming as much as possible in each line probably isn't the best way. You might see if PyTorch has a formatting preference in its formatting checks and what's the generally recommended Python way (I never know).

Thanks for the input @t-vi!

@pmeier good point about the code getting messy with type annotations. I don't have a good answer now, although indeed the output of black does seem more readable.

Let me add @cpuhrsch @vincentqb and @zhangguanheng66 for discussion as well

I second the use of black-style inline type hint formatting.

Yes. black makes sense to me as well.

This would also make reviewing easier. If you have a look at this review comment is unclear at first glance which parameter is meant since GitHub only allows to review a complete line.

Although black is a nice formatter, I would be careful with re-formatting the whole codebase at once -- this effectively messes up with blame so that we don't know anymore what was added when and by who.

The author talked very briefly about integrating black into an existing codebase. I've had no contact with blame so its hard for me to asses if hyper-blame could work for us. Could you have a look?

The problem is that most of the users (including Github UI) most probably only uses git blame, so asking it to be changed to hyper-blame is a bit hard

As we will touching every signature while adding annotations we could simply format them with black and leave everything else as is. This way git blame should still work, but we would enhance legibility. If we want to adopt black for the complete codebase we could do so by one commit at a time later.

@pmeier I think we could adopt the function-signature style of black, but leave the rest of the code-base unchanged? If that's what you proposed, I'm ok with it!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

datumbox picture datumbox  路  3Comments

a-maci picture a-maci  路  3Comments

Wadaboa picture Wadaboa  路  3Comments

IssamLaradji picture IssamLaradji  路  3Comments

chinglamchoi picture chinglamchoi  路  3Comments