Vision: Clean up the documentation of transforms

Created on 1 Dec 2020 · 6 Comments · Source: pytorch/vision

📚 Documentation

The methods in functional_pil.py and functional_tensor.py are private but contain a lot of duplicate documentation that is not currently used. It's worth cleaning this up and updating the documentation on functional.py to highlight potential differences or limitations between the two backends.

Here is an example of how we typically highlight backend differences in TorchVision:
https://github.com/pytorch/vision/blob/1b83f46c42a062429ec0604ca8c1beae665790cd/torchvision/transforms/transforms.py#L332-L333

Moreover, as per @voldemortX's comment (see https://github.com/pytorch/vision/issues/3071#issuecomment-748741006), some of the limitations listed in the pydocs are inaccurate because they have already been resolved. These need to be corrected.

enhancement good first issue documentation

datumbox

All 6 comments

Hi! In the cleanup process, if you guys find functional limitations of tensor transforms (i.e. something PIL can do that tensors can't), maybe I could work on them?

voldemortX on 19 Dec 2020

Hi @voldemortX, you are very welcome to do so! I think we already mention these limitations on the transforms.py pydoc linked above, if you want to have a go. :)

datumbox on 19 Dec 2020

Cool! I'll investigate and summarize what can be done soon.

voldemortX on 19 Dec 2020

@datumbox I checked the current documentation in transforms.py and found the following tensor limitations:

1. Symmetric padding mode support: Pad, RandomCrop (seems to be already supported since #2749 and #2373, but the doc was not updated).
2. JIT is unsupported for Lambda, RandomOrder, RandomChoice (does not seem addressable yet).
3. Tensor interpolation modes only support nearest, bilinear, bicubic: Resize, RandomResizedCrop (limited by pytorch interpolate()).
4. Tensor interpolation modes only support nearest, bilinear: RandomPerspective, RandomRotation, RandomAffine (limited by pytorch grid_sample()).
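For readers unfamiliar with the padding modes in the first point, here is a minimal pure-Python sketch of how "symmetric" padding differs from "reflect" padding on a 1-D sequence. The helper name pad_1d is hypothetical and only illustrates the semantics; it is not how torchvision implements either backend.

```python
def pad_1d(values, pad, mode):
    """Pad a 1-D list by `pad` elements on both sides (illustrative helper)."""
    if pad < 0 or pad >= len(values):
        raise ValueError("pad must satisfy 0 <= pad < len(values)")
    if pad == 0:
        return list(values)
    if mode == "reflect":
        # Mirror around the edge element WITHOUT repeating it.
        left = values[1:pad + 1][::-1]
        right = values[-pad - 1:-1][::-1]
    elif mode == "symmetric":
        # Mirror around the border, so the edge element IS repeated.
        left = values[:pad][::-1]
        right = values[-pad:][::-1]
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return left + list(values) + right


reflect = pad_1d([1, 2, 3, 4], 2, "reflect")      # [3, 2, 1, 2, 3, 4, 3, 2]
symmetric = pad_1d([1, 2, 3, 4], 2, "symmetric")  # [2, 1, 1, 2, 3, 4, 4, 3]
```

The only difference is whether the edge value is repeated, which is why the two modes need separate handling on the tensor backend.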

Maybe I can try something for 3 and 4, after some thorough investigation about interpolation modes.

EDIT:
It seems pytorch now supports bicubic in grid_sample(). I'm not sure whether it's wiser to wait for the other modes to be supported in pytorch or to find a workaround.
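A quick way to check which interpolation modes grid_sample() accepts is to sample an image through an identity affine grid. This is only a sketch: it assumes a PyTorch build recent enough to accept mode="bicubic", and the helper name sample_identity is hypothetical.

```python
import torch
import torch.nn.functional as F


def sample_identity(img: torch.Tensor, mode: str) -> torch.Tensor:
    """Resample an (N, C, H, W) tensor through an identity affine grid."""
    n = img.shape[0]
    # 2x3 identity affine matrix, repeated once per batch element.
    theta = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]).repeat(n, 1, 1)
    grid = F.affine_grid(theta, list(img.shape), align_corners=False)
    return F.grid_sample(
        img, grid, mode=mode, padding_mode="zeros", align_corners=False
    )


img = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
# Assumption: "bicubic" raises an error on PyTorch versions that predate its support.
results = {m: sample_identity(img, m) for m in ("nearest", "bilinear", "bicubic")}
```

Swapping the identity matrix for a rotation or shear matrix gives the kind of sampling that RandomRotation and RandomAffine rely on, so whatever modes this call accepts bound what those transforms can offer on tensors.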

voldemortX on 21 Dec 2020

@voldemortX Thanks for the deep dive!

Great work flagging the issue on point 1, I'll update this ticket's description to include it. Unfortunately JIT still does not support Lambdas. Given that points 3 & 4 are quite big and not related to this very specific issue, I would recommend opening an RFC issue where you can outline the potential solutions and request feedback.
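As a side note on the JIT limitation, a common way around an unscriptable Lambda is to express the same operation as a small nn.Module, which torch.jit.script can compile. The sketch below uses only core PyTorch; the Scale class is a hypothetical stand-in for whatever the lambda would have computed, not something from torchvision.

```python
import torch
from torch import nn


class Scale(nn.Module):
    """Hypothetical replacement for a lambda such as `lambda img: img * factor`."""

    def __init__(self, factor: float):
        super().__init__()
        self.factor = factor

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return img * self.factor


# Scripting works because forward() is typed, ordinary TorchScript-compatible code.
scripted = torch.jit.script(Scale(0.5))
out = scripted(torch.ones(3, 4, 4))
```

This does not make Lambda itself scriptable; it only sidesteps the limitation for operations simple enough to write as a module.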

Finally, if you are interested in fixing the problems listed in this issue, feel free to send a PR.

datumbox on 21 Dec 2020

At the moment I don't have an elegant workaround for the interpolation modes. If I or somebody else wants to do this, it is probably better done in pytorch than in torchvision.