Vision: [feature proposal] Composing transformations with add magic method

Created on 24 Mar 2018 · 11Comments · Source: pytorch/vision

TLDR: transform = Resize((299, 299)) + ToTensor()

Motivation

Consider the common use case of building an overall-common-but-differing-at-one-point transformation pipeline for train and test sets, for instance applying augmentation transforms only to train set:

train_transform = transforms.Compose([
    Resize((299, 299), interpolation=1),
    RandomCrop(299, padding=4),
    RandomHorizontalFlip(),
    RandomGrayscale(),
    ToTensor(), 
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

test_transform = transforms.Compose([
    Resize((299, 299), interpolation=1), 
    ToTensor(), 
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

A way of expressing of train_transform builds upon test_transform would be handy. What you could do now is to recursively use transforms.Compose (i.e. Compose([Compose(...resizing...), Compose(...augmentation...], Normalize(...)])), but I don't find this particularly clean (and intuitively, final transforms objects should be sequential objects with only one level of depth).

Proposed behavior

common_transforms = ToTensor() + Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
augmentation = RandomCrop(299, padding=4) + RandomHorizontalFlip() + RandomGrayscale()
train_transforms =  Resize((299, 299)) + augmentation + common_transforms
test_transforms = Resize((299, 299)) + common_transforms

Note how this mimics the behavior of Python lists (and other iterables):

>>> [1, 2] + [3, 4]
[1, 2, 3, 4]

Implementation sketch

class Transform(object):
    # Right now all transforms directly inherit from object. 
    # Let Transform be a superclass for all transformations

    def __call__(self, pic):
        raise NotImplementedError('Each subclass should implement this method')

    def __repr__(self):
        raise NotImplementedError('Each subclass should implement this method')

    def __add__(self, other):
        if not isinstance(other, Transform):
            raise TypeError('Only transformations can be added')
        if isinstance(self, Compose) and isinstance(other, Compose):
            return Compose(self.transforms + other.transforms)
        if not isinstance(self, Compose) and isinstance(other, Compose):
            other.transforms = self + other.transforms
            return other
        if isinstance(self, Compose) and not isinstance(other, Compose):
            self.transforms = self.transforms + other
            return self
        if not isinstance(self, Compose) and not isinstance(other, Compose):
            return Compose([self, other])

Comments are most welcome!

enhancement needs discussion

Source

tomekkorbak

👍4

Most helpful comment

I wanted similar behaviour and ended up creating my own Compose class for this:

class Compose(list):
  def __call__(self, data):
    for t in self:
      data = t(data)
    return data

My Compose is basically a callable list... Not sure if it has any drawbacks _(it probably has)_, but this has the advantage of being familiar to most people, as it has all the methods lists have!
A few things you can do with it:

preprocess = Compose([...])

# Concatenate composes
preprocess += Compose([...])

# Concatenate Compose with a list
preprocess += [...]
preprocess.extend([...])
preprocess.append(...)

# Access/change individual elements
step1 = preprocess[0]
preprocess[1] = ...

As you can see, it doesn't allow you to add a function to a compose like @tomekkorbak wanted and also will not magically turn a sum of 2 transforms in a compose, but I do believe this is quite a clean implementation. All that is needed is to wrap your functions in a list and your done!

I really prefer my Compose, but would like to use the 'standard PyTorch way' as much as possible, so it would be cool to implement this into torchvision.

_EDIT_
You might want to also add a __coerce__(self, other) method to my Compose,
if you want to be able to perform operations like [...] + preprocess.
I don't really need that, so I didn't implement it as it would make everything slightly more complicated...

Example implementation _(untested)_:

def __coerce__(self, other):
  if isinstance(other, Iterable):
    return self, Compose(other)
  else:
    return None

0phoff on 12 Apr 2018

👍2

All 11 comments

While I think such a functionality might be somewhat useful (but a bit less efficient than an iterator on a list), I didn't quite get the drawbacks that you pointed in your message.
You could still have something like

common = T.Compose([T.ToTensor(),
                    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

augmentation = T.Compose([T.RandomCrop(299, padding=4),
                          T.RandomHorizontalFlip(),
                          T.RandomGrayscale()])

train_transforms =  T.Compose([T.Resize((299, 299)), augmentation, common_transforms])
test_transforms = T.Compose([T.Resize((299, 299)), common_transforms])

which is still fairly ok, and a bit more efficient than having a long concatenation of composes (that will happen due to the + always creating a new T.Compose).

But I'm not against such proposal. @alykhantejani what do you think?

fmassa on 24 Mar 2018

@fmassa, thanks for your feedback. I understand your reservations: this proposal does not extend existing functionality, but only provides a nicer syntax. As for advantages of the proposed API over your code snippet:

I find chaining slightly conceptually simpler than nesting
In more complex cases (i.e. validation over several similar-but-different transformation pipelines) it'd be more concise and readable.

As for efficiency, the code I provided was a first approximation; I guess creating new object could be avoided in most cases and we could end up with creating fewer objects compared to nesting Composes (we could prepend/append to a Compose.transforms whenever possible).

Nevertheless, I agree it is for the community to decide if this proposal adds any value.

tomekkorbak on 25 Mar 2018

I think this functionality would be welcome, especially as it's just adding syntactic sugar on top of existing functionality - as @fmassa showed, it can be achieved by nesting composes.

I think the __repr__ of this would be much nicer than a nested set of Composes.

As for speed, I think we should do some benchmarking before we try to optimize this as simple code is generally my preferred route.

alykhantejani on 26 Mar 2018

Sounds good. @tomekkorbak , if you are willing to a PR, it would be most welcome! :)

fmassa on 3 Apr 2018

From an outsider's perspective:

Learning and adopting a library becomes difficult if there are 3 ways of doing the same thing. There is already a Compose based API and a functional API. We can already use the functional API to compose transforms which achieves the same purpose as the + operator. Maybe we can add in the docs about how to compose functional calls to transform, instead of adding another operator?

karandwivedi42 on 5 Apr 2018

👍1

@karandwivedi42 the + operator is meant to be a syntactic sugar for the Compose API, and not to extend on the functional API.

But I agree that we should ideally have only one way of doing something.

fmassa on 6 Apr 2018

I wanted similar behaviour and ended up creating my own Compose class for this:

class Compose(list):
  def __call__(self, data):
    for t in self:
      data = t(data)
    return data

preprocess = Compose([...])

# Concatenate composes
preprocess += Compose([...])

# Concatenate Compose with a list
preprocess += [...]
preprocess.extend([...])
preprocess.append(...)

# Access/change individual elements
step1 = preprocess[0]
preprocess[1] = ...

I really prefer my Compose, but would like to use the 'standard PyTorch way' as much as possible, so it would be cool to implement this into torchvision.

Example implementation _(untested)_:

def __coerce__(self, other):
  if isinstance(other, Iterable):
    return self, Compose(other)
  else:
    return None

0phoff on 12 Apr 2018

👍2

Has this feature merged yet? Would be awesome if I can append transformations

iqDF on 25 Jul 2019

Has this feature merged yet? Would be awesome if I can append transformations

I created and use my own version of a Compose class in my library.
I wouldn't mind creating a PR for this, but nobody from the core team answered how they see this class evolving...

In the meantime, you could probably just copy my code.
Note that I dont have the coerce functionality I talked about, but that is only necessary if you want to preprend transformations: [...] + preprocess.

class Compose(list):
    """ This is lightnet's own version of :class:`torchvision.transforms.Compose`.

    Note:
        The reason we have our own version is because this one offers more freedom to the user.
        For all intends and purposes this class is just a list.
        This `Compose` version allows the user to access elements through index, append items, extend it with another list, etc.
        When calling instances of this class, it behaves just like :class:`torchvision.transforms.Compose`.

    Note:
        I proposed to change :class:`torchvision.transforms.Compose` to something similar to this version,
        which would render this class useless. In the meanwhile, we use our own version
        and you can track `the issue`_ to see if and when this comes to torchvision.

    .. _the issue: https://github.com/pytorch/vision/issues/456
    """
    def __call__(self, data):
        for tf in self:
            data = tf(data)
        return data

    def __repr__(self):
        format_string = self.__class__.__name__ + ' ['
        for tf in self:
            format_string += '\n  {tf}'
        format_string += '\n]'
        return format_string

0phoff on 25 Jul 2019

Hi @0phoff this LGTM. 👍 . I like it simple. Thanks man.

@fmassa Should we consider adding __add__(), __append(), or __extend__() to pytorch?

Imo, it suits with application of action recognition model or any model that has many preprocessing options for their compatible dataset. We have a lot of modalities which preprocessings and transforms can be basic but different for model-modality combination.

Of course it would help in general.

iqDF on 25 Jul 2019

ping =] seems useful to me!