TL;DR: transform = Resize((299, 299)) + ToTensor()
Consider the common use case of building transformation pipelines for the train and test sets that are mostly shared but differ at one point, for instance applying augmentation transforms only to the train set:
from torchvision import transforms
from torchvision.transforms import (Normalize, RandomCrop, RandomGrayscale,
                                    RandomHorizontalFlip, Resize, ToTensor)

train_transform = transforms.Compose([
    Resize((299, 299), interpolation=1),  # 1 == PIL.Image.LANCZOS
    RandomCrop(299, padding=4),
    RandomHorizontalFlip(),
    RandomGrayscale(),
    ToTensor(),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
test_transform = transforms.Compose([
    Resize((299, 299), interpolation=1),
    ToTensor(),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
A way of expressing that train_transform builds upon test_transform would be handy. What you could do now is to nest transforms.Compose recursively (i.e. Compose([Compose([...resizing...]), Compose([...augmentation...]), Normalize(...)])), but I don't find this particularly clean (and intuitively, a final transform object should be a sequential object with only one level of depth). The proposed syntax would be:
common_transforms = ToTensor() + Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
augmentation = RandomCrop(299, padding=4) + RandomHorizontalFlip() + RandomGrayscale()
train_transforms = Resize((299, 299)) + augmentation + common_transforms
test_transforms = Resize((299, 299)) + common_transforms
Note how this mimics the behavior of Python lists (and other iterables):
>>> [1, 2] + [3, 4]
[1, 2, 3, 4]
A possible implementation:

class Transform(object):
    # Right now all transforms directly inherit from object.
    # Let Transform be a superclass for all transformations.

    def __call__(self, pic):
        raise NotImplementedError('Each subclass should implement this method')

    def __repr__(self):
        raise NotImplementedError('Each subclass should implement this method')

    def __add__(self, other):
        if not isinstance(other, Transform):
            raise TypeError('Only transformations can be added')
        if isinstance(self, Compose) and isinstance(other, Compose):
            return Compose(self.transforms + other.transforms)
        if not isinstance(self, Compose) and isinstance(other, Compose):
            other.transforms = [self] + other.transforms  # prepend, keeping a flat list
            return other
        if isinstance(self, Compose) and not isinstance(other, Compose):
            self.transforms = self.transforms + [other]   # append, keeping a flat list
            return self
        return Compose([self, other])  # neither operand is a Compose
Comments are most welcome!
While I think such functionality might be somewhat useful (but a bit less efficient than an iterator over a list), I didn't quite get the drawbacks that you pointed out in your message.
You could still have something like
import torchvision.transforms as T

common = T.Compose([T.ToTensor(),
                    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])
augmentation = T.Compose([T.RandomCrop(299, padding=4),
                          T.RandomHorizontalFlip(),
                          T.RandomGrayscale()])
train_transforms = T.Compose([T.Resize((299, 299)), augmentation, common])
test_transforms = T.Compose([T.Resize((299, 299)), common])
which is still fairly OK, and a bit more efficient than having a long concatenation of Composes (which will happen because + always creates a new T.Compose).
But I'm not against such proposal. @alykhantejani what do you think?
@fmassa, thanks for your feedback. I understand your reservations: this proposal does not extend existing functionality, but only provides nicer syntax. As for the advantages of the proposed API over your code snippet and the efficiency concern: the code I provided was a first approximation; I guess creating new objects could be avoided in most cases, and we could end up creating fewer objects than with nested Composes (we could prepend/append to an existing Compose.transforms whenever possible).
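For illustration, a rough sketch of that optimization (hypothetical code, mutating existing Compose objects in place instead of allocating a new one on every +):

def __add__(self, other):
    # Hypothetical optimized variant of the earlier proposal:
    # reuse an existing Compose whenever one operand already is one.
    if not isinstance(other, Transform):
        raise TypeError('Only transformations can be added')
    if isinstance(self, Compose):
        if isinstance(other, Compose):
            self.transforms.extend(other.transforms)  # merge in place
        else:
            self.transforms.append(other)             # append in place
        return self
    if isinstance(other, Compose):
        other.transforms.insert(0, self)              # prepend in place
        return other
    return Compose([self, other])                     # the only remaining allocation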
Nevertheless, I agree it is for the community to decide if this proposal adds any value.
I think this functionality would be welcome, especially as it's just adding syntactic sugar on top of existing functionality - as @fmassa showed, it can be achieved by nesting composes.
I think the __repr__ of this would be much nicer than a nested set of Composes.
As for speed, I think we should do some benchmarking before we try to optimize this as simple code is generally my preferred route.
Sounds good. @tomekkorbak, if you are willing to send a PR, it would be most welcome! :)
From an outsider's perspective:
Learning and adopting a library becomes difficult if there are three ways of doing the same thing. There is already a Compose-based API and a functional API. We can already use the functional API to compose transforms, which achieves the same purpose as the + operator. Maybe we can add a section to the docs about how to compose functional calls to transforms, instead of adding another operator?
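A sketch of what such a docs example could look like (a plain function instead of a Compose object; parameter values reused from the snippets earlier in the thread):

import torchvision.transforms.functional as F

def test_transform(pic):
    # Function-based "composition": just call the functional ops in order.
    pic = F.resize(pic, (299, 299))
    tensor = F.to_tensor(pic)
    return F.normalize(tensor, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])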
@karandwivedi42 the + operator is meant to be a syntactic sugar for the Compose API, and not to extend on the functional API.
But I agree that we should ideally have only one way of doing something.
I wanted similar behaviour and ended up creating my own Compose class for this:
class Compose(list):
    def __call__(self, data):
        for t in self:
            data = t(data)
        return data
My Compose is basically a callable list... Not sure if it has any drawbacks _(it probably has)_, but this has the advantage of being familiar to most people, as it has all the methods lists have!
A few things you can do with it:
preprocess = Compose([...])
# Concatenate composes
preprocess += Compose([...])
# Concatenate Compose with a list
preprocess += [...]
preprocess.extend([...])
preprocess.append(...)
# Access/change individual elements
step1 = preprocess[0]
preprocess[1] = ...
As you can see, it doesn't allow you to add a function to a compose like @tomekkorbak wanted, and it also will not magically turn a sum of two transforms into a Compose, but I do believe this is quite a clean implementation. All that is needed is to wrap your functions in a list and you're done!
I really prefer my Compose, but would like to use the 'standard PyTorch way' as much as possible, so it would be cool to implement this into torchvision.
_EDIT_
You might want to also add a __radd__(self, other) method to my Compose
if you want to be able to perform operations like [...] + preprocess
(the old __coerce__ protocol is Python 2 only and is not invoked for list concatenation anyway).
I don't really need that, so I didn't implement it, as it would make everything slightly more complicated...
Example implementation _(untested)_:

from collections.abc import Iterable

def __radd__(self, other):
    # Handles `other + self`, e.g. a plain list on the left-hand side.
    if isinstance(other, Iterable):
        return Compose(list(other) + list(self))
    return NotImplemented
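A quick sanity check of what that enables, using plain string functions as stand-in transforms:

preprocess = Compose([str.strip])
combined = [str.lower] + preprocess   # dispatches to Compose.__radd__
assert isinstance(combined, Compose)
assert combined('  HELLO  ') == 'hello'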
Has this feature been merged yet? Would be awesome if I could append transformations.
I created and use my own version of a Compose class in my library.
I wouldn't mind creating a PR for this, but nobody from the core team answered how they see this class evolving...
In the meantime, you could probably just copy my code.
Note that I don't have the __radd__ functionality I talked about, but that is only necessary if you want to prepend transformations: [...] + preprocess.
class Compose(list):
    """ This is lightnet's own version of :class:`torchvision.transforms.Compose`.

    Note:
        The reason we have our own version is because this one offers more freedom to the user.
        For all intents and purposes this class is just a list.
        This `Compose` version allows the user to access elements through their index,
        append items, extend it with another list, etc.
        When calling instances of this class, it behaves just like :class:`torchvision.transforms.Compose`.

    Note:
        I proposed to change :class:`torchvision.transforms.Compose` to something similar to this version,
        which would render this class useless. In the meantime, we use our own version
        and you can track `the issue`_ to see if and when this comes to torchvision.

        .. _the issue: https://github.com/pytorch/vision/issues/456
    """
    def __call__(self, data):
        for tf in self:
            data = tf(data)
        return data

    def __repr__(self):
        format_string = self.__class__.__name__ + ' ['
        for tf in self:
            format_string += '\n  {tf}'.format(tf=tf)
        format_string += '\n]'
        return format_string
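As a quick illustration of the list behaviour (the transform functions here are made-up placeholders):

def to_float(x):
    return float(x)

def double(x):
    return 2 * x

pipeline = Compose([to_float])
pipeline.append(double)   # list methods work out of the box
pipeline += [double]
print(pipeline('3'))      # 12.0
print(pipeline)           # Compose [ ... ], one transform per line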
Hi @0phoff, this LGTM 👍. I like it simple. Thanks man.
@fmassa Should we consider adding __add__(), append(), or extend() to torchvision's transforms?
IMO, it suits applications like action recognition models, or any model that has many preprocessing options for its compatible datasets. We have a lot of modalities, whose preprocessing and transforms can be basic but differ per model-modality combination.
Of course it would help in general.
ping =] seems useful to me!