Vision: [Feature_Request] ZCA/PCA whitening

Created on 6 Sep 2017 · 4 comments · Source: pytorch/vision

I noticed that the transforms are missing whitening preprocessing, which is very important for CNNs.
Since I need it for one of my projects, I'm willing to implement it, but I ran into some issues and I'd like to hear what you have to say about them.

We need to apply the same whitening matrix and mean computed on the train set to the test set.

  • When doing minibatches, do we compute the mean and whitening matrix on a single batch of the train dataset and then reuse them for every other batch, or do we compute different means and matrices for each minibatch? In the latter case, which ones do we use on the test set?

  • We need to save the matrix somewhere to reuse it on the test set. My solution would be to do something like:

import torch


class ZCA(object):

    def __init__(self, mean=None, P=None):
        self.mean = mean  # per-feature mean of the train set
        self.P = P        # [D x D] whitening matrix of the train set

    def __call__(self, tensor):
        # assumes `tensor` is a minibatch; flatten it to [N x D]
        X = tensor.view(tensor.size(0), -1)
        if self.mean is None:
            # compute mean
            self.mean = X.mean(dim=0)
        X = X - self.mean
        if self.P is None:
            # compute P from the eigenbasis of the covariance matrix
            U, S, _ = torch.svd(torch.mm(X.t(), X) / X.size(0))
            self.P = torch.mm(U * (S + 1e-5).rsqrt(), U.t())
        # compute whitened samples
        return torch.mm(X, self.P).view_as(tensor)

This works in the case where we just compute mean and P on the first minibatch. Also, the train-set P and mean would be accessible for reuse on the test set through the parameters, provided that the training set has already been preprocessed (see the usage sketch after this list).

  • Finally: what if the last minibatch has a different size? Also, the last minibatches of the train and test sets could have different sizes...
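A hypothetical usage sketch of the class above (the batch variable names are purely illustrative):

# fit lazily on the first training minibatch: this fills in zca.mean and zca.P
zca = ZCA()
white_batch = zca(first_train_batch)

# reuse the fitted parameters to whiten the test set the same way
zca_test = ZCA(mean=zca.mean, P=zca.P)
white_test_batch = zca_test(test_batch)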
Labels: enhancement, help wanted

All 4 comments

Hi @iacolippo

When doing PCA/ZCA whitening, your training set will have shape X = [N x D], where N is the number of samples and D is the flattened image size (channels * rows * cols).

You can then compute the data covariance matrix with np.dot(X.T, X), which will be of shape [D x D]. You can then perform SVD on this matrix and use the eigenbasis to whiten your input; the whitening matrix will also be of shape [D x D] (i.e. it is not constrained to be the same size as your batch / training set).

Also: don't forget to zero-center your data first.

For more reading you can check out the cs231n course section on whitening.

Note: the torchvision transforms operate on one image at a time (not on a batch), so the transform can whiten a single image using this whitening matrix and a mean vector, and the data loader will apply it to batches automatically.
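A minimal NumPy sketch of this recipe (the epsilon term and the division by N are small additions on my part, and the array names are illustrative):

import numpy as np

# X: training set flattened to [N x D] (random stand-in data here)
N, D = 1000, 3 * 32 * 32
X = np.random.rand(N, D).astype(np.float32)

mean = X.mean(axis=0)            # per-feature mean, reused at test time
Xc = X - mean                    # zero-center the data first

cov = np.dot(Xc.T, Xc) / N       # [D x D] data covariance matrix
U, S, _ = np.linalg.svd(cov)     # eigenbasis of the covariance matrix

eps = 1e-5                       # avoids dividing by near-zero eigenvalues
P = U.dot(np.diag(1.0 / np.sqrt(S + eps))).dot(U.T)   # [D x D] ZCA whitening matrix

# whiten a single flattened image with the train-set mean and P
image = X[0]
image_white = np.dot(image - mean, P)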

If you end up implementing this, please send a PR :)

To complement @alykhantejani's comment, I also think it might be better to first compute your parameters offline on your dataset,
and then use those parameters to initialize your transform, as is done for example with the Normalize transform: in the ImageNet example the mean/std were computed beforehand and then used as constants.
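A rough sketch of that pattern, assuming the ZCA(mean, P) transform proposed above and a hypothetical compute_zca_params helper for the offline step (neither is an existing torchvision API):

import torchvision.transforms as transforms

# offline step: compute mean and whitening matrix once on the training set,
# e.g. with the NumPy recipe above (compute_zca_params is a hypothetical helper)
mean, P = compute_zca_params(train_set)

# the parameters are then passed in as constants, just like Normalize(mean, std)
transform = transforms.Compose([
    transforms.ToTensor(),
    ZCA(mean=mean, P=P),
])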

Thank you for your comments, I can work on this in the coming weeks.

@fmassa so you suggest building a Whiten transform where the covariance matrix is a parameter of the function, to be computed offline?

I can implement a helper function to do the offline computation.

Closing this issue since the related PR has been merged.
