Vision: Add RandomGaussianBlur

Created on 1 Sep 2020 · 11 Comments · Source: pytorch/vision

🚀 Feature

The idea is to add a random Gaussian blur image transform, like the one used in SwAV.

cc @vfdev-5

Most helpful comment

Glad to see someone is working on this! It would be better for me if the function were as controllable as OpenCV's.
I'm currently doing some research work on image self-supervised learning (SSL), where people usually perform Gaussian blur with a fixed kernel_size and a random sigma, as in SimCLR and BYOL.
So it would be more helpful if we could control both the kernel_size and sigma, instead of the radius only. Here is a typical usage of Gaussian blur in SSL:

import cv2
import numpy as np
from PIL import Image
from torchvision import transforms

class GaussianBlur():
    def __init__(self, kernel_size, sigma_min=0.1, sigma_max=2.0):
        self.sigma_min = sigma_min
        self.sigma_max = sigma_max
        self.kernel_size = kernel_size

    def __call__(self, img):
        # Draw a new sigma on every call; the kernel size stays fixed.
        sigma = np.random.uniform(self.sigma_min, self.sigma_max)
        img = cv2.GaussianBlur(np.array(img), (self.kernel_size, self.kernel_size), sigma)
        return Image.fromarray(img.astype(np.uint8))

# "..." stands for the other transforms in the pipeline.
transform = transforms.Compose([
    ...,
    transforms.RandomApply([GaussianBlur(kernel_size=23)], p=0.1),
    ...,
])

In summary, I hope we can provide:

a function F.gaussian_blur that takes kernel_size and sigma as inputs, just like cv2.GaussianBlur
a class transforms.GaussianBlur that accepts a float or a (min, max) tuple of floats as input, just like albumentations and transforms.ColorJitter

yaox12 on 3 Sep 2020
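
To make the "float or (min, max) tuple" request concrete, here is a minimal standard-library sketch. The names sample_sigma and gaussian_kernel_1d are hypothetical helpers invented for this illustration, not torchvision functions, and the validation rules are assumptions.

```python
import math
import random


def sample_sigma(sigma, rng=random):
    """Return sigma unchanged for a number, or a uniform draw for a (min, max) pair.

    Hypothetical helper: mirrors the "float or tuple of float" convention
    requested above, not an existing torchvision function.
    """
    if isinstance(sigma, (int, float)):
        if sigma <= 0:
            raise ValueError("sigma must be positive")
        return float(sigma)
    lo, hi = sigma
    if not 0 < lo <= hi:
        raise ValueError("sigma range must satisfy 0 < min <= max")
    return rng.uniform(lo, hi)


def gaussian_kernel_1d(kernel_size, sigma):
    """Return normalized 1-D Gaussian weights centred on the middle tap."""
    if kernel_size <= 0 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Fixed kernel size, random sigma, as in the SSL usage described above.
sigma = sample_sigma((0.1, 2.0))
kernel = gaussian_kernel_1d(23, sigma)
```

Keeping the kernel size fixed and resampling only sigma means the kernel has to be rebuilt on every call, but its shape never changes.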

👍3

All 11 comments

Hi @vfdev-5, first-time contributor here. Can I take this up, if it is okay with you?

I see that kornia already has this transform (link). Is it recommended to design this transform by using kornia's utility or better to have those utilities natively in torchvision?

tejank10 on 1 Sep 2020

Hi @tejank10 thanks for asking and proposing to work on this issue.

Is it recommended to design this transform by using kornia's utility or better to have those utilities natively in torchvision?

In general it is better to avoid adding new dependencies (like kornia) for a single feature. This feature should be implemented for PIL and torch.Tensor inputs.

first-time contributor here. Can I take this up, if it is okay with you?

Technically, this feature can be a bit complex for a first-time contributor, but I can help/guide and iterate directly on the PR you send, etc.

There will be two implementations; see F.rgb_to_grayscale as an example.
For torch.Tensor input we have to use PyTorch's F.conv2d with a Gaussian kernel, and for PIL we can use its GaussianBlur.

What do you think ?

vfdev-5 on 1 Sep 2020
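
A rough sketch of what the tensor path could look like with F.conv2d is given below. The function names, the (C, H, W) float input layout, the explicit kernel_size/sigma parameters, and the reflect padding are all assumptions made for illustration; this is not the implementation that was eventually merged.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel_2d(kernel_size: int, sigma: float) -> torch.Tensor:
    """Build a normalized (kernel_size, kernel_size) Gaussian kernel."""
    half = (kernel_size - 1) / 2
    x = torch.linspace(-half, half, steps=kernel_size)
    k1d = torch.exp(-0.5 * (x / sigma) ** 2)
    k1d = k1d / k1d.sum()
    # The 2-D Gaussian is separable: the outer product of two 1-D kernels.
    return k1d[:, None] * k1d[None, :]


def gaussian_blur_tensor(img: torch.Tensor, kernel_size: int, sigma: float) -> torch.Tensor:
    """Blur a float tensor of shape (C, H, W) with a depthwise convolution.

    Assumption: reflect padding keeps the output the same size as the input.
    """
    if kernel_size <= 0 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    channels = img.shape[0]
    kernel = gaussian_kernel_2d(kernel_size, sigma).to(img.dtype)
    # One copy of the kernel per channel; groups=channels blurs each channel separately.
    weight = kernel.expand(channels, 1, kernel_size, kernel_size)
    pad = kernel_size // 2
    padded = F.pad(img.unsqueeze(0), [pad, pad, pad, pad], mode="reflect")
    return F.conv2d(padded, weight, groups=channels).squeeze(0)


blurred = gaussian_blur_tensor(torch.rand(3, 32, 32), kernel_size=5, sigma=1.0)
```

Using groups equal to the channel count avoids mixing colour channels, which is what a per-channel image blur needs.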

Thanks for the direction @vfdev-5.

I was comparing PIL's GaussianBlur functionality with that of other libraries. This will help in designing the transform for torch.Tensor and in mitigating possible inconsistencies between PIL and tensor outputs.

PIL's GaussianBlur applies a box filter three times to approximate a Gaussian blur. It takes sigma as input. Compared with it, OpenCV's GaussianBlur is more controllable: we can control the kernel size and the padding mode there as well. It also allows sigma to be left unspecified, in which case it falls back to a value computed from the kernel size.
Even in terms of the output, there are differences if we compare PIL's GaussianBlur with that of OpenCV.

While designing the blur for torch.Tensor, should we go about it the OpenCV way, or try to match PIL, since we plan to use PIL's version when a PIL Image is the input?

tejank10 on 2 Sep 2020
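
The two behaviours contrasted above can be illustrated with a small standard-library sketch. The sigma fallback follows the formula documented for OpenCV's getGaussianKernel; the triple box filter uses integer box widths for simplicity, which is a simplification and not PIL's exact algorithm. All function names here are hypothetical.

```python
def opencv_default_sigma(kernel_size: int) -> float:
    """Sigma used by OpenCV when none is given, per the getGaussianKernel docs."""
    return 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8


def convolve(a, b):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def triple_box_kernel(width: int):
    """Effective kernel of applying a width-`width` box filter three times."""
    box = [1.0 / width] * width
    return convolve(convolve(box, box), box)


def variance(kernel):
    """Variance of a normalized kernel, treating tap indices as positions."""
    mean = sum(i * w for i, w in enumerate(kernel))
    return sum(((i - mean) ** 2) * w for i, w in enumerate(kernel))


kernel = triple_box_kernel(5)
# A single box of width w has variance (w**2 - 1) / 12; three passes add up
# to (w**2 - 1) / 4, so the result behaves like a Gaussian with that variance.
sigma_equivalent = variance(kernel) ** 0.5
sigma_fallback = opencv_default_sigma(23)
```

The repeated box filter is bell-shaped but not an exact Gaussian, which is one plausible source of the output differences mentioned above.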

@tejank10 for the first version I think PIL's GaussianBlur with a single parameter, radius, is OK. We cannot use OpenCV as a dependency either.

While designing the blur for torch.Tensor, should we go about it the OpenCV way, or try to match PIL, since we plan to use PIL's version when a PIL Image is the input?

Please take a look at how F.rgb_to_grayscale is implemented. The idea is to be able to dispatch according to the input type: if the input is a PIL image => F_pil.gaussian_blur, if the input is a torch tensor => F_t.gaussian_blur. F_pil.gaussian_blur should perform PIL's GaussianBlur, and F_t.gaussian_blur should work directly on the tensor without using any other library: a) create a Gaussian kernel tensor as it is done in the PIL code and apply PyTorch's F.conv2d...

vfdev-5 on 2 Sep 2020
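
The dispatch idea described above might be sketched as follows. The function names are hypothetical and only loosely modelled on the F_pil / F_t split; the tensor backend is deliberately left as a stub, since building the kernel and calling F.conv2d is a separate step.

```python
import torch
from PIL import Image, ImageFilter


def _gaussian_blur_pil(img: Image.Image, radius: float) -> Image.Image:
    """PIL backend: delegate to Pillow's built-in GaussianBlur filter."""
    return img.filter(ImageFilter.GaussianBlur(radius=radius))


def _gaussian_blur_tensor(img: torch.Tensor, radius: float) -> torch.Tensor:
    """Tensor backend stub (assumption: kernel creation + F.conv2d would go here)."""
    raise NotImplementedError("build a Gaussian kernel and apply F.conv2d")


def gaussian_blur(img, radius: float):
    """Pick the backend from the input type, one entry point for both."""
    if isinstance(img, torch.Tensor):
        return _gaussian_blur_tensor(img, radius)
    if isinstance(img, Image.Image):
        return _gaussian_blur_pil(img, radius)
    raise TypeError(f"img should be a PIL Image or a torch.Tensor, got {type(img)}")


blurred = gaussian_blur(Image.new("RGB", (16, 16), (10, 20, 30)), radius=2.0)
```

Keeping the type check in a single public function lets callers use the same transform for both input types, while each backend stays free of the other library's details.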

@vfdev-5 Yeah I understand that using any other library is not the way. :)
I was trying to contrast how Gaussian blur actually works in PIL versus OpenCV, and wanted to check whether we want our Gaussian kernel creation function to be as customizable as OpenCV's, or whether we are okay with that tradeoff and let the function behave the same as PIL's.

tejank10 on 2 Sep 2020

@tejank10 yes, we are OK with that tradeoff; let the function behave the same as PIL's.

vfdev-5 on 2 Sep 2020

👍1

@yaox12 thanks for the suggestion, let's see what we could do.

vfdev-5 on 3 Sep 2020

This is how it currently looks. It randomly chooses a kernel radius from the given range, creates the kernel accordingly, and applies the blurring function.

The following is the error image between the PIL and Tensor versions when I tried it on the hopper image:
[Image: erimg]