Vision: [Question] Isn't setting align_corners=True a more popular choice in segmentation models?

Created on 30 Dec 2019 · 7 Comments · Source: pytorch/vision

In the DeepLabV3 code, wherever an upsampling layer is used, the align_corners argument is set to False:
https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/segmentation/_utils.py#L25
https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/segmentation/_utils.py#L31
https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/segmentation/deeplabv3.py#L62

But researchers in the semantic segmentation field usually set it to True, based on a common understanding that this brings a performance improvement (however small).

Labels: awaiting response, models, needs discussion, semantic segmentation

All 7 comments

Hi,

Using align_corners=False actually makes the interpolation results match those of OpenCV and TensorFlow, so I would consider align_corners=False to be the expected setting.
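To make the difference between the two conventions concrete, here is a minimal pure-Python sketch of 1-D linear resizing. It assumes the coordinate mappings described in the nn.Upsample documentation: corner-aligned sampling for align_corners=True and half-pixel centers (clamped at zero) for align_corners=False. The helper names `source_coord` and `linear_resize_1d` are hypothetical and are not part of PyTorch.

```python
def source_coord(dst, in_size, out_size, align_corners):
    """Map an output index to a fractional input coordinate."""
    if align_corners:
        # Corner pixels of input and output coincide exactly.
        if out_size == 1:
            return 0.0
        return dst * (in_size - 1) / (out_size - 1)
    # Half-pixel centers: pixels are treated as unit-width cells.
    # Negative coordinates are clamped to 0 (assumed behavior).
    return max((dst + 0.5) * in_size / out_size - 0.5, 0.0)


def linear_resize_1d(values, out_size, align_corners):
    """Linearly interpolate a 1-D list of floats to out_size samples."""
    in_size = len(values)
    out = []
    for dst in range(out_size):
        src = source_coord(dst, in_size, out_size, align_corners)
        lo = min(int(src), in_size - 1)   # left neighbor
        hi = min(lo + 1, in_size - 1)     # right neighbor, clamped at the edge
        frac = src - lo                   # weight of the right neighbor
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


row = [0.0, 10.0]
aligned = linear_resize_1d(row, 4, align_corners=True)
half_pixel = linear_resize_1d(row, 4, align_corners=False)
print("align_corners=True :", aligned)
print("align_corners=False:", half_pixel)
```

With a two-sample input upsampled to four samples, the corner-aligned variant spreads the values evenly between the endpoints, while the half-pixel variant samples at offsets of a quarter pixel. The two outputs differ only in the interior samples here, which is consistent with the small effect discussed in this thread.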

Do you have any references of researchers setting it to True elsewhere?

I wonder if this is done on purpose or is just an artifact of the warning message, which reads:

Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.

We would need to run several training runs with both align_corners=False and align_corners=True to see whether there is any significant performance difference before we consider making any changes.

I agree that more training runs are necessary.
As I don't have many resources myself, I only trained 3 times on each setup, with mixed precision training and without COCO pretraining. The average mIoU for align_corners=False and align_corners=True is 76.45% and 76.69% respectively, but this could still be a coincidence since the training process is stochastic.
Looking forward to your training results.
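For readers less familiar with the metrics quoted in this thread, the following is a minimal sketch of how global pixel accuracy and mean IoU are commonly derived from a confusion matrix. The function name `segmentation_metrics` is hypothetical, the matrix is made-up toy data, and skipping classes that never occur is an assumption of this sketch rather than a statement about how any particular evaluation script behaves.

```python
def segmentation_metrics(confusion):
    """Return (global accuracy, mean IoU) for a square confusion matrix.

    confusion[t][p] counts pixels whose true class is t and predicted class is p.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    ious = []
    for i in range(n):
        true_i = sum(confusion[i])                      # pixels labeled i
        pred_i = sum(confusion[r][i] for r in range(n)) # pixels predicted i
        union = true_i + pred_i - confusion[i][i]
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(confusion[i][i] / union)

    return correct / total, sum(ious) / len(ious)


# Toy two-class example (illustrative numbers only).
toy = [[3, 1],
       [1, 5]]
global_acc, mean_iou = segmentation_metrics(toy)
print(f"global accuracy = {global_acc:.3f}, mean IoU = {mean_iou:.3f}")
```

Because mean IoU averages over classes rather than pixels, it is usually lower than global accuracy and more sensitive to small classes.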

Thanks for the numbers you reported! A 0.2% mIoU difference is very probably within noise, so I wouldn't expect it to make much of a difference. But as you said, I should still run a few training runs just to make sure.

Hi @voldemortX

@fmassa and I ran experiments on this.
We trained 3 models each for align_corners=True and align_corners=False, and we did not see a difference.
We will keep align_corners=False, and I am closing the issue.

Result

align_corners | True | True | True | | False | False | False
-- | -- | -- | -- | -- | -- | -- | --
Global Correct | 92.5 | 92.5 | 92.8 | | 92.5 | 92.6 | 92.4
Mean IoU | 67.6 | 67.3 | 67.8 | | 67.6 | 67.8 | 67.2
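
As a rough check of the "no difference" reading, the sketch below compares the gap between the two mean IoU averages with the run-to-run spread, using only the numbers from the table above. With three runs per setting this is an informal sanity check, not a significance test; the variable names are illustrative.

```python
from statistics import mean, stdev

# Mean IoU values copied from the results table.
miou_true = [67.6, 67.3, 67.8]    # align_corners=True
miou_false = [67.6, 67.8, 67.2]   # align_corners=False

gap = mean(miou_true) - mean(miou_false)
spread_true = stdev(miou_true)    # sample standard deviation
spread_false = stdev(miou_false)

print(f"mean (True)  = {mean(miou_true):.2f}, stdev = {spread_true:.2f}")
print(f"mean (False) = {mean(miou_false):.2f}, stdev = {spread_false:.2f}")
print(f"gap between means = {gap:.2f}")
```

The gap between the averages is several times smaller than the spread within either group, which supports treating the two settings as indistinguishable at this sample size.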

Setup

  • PyTorch 1.4.0 (via conda with cudatoolkit=10.1)
  • Torchvision v0.5.0 (Revision https://github.com/pytorch/vision/commit/85b8fbfd31e9324e64e24ca25410284ef238bcb3)
  • Model: Deeplab v3 ResNet 101
  • 30 Epochs

Patch for align_corners=True

diff --git a/torchvision/models/segmentation/_utils.py b/torchvision/models/segmentation/_utils.py
index c5a7ae9..705ecee 100644
--- a/torchvision/models/segmentation/_utils.py
+++ b/torchvision/models/segmentation/_utils.py
@@ -22,13 +22,13 @@ class _SimpleSegmentationModel(nn.Module):
         result = OrderedDict()
         x = features["out"]
         x = self.classifier(x)
-        x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
+        x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=True)
         result["out"] = x

         if self.aux_classifier is not None:
             x = features["aux"]
             x = self.aux_classifier(x)
-            x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
+            x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=True)
             result["aux"] = x

         return result
diff --git a/torchvision/models/segmentation/deeplabv3.py b/torchvision/models/segmentation/deeplabv3.py
index ae652cd..9a646d9 100644
--- a/torchvision/models/segmentation/deeplabv3.py
+++ b/torchvision/models/segmentation/deeplabv3.py
@@ -59,7 +59,7 @@ class ASPPPooling(nn.Sequential):
         size = x.shape[-2:]
         for mod in self:
             x = mod(x)
-        return F.interpolate(x, size=size, mode='bilinear', align_corners=False)
+        return F.interpolate(x, size=size, mode='bilinear', align_corners=True)


 class ASPP(nn.Module):

Command

python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py  --lr 0.02 --dataset coco -b 4 --model deeplabv3_resnet101 --aux-loss

Thanks for the investigation @mthrok !
