In the DeepLabV3 code, whenever an upsampling layer is used, the align_corners argument is set to False:
https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/segmentation/_utils.py#L25
https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/segmentation/_utils.py#L31
https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/segmentation/deeplabv3.py#L62
But researchers in the semantic segmentation field usually set this to True, based on a common understanding that it brings a performance improvement (however small).
Hi,
Using align_corners=False actually makes the interpolation results match those of OpenCV and TensorFlow, so I would consider align_corners=False the expected setting.
Do you have any references of researchers setting it to True elsewhere?
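For readers unfamiliar with the flag, here is a minimal pure-Python sketch of how the two settings map an output index back to a source coordinate in 1D linear interpolation. It follows the coordinate mapping as I understand it from the PyTorch documentation; the helper names `source_coord` and `linear_resize` are made up for illustration and are not part of any library.

```python
def source_coord(dst_index, in_size, out_size, align_corners):
    """Map an output index to a (fractional) input coordinate."""
    if align_corners:
        # Corner pixels of input and output are aligned exactly.
        if out_size == 1:
            return 0.0
        return dst_index * (in_size - 1) / (out_size - 1)
    # Half-pixel convention: pixel centers are aligned; negative
    # coordinates are clamped to 0 for linear interpolation.
    return max((dst_index + 0.5) * in_size / out_size - 0.5, 0.0)


def linear_resize(values, out_size, align_corners):
    """Resize a 1D list of floats with linear interpolation."""
    in_size = len(values)
    out = []
    for i in range(out_size):
        s = source_coord(i, in_size, out_size, align_corners)
        lo = min(int(s), in_size - 1)   # left neighbour
        hi = min(lo + 1, in_size - 1)   # right neighbour, clamped at the border
        w = s - lo                      # weight of the right neighbour
        out.append(values[lo] * (1 - w) + values[hi] * w)
    return out


signal = [0.0, 1.0]
print(linear_resize(signal, 4, align_corners=True))   # roughly [0, 1/3, 2/3, 1]
print(linear_resize(signal, 4, align_corners=False))  # [0.0, 0.25, 0.75, 1.0]
```

With align_corners=True the samples are spread evenly between the two corner values; with align_corners=False the outputs sit at half-pixel offsets, which is where the two settings differ most near the borders.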
Well, it's more like an insider trick that is used by lots of people to get better performance at corners, and it is becoming a norm.
e.g.
https://github.com/jfzhang95/pytorch-deeplab-xception/blob/9135e104a7a51ea9effa9c6676a2fcffe6a6a2e6/modeling/deeplab.py#L31
https://github.com/XiaLiPKU/EMANet/blob/b7d7ce0a46d4f17964a5551150742efd6ab6585b/dataset.py#L35
I wonder if this is done on purpose or is just an artifact of the warning message, which reads:
Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
We would need to run some training runs with both align_corners=False and align_corners=True to see whether there is any significant performance difference before we consider making any changes.
I agree that more training runs are necessary.
As I really don't have many resources myself, I only trained 3 times on each setup, with mixed precision training and without COCO pretraining. The average mIoU for align_corners=False and align_corners=True is 76.45% and 76.69% respectively, but this could still be just a coincidence, since the training process is stochastic.
Looking forward to your training results.
Thanks for the numbers you reported! A 0.2% mIoU difference is very probably just within noise, so I wouldn't expect it to really make that much of a difference. But as you said, I should still run a few trainings just to make sure.
Hi @voldemortX
@fmassa and I conducted the experiments regarding this.
We trained 3 models each for align_corners=True and align_corners=False, and we did not see a difference.
We will keep align_corners=False, and I am closing the issue.
Result:
align_corners | True | True | True | | False | False | False
-- | -- | -- | -- | -- | -- | -- | --
Global Correct | 92.5 | 92.5 | 92.8 | | 92.5 | 92.6 | 92.4
Mean IoU | 67.6 | 67.3 | 67.8 | | 67.6 | 67.8 | 67.2
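As a rough sanity check on the "no difference" reading, the Mean IoU row of the table can be summarized with the standard library. This is only a sketch: three runs per setting is a very small sample, so the comparison of the gap against the run-to-run spread is indicative, not a significance test.

```python
from statistics import mean, stdev

# Mean IoU values copied from the result table (three runs per setting).
miou = {
    True: [67.6, 67.3, 67.8],
    False: [67.6, 67.8, 67.2],
}


def summarize(runs):
    """Return (mean, sample standard deviation) of a list of run scores."""
    return mean(runs), stdev(runs)


mean_true, std_true = summarize(miou[True])
mean_false, std_false = summarize(miou[False])
gap = abs(mean_true - mean_false)

print(f"align_corners=True : mean={mean_true:.2f}, std={std_true:.2f}")
print(f"align_corners=False: mean={mean_false:.2f}, std={std_false:.2f}")
print(f"gap between means  : {gap:.2f}")
```

The gap between the two means is much smaller than the spread across runs of either setting, which is consistent with the difference being within noise.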
Patch for align_corners=True:
diff --git a/torchvision/models/segmentation/_utils.py b/torchvision/models/segmentation/_utils.py
index c5a7ae9..705ecee 100644
--- a/torchvision/models/segmentation/_utils.py
+++ b/torchvision/models/segmentation/_utils.py
@@ -22,13 +22,13 @@ class _SimpleSegmentationModel(nn.Module):
         result = OrderedDict()
         x = features["out"]
         x = self.classifier(x)
-        x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
+        x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=True)
         result["out"] = x
 
         if self.aux_classifier is not None:
             x = features["aux"]
             x = self.aux_classifier(x)
-            x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
+            x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=True)
             result["aux"] = x
 
         return result
diff --git a/torchvision/models/segmentation/deeplabv3.py b/torchvision/models/segmentation/deeplabv3.py
index ae652cd..9a646d9 100644
--- a/torchvision/models/segmentation/deeplabv3.py
+++ b/torchvision/models/segmentation/deeplabv3.py
@@ -59,7 +59,7 @@ class ASPPPooling(nn.Sequential):
         size = x.shape[-2:]
         for mod in self:
             x = mod(x)
-        return F.interpolate(x, size=size, mode='bilinear', align_corners=False)
+        return F.interpolate(x, size=size, mode='bilinear', align_corners=True)
 
 
 class ASPP(nn.Module):
Command:
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --lr 0.02 --dataset coco -b 4 --model deeplabv3_resnet101 --aux-loss
Thanks for the investigation @mthrok !