Vision: Why does the rpn use the L1_Loss?

Created on 1 Dec 2019 · 2 comments · Source: pytorch/vision

https://github.com/pytorch/vision/blob/master/torchvision/models/detection/rpn.py#L426

The code in rpn.py at line 426 is as follows:

box_loss = F.l1_loss(
    pred_bbox_deltas[sampled_pos_inds],
    regression_targets[sampled_pos_inds],
    reduction="sum",
) / (sampled_inds.numel())

However, as stated in the Faster R-CNN paper, the loss function used in the RPN training stage is smooth L1 loss.

I also found that when computing the R-CNN box loss, torchvision does use smooth_l1_loss:
https://github.com/pytorch/vision/blob/master/torchvision/models/detection/roi_heads.py#L47

Why not use smooth_l1_loss in both places?


All 2 comments

Hi,

Great question. It turns out that using smooth_l1_loss actually gave slightly worse results on COCO for detection and instance segmentation, and it has also been replaced by l1_loss in detectron2; see https://github.com/facebookresearch/detectron2/blob/61c15d20f804c0393d412fb3467649ea8e50fb57/detectron2/config/defaults.py#L212 and https://github.com/facebookresearch/detectron2/blob/61c15d20f804c0393d412fb3467649ea8e50fb57/detectron2/config/defaults.py#L280.

For that reason (and also for simplicity, since PyTorch's smooth_l1_loss doesn't have a beta argument and Mask R-CNN already has a ton of arguments), we decided to simply use l1_loss in torchvision.
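To make the difference concrete, here is a minimal pure-Python sketch of the two losses with reduction="sum" semantics (the beta parameterization mirrors the one detectron2 exposes; the function names here are illustrative, not torchvision's API):

```python
def l1_loss(pred, target):
    # Sum of absolute differences (reduction="sum").
    return sum(abs(p - t) for p, t in zip(pred, target))

def smooth_l1_loss(pred, target, beta=1.0):
    # Quadratic for |diff| < beta, linear beyond; beta -> 0 recovers plain L1.
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total

pred, target = [0.0, 0.5, 2.0], [0.0, 0.0, 0.0]
print(l1_loss(pred, target))               # 2.5
print(smooth_l1_loss(pred, target))        # 0.0 + 0.125 + 1.5 = 1.625
print(smooth_l1_loss(pred, target, 1e-8))  # ~2.5, approaches L1 as beta -> 0
```

The beta knob controls where the quadratic regime hands off to the linear one; dropping it in favor of plain L1 removes one hyperparameter from an already large configuration surface.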

Let me know what you think.

I'm closing the issue, but let me know if you have further questions.

Thank you for your reply!
I also found that smooth_l1_loss can give slightly worse results, and I just wanted to confirm my understanding.
Thank you very much!
