Mmdetection: How to use Adam instead of SGD?

Created on 9 Dec 2019 · 5 comments · Source: open-mmlab/mmdetection

All the optimizers are defined like this:
optimizer = dict(type='SGD', lr=2e-3, momentum=0.9, weight_decay=5e-4)
But I want to change it to Adam. How should I do that?
Can anyone give me an example? Thanks!

All 5 comments

Hi @GYee,
You can set type='Adam' and then configure the other parameters as usual.
The optimizer is constructed in this line, and you can see that any subclass of torch.optim.Optimizer, along with its parameters, can be used.

BTW, we have not tried Adam before, so it might cause some bugs when you try it.
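For example, a minimal sketch of the change in a config file (the values below are illustrative defaults, not tuned settings):

```python
# Hypothetical mmdetection config snippet: replace the SGD optimizer
# dict with Adam. Note that Adam takes no `momentum` argument; the
# `betas` pair plays a similar role. These values are illustrative,
# not tuned for any particular model.
optimizer = dict(type='Adam', lr=1e-3, betas=(0.9, 0.999), weight_decay=5e-4)
```

Any keyword argument accepted by the corresponding torch.optim class can be placed in the dict alongside `type`.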

@GYee

When I used Adam instead of SGD, expecting faster training, the loss diverged to over 1000, so I stopped the run. I haven't gotten a satisfactory training result with Adam, but training was at least stable with SGD.

@StringBottle +1

The loss diverged simply because the learning rate of 0.02 is too large for Adam.
Try 1e-3 or 3e-4 and you will get reasonable results. However, the results are still much lower than those with SGD (bbox mAP 31.1 and segm mAP 28.6 with lr=3e-4 on Mask R-CNN with ResNet-50). We suggest more hyper-parameter tuning if Adam is unavoidable in your experiments.
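Following that suggestion, a sketch of the config with the smaller learning rate mentioned above (other values carried over from the original SGD setting; further tuning may still be needed):

```python
# Adam with the reduced learning rate (3e-4) suggested in the comment
# above; weight_decay is kept from the original SGD config. This is a
# starting point for tuning, not a validated recipe.
optimizer = dict(type='Adam', lr=3e-4, weight_decay=5e-4)
```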
