Mmdetection: How to use Adam instead of SGD?

Created on 9 Dec 2019 · 5 comments · Source: open-mmlab/mmdetection

All the optimizers are defined like this:
optimizer = dict(type='SGD', lr=2e-3, momentum=0.9, weight_decay=5e-4)
But I want to change it to Adam. How should I do that?
Can anyone give me an example? Thanks!

All 5 comments

Hi @GYee,
You can set type='Adam' and then configure the other parameters as usual.
The optimizer is constructed in this line, and you can see that any subclass of torch.optim.Optimizer, along with its parameters, can be used.

BTW, we have not tried Adam before, so it might cause some bugs when you try it.
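For example, a minimal sketch of the change in a config file (the values below are illustrative defaults, not tuned settings):

```python
# Hypothetical mmdetection config snippet: replace the SGD optimizer
# dict with Adam. Note that Adam takes no `momentum` argument; the
# `betas` pair plays a similar role. These values are illustrative,
# not tuned for any particular model.
optimizer = dict(type='Adam', lr=1e-3, betas=(0.9, 0.999), weight_decay=5e-4)
```

Any keyword argument accepted by the corresponding torch.optim class can be placed in the dict alongside `type`.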

@GYee

When I used Adam instead of SGD, expecting faster training, the loss diverged to over 1000, so I stopped the run. I haven't gotten a satisfactory training result with Adam, but training was at least stable with SGD.

@StringBottle +1

The loss diverged simply because the learning rate of 0.02 is too large for Adam.
Try 1e-3 or 3e-4 and you will get reasonable results. However, the results are still much lower than those with SGD (bbox mAP 31.1 and segm mAP 28.6 with lr=3e-4 on Mask R-CNN with ResNet-50). We suggest more hyper-parameter tuning if Adam is unavoidable in your experiments.
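Following that suggestion, a sketch of the config with the smaller learning rate mentioned above (other values carried over from the original SGD setting; further tuning may still be needed):

```python
# Adam with the reduced learning rate (3e-4) suggested in the comment
# above; weight_decay is kept from the original SGD config. This is a
# starting point for tuning, not a validated recipe.
optimizer = dict(type='Adam', lr=3e-4, weight_decay=5e-4)
```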
