Stable-baselines: Regularization in baselines

Created on 18 Aug 2020 · 8 Comments · Source: hill-a/stable-baselines

How can I add regularization (such as L1/L2 or dropout) in baselines?

duplicate question

sophiagu

👍1

All 8 comments

Duplicate of #817 and #403.

See the docs on custom policies. For L1/L2 ("weight decay") regularization you need to modify the loss function, and that has to be done manually in the algorithm's code.

Miffyli on 18 Aug 2020

Thx Miffyli! Could you point me to the code location where I can modify to add regularization? I'm using the MlpLstmPolicy specifically.

sophiagu on 19 Aug 2020

Something like this could do the trick, which you then add to the loss. Losses are computed in the algorithm, e.g. PPO2 here. You may close the issue if there are no other bugs/issues to raise related to stable-baselines.

Miffyli on 19 Aug 2020

👍1
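The suggestion above amounts to computing a penalty over the network weights and adding it to the algorithm's loss. A minimal, framework-free sketch of that arithmetic follows; it is not taken from stable-baselines or coinrun. The function names, coefficients, and toy weights are hypothetical, and the factor 0.5 in the L2 term is chosen only to mirror the convention of `tf.nn.l2_loss`. In the TF1 code the same sum would be built from the policy's trainable variables and added to the PPO2 loss tensor.

```python
# Illustrative only: names and values below are assumptions, not library API.

def l2_penalty(weights, coef):
    """coef * 0.5 * sum of squared weights (same 1/2 factor as tf.nn.l2_loss)."""
    return coef * 0.5 * sum(w * w for layer in weights for w in layer)


def l1_penalty(weights, coef):
    """coef * sum of absolute weight values."""
    return coef * sum(abs(w) for layer in weights for w in layer)


def regularized_loss(base_loss, weights, l1_coef=0.0, l2_coef=0.0):
    """Add the L1 and L2 penalties to an already computed loss value."""
    return base_loss + l1_penalty(weights, l1_coef) + l2_penalty(weights, l2_coef)


# Toy "network": two layers of weights, flattened to plain lists.
weights = [[0.5, -1.0], [2.0]]
base_loss = 1.0  # stands in for the policy/value/entropy loss of the algorithm

total = regularized_loss(base_loss, weights, l1_coef=0.01, l2_coef=0.1)
print(total)
```

Biases are often left out of the penalty; whether to do so here is a design choice the thread does not settle.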

A good reference for this is the coinrun repository (https://github.com/openai/coinrun). Maybe it would be easy to introduce in the master code, with few side effects.

https://github.com/openai/coinrun/blob/523704f3a203dcaad84caf96ea92799452dc902f/coinrun/ppo2.py#L105

huvar on 19 Aug 2020

👍2

One thing I forgot to mention: This is much easier in the PyTorch version of stable-baselines, where you can add L2 regularization via the optimizer's weight_decay parameter. Note to self: We should probably expose this there.

Miffyli on 19 Aug 2020
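As a rough illustration of the `weight_decay` remark, here is a plain PyTorch sketch. It is not stable-baselines3 code: the tiny linear model, the random data, and the hyperparameter values are assumptions made up for the example. With `torch.optim.Adam`, `weight_decay` adds `weight_decay * param` to each gradient, which corresponds to an L2 penalty on the parameters.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a policy network.
model = nn.Linear(4, 2)

# weight_decay adds an L2 penalty term to the gradient of every parameter.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# Dummy batch and target, only so that one optimization step can run.
x = torch.randn(8, 4)
target = torch.randn(8, 2)

before = model.weight.detach().clone()

loss = nn.functional.mse_loss(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# True if the step changed the weights.
updated = not torch.equal(before, model.weight.detach())
```

How to pass such an optimizer setting through to a stable-baselines3 model is not covered in the thread.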

Thx! One question: is stable-baselines moving to stable-baselines3 or is stable-baselines3 just a PyTorch version of this repo?

sophiagu on 19 Aug 2020

Our main focus is now on stable-baselines3, and for this library we plan to mostly include bug fixes and small adjustments. This one will continue to exist though, and we do not intend to abandon it completely, at least not until underlying libraries break (i.e. support for TF1.x ends).

Miffyli on 19 Aug 2020

Got it!

sophiagu on 19 Aug 2020
