Hi, I want to tune my hyperparameters for the PPO algorithm, but I've had trouble understanding the config documentation, so I'd like to ask here:
How does lr_schedule work in the PPO algorithm? Suppose my starting learning rate is 'lr': 1e-4 and I want to decay it to 0 over the course of training. Thank you very much, I really appreciate your help 😄
Yeah, sorry, it's not clearly documented. Here are the answers. We'll add this to the docs.
1) You are basically configuring a PiecewiseSchedule.
For example, lr_schedule: [[0, 0.01], [1000, 0.0005]] means the learning rate decays linearly from 0.01 at ts=0 to 0.0005 at ts=1000. After 1000 timesteps it stays at 0.0005. The config key "lr" is ignored once lr_schedule is set (see the sketch after this list).
2) You can do e.g. config["model"]["fcnet_hiddens"] = [16, 32, 64]. Change the activation by using config["model"]["fcnet_activation"] ("tanh", "relu", or "linear").
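Here is a minimal sketch that puts both answers together for the original question (decaying from 1e-4 to 0), assuming an older RLlib/Tune setup where training is launched via tune.run("PPO", ...); the CartPole-v0 environment and the 1,000,000-timestep decay horizon are placeholders, not part of the answer above.

```python
from ray import tune

# Sketch only -- env name and timestep numbers are placeholders.
config = {
    "env": "CartPole-v0",
    # "lr" is ignored as soon as lr_schedule is set.
    "lr": 1e-4,
    # Piecewise-linear schedule: start at 1e-4, reach 0.0 at timestep 1,000,000,
    # then stay at the last value.
    "lr_schedule": [
        [0, 1e-4],
        [1_000_000, 0.0],
    ],
    "model": {
        "fcnet_hiddens": [16, 32, 64],   # three hidden layers
        "fcnet_activation": "tanh",      # or "relu" / "linear"
    },
}

tune.run("PPO", config=config, stop={"timesteps_total": 1_000_000})
```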
Thank you so much for your help!!! It helps a lot with my project 😄