Stable-baselines: What is the default network architecture for MlpLnLstmPolicy?

Created on 5 Aug 2019 · 3 Comments · Source: hill-a/stable-baselines

I'm trying to create a custom policy network, but I can't find the default architecture for an MlpLnLstmPolicy to benchmark against. The only thing I see is that MlpLnLstm has a shared LSTM network of default size 256, but I don't know the sizes and number of layers within the Value and Policy networks.

It would also be good to know if there are any activations or dropouts between layers (if applicable).

Thank you

documentation question

ktattan

Most helpful comment

Indeed, the default architecture is not immediately apparent, but you can find the essentials in LstmPolicy's init:

Two layers of 64 units with tanh activations, followed by an LSTM layer of 256 units. The LSTM output is then split into the value and policy functions. If you use the CNN version, it uses the network from the Nature DQN paper (code here) with ReLU activations.
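To make the layer order easier to benchmark against, here is a rough plain-Python sketch that lists the layer sizes described above. It does not import or call stable-baselines; the function name `default_mlp_lstm_layout`, the layer labels, and the example sizes (4 observation features, 2 discrete actions) are made up for illustration, and the heads are assumed to be single linear layers reading the LSTM output.

```python
# Illustrative only: walks through the layer sizes described in the answer.
# Nothing here is stable-baselines API; all names are hypothetical.

def default_mlp_lstm_layout(obs_dim, n_actions, hidden=(64, 64), n_lstm=256):
    """Return a list of (name, input_size, output_size, activation) tuples."""
    layers = []
    size = obs_dim
    # Two fully connected layers of 64 units, each followed by tanh.
    for i, units in enumerate(hidden):
        layers.append(("fc%d" % i, size, units, "tanh"))
        size = units
    # Shared LSTM layer of 256 units on top of the MLP features.
    layers.append(("lstm", size, n_lstm, "lstm"))
    # Assumption: both heads are single linear layers on the LSTM output.
    layers.append(("policy_head", n_lstm, n_actions, "linear"))
    layers.append(("value_head", n_lstm, 1, "linear"))
    return layers


if __name__ == "__main__":
    # Hypothetical environment: 4 observation features, 2 discrete actions.
    for name, n_in, n_out, act in default_mlp_lstm_layout(4, 2):
        print("%-12s %4d -> %4d  (%s)" % (name, n_in, n_out, act))
```

A custom policy that matches this layout (same hidden sizes, same LSTM width, tanh activations) should be a fair baseline to compare against, though the exact defaults are worth confirming in the LstmPolicy source for the version you have installed.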