The output of the DQN model is not within the action space.
Something goes wrong when the torch model is constructed with dueling turned off: the model's output dimension equals whatever is passed in "fcnet_hiddens" instead of the size of the action space.
import ray
from ray import tune

ray.init()

config = {
    "env": "CartPole-v1",
    "num_workers": 1,
    "train_batch_size": 128,
    "learning_starts": 128,
    "model": {"fcnet_hiddens": [32]},
    "dueling": False,
    "framework": "torch",
}

tune.run("DQN", name="MWE", config=config, stop={"training_iteration": 100})
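To make the symptom concrete, here is a minimal pure-Python sketch of why a Q-value head of the wrong width produces actions outside the action space. It is not RLlib code: the helper `greedy_action` and the hand-written Q-values are hypothetical, chosen only for illustration. The two numbers it relies on are that CartPole-v1 has two discrete actions and that the config above sets a hidden size of 32.

```python
def greedy_action(q_values):
    """Return the index of the largest Q-value (greedy action selection)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


NUM_ACTIONS = 2   # CartPole-v1: push left or push right
HIDDEN_SIZE = 32  # "fcnet_hiddens": [32] in the config above

# Expected case: one Q-value per action, so the greedy index is a valid action.
good_q = [0.1, 0.7]
good_action = greedy_action(good_q)

# Reported case: the head emits HIDDEN_SIZE values instead of NUM_ACTIONS.
# Illustrative values only; the largest one is placed at index 17.
bad_q = [0.0] * HIDDEN_SIZE
bad_q[17] = 1.0
bad_action = greedy_action(bad_q)

print(good_action, good_action < NUM_ACTIONS)  # 1 True
print(bad_action, bad_action < NUM_ACTIONS)    # 17 False
```

With a 32-wide output, any index from 2 to 31 can win the argmax, which is why the chosen action can fall outside the environment's action space.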
Can you change the following in the constructor of your rllib/agents/dqn/dqn_torch_model.py?
advantage_module = nn.Sequential()
value_module = nn.Sequential()

# Dueling case: Build the shared (advantages and value) fc-network.
if self.dueling:
    for i, n in enumerate(q_hiddens):
        advantage_module.add_module("dueling_A_{}".format(i),
                                    nn.Linear(ins, n))
        value_module.add_module("dueling_V_{}".format(i),
                                nn.Linear(ins, n))
        # Add activations if necessary.
        if dueling_activation == "relu":
            advantage_module.add_module("dueling_A_act_{}".format(i),
                                        nn.ReLU())
            value_module.add_module("dueling_V_act_{}".format(i),
                                    nn.ReLU())
        elif dueling_activation == "tanh":
            advantage_module.add_module("dueling_A_act_{}".format(i),
                                        nn.Tanh())
            value_module.add_module("dueling_V_act_{}".format(i),
                                    nn.Tanh())
        # Add LayerNorm after each Dense.
        if add_layer_norm:
            advantage_module.add_module("LayerNorm_A_{}".format(i),
                                        nn.LayerNorm(n))
            value_module.add_module("LayerNorm_V_{}".format(i),
                                    nn.LayerNorm(n))
        ins = n

# Actual Advantages layer (nodes=num-actions) and
# value layer (nodes=1).
advantage_module.add_module("A", nn.Linear(ins, action_space.n))
value_module.add_module("V", nn.Linear(ins, 1))
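The effect of this change can be traced with a small pure-Python sketch that tracks only layer widths, not real torch modules. Everything in it is hypothetical: the helpers `build_advantage_head` and `output_width`, the flag `final_layer_always`, and the `Q_HIDDENS` value are made up for illustration. It also assumes that the final "A" layer is added outside the `if self.dueling:` branch, and that the reported behavior corresponds to that layer being skipped when dueling is off.

```python
def build_advantage_head(ins, q_hiddens, num_actions, dueling,
                         final_layer_always):
    """Return the head as a list of (in_features, out_features) pairs."""
    layers = []
    if dueling:
        # Hidden layers are only built in the dueling case.
        for n in q_hiddens:
            layers.append((ins, n))
            ins = n
    # Assumption: the final "A" layer is either tied to the dueling
    # branch (reported behavior) or always added (suggested change).
    if dueling or final_layer_always:
        layers.append((ins, num_actions))
    return layers


def output_width(ins, layers):
    """An empty stack passes its input through, so the width stays `ins`."""
    return layers[-1][1] if layers else ins


EMBEDDING = 32     # last entry of "fcnet_hiddens" in the reproduction
NUM_ACTIONS = 2    # CartPole-v1
Q_HIDDENS = [256]  # illustrative value, not taken from the issue

reported = build_advantage_head(EMBEDDING, Q_HIDDENS, NUM_ACTIONS,
                                dueling=False, final_layer_always=False)
changed = build_advantage_head(EMBEDDING, Q_HIDDENS, NUM_ACTIONS,
                               dueling=False, final_layer_always=True)
dueling = build_advantage_head(EMBEDDING, Q_HIDDENS, NUM_ACTIONS,
                               dueling=True, final_layer_always=True)

print(output_width(EMBEDDING, reported))  # 32
print(output_width(EMBEDDING, changed))   # 2
print(output_width(EMBEDDING, dueling))   # 2
```

Under these assumptions the non-dueling head ends in a layer of width `NUM_ACTIONS`, while the dueling head is unaffected.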
That should fix it. Will PR now ...
@MaximeBouton
Just saw this, I can give it a try tomorrow morning
This PR fixes the issue: https://github.com/ray-project/ray/pull/9386
Will be merged today into master. Thanks for filing this!
Closing it now. Please feel free to re-open should this still not work on your end.
I installed the nightly version and it works, thanks for the quick fix!
This has been merged into master.
Awesome! Glad it's working. :)