Hi,
It would be good to be able to get the total number of actions that can be performed in a gym.MultiDiscrete space, along with a mapping between an action index (from 0 to num_actions - 1) and the tuple form of that action in the MultiDiscrete space.
I needed this for my own environment and implemented it there, but my solution is not the cleanest. The best way would be to update the gym.Space class with a num_actions property returning this number and an actions_mapping property returning the mapping.
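For illustration, here is a minimal sketch of what the two proposed properties could compute. It uses only NumPy and takes the sizes vector directly (what a MultiDiscrete space exposes as nvec). The names num_actions and actions_mapping are the hypothetical ones suggested above, not part of gym, and the row-major ordering of the indices is an assumption.

```python
import numpy as np


def num_actions(nvec):
    """Total number of joint actions for a MultiDiscrete space with sizes nvec."""
    return int(np.prod(nvec))


def actions_mapping(nvec):
    """Map each flat index in [0, num_actions) to its tuple of sub-actions.

    Assumption: indices are laid out in row-major (C) order, so the last
    sub-action varies fastest.
    """
    shape = tuple(int(n) for n in nvec)
    return {
        index: tuple(int(a) for a in np.unravel_index(index, shape))
        for index in range(num_actions(nvec))
    }


nvec = np.array([4, 4])  # stand-in for env.action_space.nvec
mapping = actions_mapping(nvec)
```

With nvec = [4, 4] this gives 16 joint actions, and a flat index such as 5 maps to the tuple (1, 1). Building the full dictionary is only practical for small spaces; for large ones, calling np.unravel_index on demand avoids materialising every action.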
@matthiasplappert Is there any fix for this problem yet? And @nerzadler, have you found a workaround?
@AvisekNaug You can create your own environment wrapper class that does the mapping.
@nerzadler, in fact, I did the same after giving it some thought. My multi-discrete action space only needed 0s and 1s across 4 dimensions, so I created a spaces.Discrete(16) action space, and the env.step method in my environment wrapped a small mapping from the discrete action (a decimal number) to its binary representation. For example, when the agent returned action number 4, I mapped it to the np.array [0,1,0,0].
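A minimal sketch of that decimal-to-binary mapping, assuming 4 binary dimensions and most-significant-bit-first ordering (which matches the 4 to [0,1,0,0] example). The function name discrete_to_binary is hypothetical, not the code actually used in that environment.

```python
import numpy as np


def discrete_to_binary(action, n_bits=4):
    """Decode an action from Discrete(2**n_bits) into n_bits values of 0 or 1.

    The most significant bit comes first, so action 4 becomes [0, 1, 0, 0].
    """
    if not 0 <= action < 2 ** n_bits:
        raise ValueError(f"action {action} is outside [0, {2 ** n_bits})")
    # Shift each bit down to position 0, from the highest bit to the lowest.
    return np.array([(action >> shift) & 1 for shift in range(n_bits - 1, -1, -1)])


binary_action = discrete_to_binary(4)
```

Inside env.step, the incoming discrete action would be passed through this function before being applied to the environment.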
I struggled with this for a while but then figured it out.
Let's say I have an environment env = gym.make('customenv-v0') with a Discrete action space. Then:
>>>env.action_space.n
returns an integer:
4
In case the action space used is MultiDiscrete, I used the following:
>>>env.action_space.nvec
which returned a 1-D array with one entry per sub-action:
array([4, 4])
Furthermore, you can do
>>>np.prod(env.action_space.nvec)
which returns 16, the number of action combinations in that space.
Thanks for answering this @AbdulAlkurdi!
I am getting an error while implementing Q-learning. It always fails when I pass the new state to the Q-learning update formula. I have already checked all the dimensions, but I still get "index 252 is out of bounds for axis 0 with size 240".
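The cause of that particular error cannot be determined from the message alone. One possibility is that the Q-table's first axis is smaller than the largest encoded state index. Below is a minimal sketch, assuming a MultiDiscrete observation space and a Discrete action space, that sizes the table from np.prod of the sizes vector and encodes states with np.ravel_multi_index. The values of obs_nvec and n_actions and the helper names are hypothetical stand-ins, not taken from the environment in question.

```python
import numpy as np

obs_nvec = (4, 4)  # hypothetical stand-in for env.observation_space.nvec
n_actions = 4      # hypothetical stand-in for env.action_space.n

# One row per possible observation, so every encoded state index is in bounds.
n_states = int(np.prod(obs_nvec))
q_table = np.zeros((n_states, n_actions))


def encode_state(obs):
    """Flatten a MultiDiscrete observation into a single row index.

    np.ravel_multi_index raises ValueError if any component is out of range,
    which surfaces a bad observation before it reaches the Q-table.
    """
    return int(np.ravel_multi_index(tuple(obs), obs_nvec))


def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place."""
    s, s_next = encode_state(state), encode_state(next_state)
    td_target = reward + gamma * q_table[s_next].max()
    q_table[s, action] += alpha * (td_target - q_table[s, action])


q_update(state=(0, 0), action=1, reward=1.0, next_state=(0, 1))
```

If the table is sized with anything other than the product of all observation dimensions, a valid state can still encode to an index past the end of the table.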