Ml-agents: Positive mean rewards

Created on 8 May 2019 · 4Comments · Source: Unity-Technologies/ml-agents

I was training my models and sometimes I get positive mean rewards. Is there something wrong in my configs if I get this? What does it mean when I get positive mean rewards?

discussion help-wanted

Source

MrGitGo

Most helpful comment

The mean reward refers to the _"Cumulative Reward - The mean cumulative episode reward over all agents. Should increase during a successful training session._"
While the Std. of reward just describes the standard deviation of the reward which can also be seen as a margin of error. I would recommend to check out some more example environments to understand better how rewards work.

ScriptBono on 9 May 2019

👍3

All 4 comments

Maybe you have to clarify your problem a little bit more. In general that is absolutely how it is supposed to be. During training your mean rewards should slowly increase until they get close to the potential maximum. It would only be strange if you have defined no positive rewards but only negative rewards. Then there shouldn't be a mean positive reward.

ScriptBono on 8 May 2019

Hmm ok, I think I read the rewards the whole time wrong. Could you say me how to understand it? If I have Mean Reward: -0.489 Std. of reward: 0.268

What are these two numbers?

MrGitGo on 9 May 2019

ScriptBono on 9 May 2019

👍3

Thank you for the discussion. We are closing this issue due to inactivity. Feel free to reopen it if you’d like to continue the discussion though.