Ml-agents: What does "Std of Reward" mean?

Created on 31 Mar 2018 · 4 Comments · Source: Unity-Technologies/ml-agents

I searched the docs but couldn't find a definition of "Std of Reward" as it appears in the console output.

help-wanted

jlanis

Most helpful comment

@jlanis Thanks for bringing this up. We do have plans to support more automated parallel training, both by allowing multiple Unity processes to run at once and by providing easier tools for duplicating training areas within a scene.

awjuliani on 1 Apr 2018

👍6

All 4 comments

Hi @jlanis,

In this case Std corresponds to the standard deviation of the reward. It is a measure of the spread around the mean reward. A large value would indicate a lot of variation in rewards received, and a small value would indicate the opposite.
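
As a rough illustration of that definition (a minimal sketch, not the ml-agents source): the helper name `summarize_rewards` and the reward values below are made up, and the population standard deviation is assumed here, which may differ from what the trainer actually computes.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, std) for a list of per-episode cumulative rewards.

    Hypothetical helper for illustration only. It assumes the population
    standard deviation (divide by N), not the sample version (N - 1).
    """
    mean_reward = statistics.fmean(episode_rewards)
    std_reward = statistics.pstdev(episode_rewards)
    return mean_reward, std_reward


# Two made-up reward histories with the same mean but different spread.
consistent = [9.0, 10.0, 11.0, 10.0]
erratic = [0.0, 20.0, 2.0, 18.0]

for name, rewards in (("consistent", consistent), ("erratic", erratic)):
    mean_reward, std_reward = summarize_rewards(rewards)
    print(f"{name}: mean={mean_reward:.3f} std={std_reward:.3f}")
```

Both lists average to the same reward, but the second has a much larger Std, meaning the agent's results swing widely from episode to episode.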

awjuliani on 31 Mar 2018

👍5

@awjuliani Ah, thanks!

One other quick question, even though it's unrelated: I noticed that in the Balance Ball example, you had multiple instances of the agent running at the same time (connected to a single brain) to help speed up the learning process. Are there any plans to make this kind of setup unnecessary in the future? That is, instead of manually copying and pasting multiple instances in the Unity editor, which seems a little sloppy, would it be possible to run multiple training sessions (or instances) that all feed into a single brain automatically?

jlanis on 31 Mar 2018

👍5

awjuliani on 1 Apr 2018

👍6

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.