Ml-agents: What does "Std of Reward" mean?

Created on 31 Mar 2018  ·  4 Comments  ·  Source: Unity-Technologies/ml-agents

I searched the docs but couldn't find a definition of "Std of Reward", which appears in the console output.

help-wanted

All 4 comments

Hi @jlanis,

In this case Std corresponds to the standard deviation of the reward. It is a measure of the spread around the mean reward. A large value would indicate a lot of variation in rewards received, and a small value would indicate the opposite.
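
To make the "spread around the mean" idea concrete, here is a minimal sketch in plain Python. It assumes the statistic is computed over a batch of per-episode cumulative rewards and uses the population standard deviation; the thread does not say exactly how the trainer computes it. The function name `summarize_rewards` and the sample reward lists are made up for illustration.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, std) for a list of per-episode cumulative rewards."""
    mean_reward = statistics.fmean(episode_rewards)
    # Population standard deviation: sqrt(mean((r - mean_reward) ** 2)).
    # Assumption: the console value is a population std, not a sample std.
    std_reward = statistics.pstdev(episode_rewards)
    return mean_reward, std_reward


# Hypothetical data: both batches have the same mean reward (10.0),
# but very different spreads.
consistent = [9.0, 10.0, 11.0, 10.0]
erratic = [0.0, 20.0, 2.0, 18.0]

for name, rewards in (("consistent", consistent), ("erratic", erratic)):
    mean_reward, std_reward = summarize_rewards(rewards)
    print(f"{name}: Mean Reward: {mean_reward:.3f}. Std of Reward: {std_reward:.3f}.")
```

Both batches report the same mean, but the second one has a much larger Std of Reward, which is the "lot of variation in rewards received" case described above.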

@awjuliani Ah, thanks!

One other quick question, even though it's unrelated: I noticed that in the Balance Ball example, you had multiple instances of the agents running at the same time (connected to a single brain) to help speed up the learning process. Are there any plans to make this kind of setup unnecessary in the future? Manually copying and pasting multiple instances in the Unity editor seems a little sloppy. Would it be possible to run multiple training sessions (or instances) that all feed into a single brain automatically?

@jlanis Thanks for bringing this up. We do have plans to support more automated parallel training, both by allowing multiple Unity processes to run at once and by providing easier tools for duplicating training areas within a scene.

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
