Hi ML'ers,
I would like to know the most effective way to use AddReward and SetReward. I have tried several combinations. I have a task where the agent's actions aim to get it closer to a target. So, should I AddReward a small value for each action and then a final SetReward before calling Done()? Or, is it better to not use AddReward? Are there some basic guidelines?
Thank you all - your help here has made this a lot of fun.
I think it depends on your problem. One approach is to start by giving rewards as the agent gets closer to the goal to guide learning, though this can cause problems such as the agent moving back and forth to collect rewards. In the PushBlock and Wall Jump environments, the reward is only given when the agent steps on the goal. You could also imagine a scenario in which the target is initially very large and shrinks as the agent gets better (curriculum learning). A sketch of distance-based shaping follows below.
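For example, here is a minimal sketch of distance-based shaping combined with a sparse goal reward. The class name, the target field, and the goal radius are hypothetical, and the Agent method names follow the older ml-agents API referenced in this thread (AgentAction / Done); the exact names and namespace vary by version.

using UnityEngine;
using MLAgents;   // namespace varies by ml-agents version

public class ReachTargetAgent : Agent
{
    public Transform target;      // hypothetical: target assigned in the Inspector
    float previousDistance;

    public override void AgentReset()
    {
        previousDistance = Vector3.Distance(transform.position, target.position);
    }

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // ... apply the action to move the agent ...

        float distance = Vector3.Distance(transform.position, target.position);

        // Dense shaping: reward progress toward the target (negative if it moves away).
        // Rewarding the change in distance, rather than raw closeness, reduces the
        // back-and-forth reward farming mentioned above.
        AddReward(previousDistance - distance);
        previousDistance = distance;

        // Small per-step penalty to encourage reaching the goal quickly.
        AddReward(-0.001f);

        if (distance < 0.5f)      // hypothetical goal radius
        {
            SetReward(1.0f);      // sparse terminal reward at the goal
            Done();
        }
    }
}

Because the shaping term is the change in distance, an agent that oscillates collects roughly zero net shaping reward, which is why the delta form is less prone to the farming problem than rewarding closeness directly.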
I thought @billatarcat was asking how AddReward and SetReward are different.
SetReward establishes the reward amount for the current step, which gets added to the accumulated reward amount for the episode. If you call SetReward more than once during the same step, only the last call will have an effect.
AddReward can adjust the current step reward amount up or down, and just exists for convenience. Instead, you could compute the step reward in a local variable and apply it with a single call to SetReward at each step.
There are better ways to write this, but to illustrate how they work:
SetReward(0.3f);         // normal step reward of 0.3
if (gotBonus)
    AddReward(0.1f);     // adjust step reward up to 0.4
else if (gotPenalty)
    AddReward(-0.1f);    // adjust step reward down to 0.2
if (reachedGoal)
    SetReward(1.0f);     // override step reward with 1.0
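The same step reward can also be computed in a local variable and applied with a single SetReward call, as suggested above (using the same hypothetical flags as the snippet):

float stepReward = 0.3f;      // normal step reward
if (gotBonus)
    stepReward += 0.1f;       // bonus raises it to 0.4
else if (gotPenalty)
    stepReward -= 0.1f;       // penalty lowers it to 0.2
if (reachedGoal)
    stepReward = 1.0f;        // reaching the goal overrides everything
SetReward(stepReward);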
Thanks Vincent and Ellery. Great info. As an ML newb, I am looking for some basic approaches to tackling projects (the ml-agents examples are a great spot to learn). I am having some success, but there are a ton of nuances in setting up the code and choosing which approach to take. I feel like this system can tackle almost any complex project you throw at it, as long as you break the task into learnable chunks for curriculum learning. Thanks again.
Thanks for reaching out to us. Hopefully you were able to resolve your issue. We are closing this due to inactivity, but if you need additional assistance, feel free to reopen the issue.
We've updated our doc to explain this issue in more detail. https://github.com/Unity-Technologies/ml-agents/pull/1996
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.