Hi ML'ers,
I would like to know the most effective way to use AddReward and SetReward. I have tried several combinations. I have a task where the agent's actions aim to get it closer to a target. So, should I AddReward a small value for each action and then a final SetReward before calling Done()? Or, is it better to not use AddReward? Are there some basic guidelines?
Thank you all - your help here has made this a lot of fun.
I think it depends on your problem. One approach is to start by giving rewards as the agent gets closer to the goal to guide learning, though this can cause problems such as the agent moving back and forth to collect rewards. In the PushBlock and Wall Jump environments, the reward is only given when the agent steps on the goal. You could also imagine a scenario in which the target is initially very large and shrinks as the agent gets better (curriculum learning). A sketch of distance-based shaping follows below.
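For example, here is a minimal sketch of distance-based shaping combined with a sparse goal reward. The class name, the target field, and the goal radius are hypothetical, and the Agent method names follow the older ml-agents API referenced in this thread (AgentAction / Done); the exact names and namespace vary by version.

using UnityEngine;
using MLAgents;   // namespace varies by ml-agents version

public class ReachTargetAgent : Agent
{
    public Transform target;      // hypothetical: target assigned in the Inspector
    float previousDistance;

    public override void AgentReset()
    {
        previousDistance = Vector3.Distance(transform.position, target.position);
    }

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // ... apply the action to move the agent ...

        float distance = Vector3.Distance(transform.position, target.position);

        // Dense shaping: reward progress toward the target (negative if it moves away).
        // Rewarding the change in distance, rather than raw closeness, reduces the
        // back-and-forth reward farming mentioned above.
        AddReward(previousDistance - distance);
        previousDistance = distance;

        // Small per-step penalty to encourage reaching the goal quickly.
        AddReward(-0.001f);

        if (distance < 0.5f)      // hypothetical goal radius
        {
            SetReward(1.0f);      // sparse terminal reward at the goal
            Done();
        }
    }
}

Because the shaping term is the change in distance, an agent that oscillates collects roughly zero net shaping reward, which is why the delta form is less prone to the farming problem than rewarding closeness directly.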
I thought @billatarcat was asking how AddReward and SetReward are different.
SetReward establishes the reward amount for the current step, which gets added to the accumulated reward amount for the episode. If you call SetReward more than once during the same step, only the last call will have an effect.
AddReward can adjust the current step reward amount up or down, and just exists for convenience. Instead, you could compute the step reward in a local variable and apply it with a single call to SetReward at each step.
There are better ways to write this, but to illustrate how they work:
SetReward(0.3f);         // normal step reward of 0.3
if (gotBonus)
    AddReward(0.1f);     // adjust step reward up to 0.4
else if (gotPenalty)
    AddReward(-0.1f);    // adjust step reward down to 0.2
if (reachedGoal)
    SetReward(1.0f);     // override step reward with 1.0
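The same step reward can also be computed in a local variable and applied with a single SetReward call, as suggested above (using the same hypothetical flags as the snippet):

float stepReward = 0.3f;      // normal step reward
if (gotBonus)
    stepReward += 0.1f;       // bonus raises it to 0.4
else if (gotPenalty)
    stepReward -= 0.1f;       // penalty lowers it to 0.2
if (reachedGoal)
    stepReward = 1.0f;        // reaching the goal overrides everything
SetReward(stepReward);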
Thanks Vincent and Ellery. Great info. As an ML newb, I am looking for some basic approaches to tackling projects (the ml-agents examples are a great spot to learn). I am having some success, but there are a ton of nuances in setting up the code and choosing which approach to take. I feel like this system can tackle almost any complex project you throw at it, as long as you break the task into learnable chunks for curriculum learning. Thanks again.
Thanks for reaching out to us. Hopefully you were able to resolve your issue. We are closing this due to inactivity, but if you need additional assistance, feel free to reopen the issue.
We've updated our doc to explain this issue in more detail. https://github.com/Unity-Technologies/ml-agents/pull/1996
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.