Ml-agents: Best RL rewarding strategy for racing game?

Created on 11 Mar 2019 · 6 Comments · Source: Unity-Technologies/ml-agents

Hello, I am working on a simple racing game project.
So far I have managed to get some results with a reward system that takes position, speed, and collisions into account. If the agent does not hit any obstacles and moves forward during a step, I reward it based on its speed. I punish the agent when it collides with walls or obstacles, or when it is off the race track.
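For reference, the scheme described above could be sketched roughly like this. It is an engine-agnostic Python illustration, not ML-Agents code; the function name, the normalisation by `max_speed`, and the `-1.0` penalty are all assumptions made for the example.

```python
def speed_based_reward(forward_speed: float,
                       max_speed: float,
                       collided: bool,
                       off_track: bool,
                       penalty: float = -1.0) -> float:
    """Per-step reward: penalise collisions/off-track, otherwise reward forward speed.

    All names and constants here are hypothetical, chosen only for illustration.
    """
    if collided or off_track:
        return penalty
    if forward_speed <= 0.0:
        # Standing still or moving backwards earns nothing.
        return 0.0
    # Normalise to [0, 1] so the reward scale does not depend on the car's units.
    return min(forward_speed, max_speed) / max_speed
```

Nothing in this signal depends on how long the lap takes, which is the weakness raised in the next paragraph.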

But this method is far from ideal and will not encourage the agent to minimise lap times (the agent will just focus on keeping its speed high and on not stopping or hitting obstacles).

I would love to assess the agent based on lap times, but that would mean giving a reward only after a lap is finished, which will probably not provide enough feedback at the start of training or during the lap.

Do you have any suggestions on how to implement reinforcement learning rewards for a racing-type game?

oseq

All 6 comments

You could implement checkpoints that reward the agent relative to the time taken.

MarcoMeter on 12 Mar 2019

👍3
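One possible reading of a checkpoint reward "relative to the time taken" is a bonus that shrinks linearly with the number of steps since the previous checkpoint. The sketch below is plain Python rather than ML-Agents code, and the function name, the `max_steps` budget, and the `min_reward` floor are assumptions for illustration only.

```python
def checkpoint_reward(steps_taken: int,
                      max_steps: int,
                      base_reward: float = 1.0,
                      min_reward: float = 0.1) -> float:
    """Reward for reaching a checkpoint, larger when it was reached in fewer steps.

    steps_taken: steps since the previous checkpoint (hypothetical counter).
    max_steps:   step budget after which only the floor reward is paid (assumed).
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    # Linear decay from base_reward (instant arrival) down to 0 at max_steps.
    decayed = base_reward * (1.0 - steps_taken / max_steps)
    # Keep a small floor so reaching a checkpoint slowly is still better than not at all.
    return max(min_reward, decayed)
```

With these assumed defaults, reaching a checkpoint in 50 of 200 budgeted steps pays 0.75, while using the full budget pays only the 0.1 floor.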

As @MarcoMeter suggests, you can implement a time-based reward. A popular technique we use in our example environments is to assign a small negative reward at each timestep, so the agent is incentivized to finish as quickly as possible.

ervteng on 13 Mar 2019
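To see why a small per-step penalty favours shorter laps, it can help to compare the total return of a fast and a slow episode. The following is a minimal Python sketch, not taken from the ML-Agents examples; the penalty of `-0.001` and the finish reward of `1.0` are assumed values.

```python
def episode_return(steps_to_finish: int,
                   step_penalty: float = -0.001,
                   finish_reward: float = 1.0) -> float:
    """Total undiscounted return for an episode that ends when the lap is finished.

    Both constants are illustrative assumptions, not values from any example environment.
    """
    # Every step costs a little, so a longer lap accumulates a larger total penalty.
    return finish_reward + step_penalty * steps_to_finish


fast_lap = episode_return(500)  # roughly 0.5 with the assumed constants
slow_lap = episode_return(800)  # roughly 0.2 with the assumed constants
```

The penalty has to stay small relative to the positive rewards; otherwise ending the episode early (for example by crashing) can become the cheapest option for the agent.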

@MarcoMeter checkpoints are a good idea. I can reward the agent when it reaches the next checkpoint (+1) and assign a small negative reward in the meantime, on the way to the checkpoint (e.g. -0.001 each step).

oseq on 13 Mar 2019
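The combined scheme (+1 per checkpoint, -0.001 per step) could be tracked with something like the class below. This is a hedged Python sketch, independent of the ML-Agents API; the class name and the rule that only the expected next checkpoint pays out are assumptions added here, the latter so that the agent cannot collect rewards by driving back and forth over one checkpoint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckpointRewarder:
    """Hypothetical helper: step penalty plus a bonus for checkpoints hit in order."""
    num_checkpoints: int
    checkpoint_bonus: float = 1.0   # +1 per checkpoint, as proposed in the thread
    step_penalty: float = -0.001    # small negative reward each step
    next_checkpoint: int = 0        # index of the checkpoint expected next

    def step(self, checkpoint_hit: Optional[int] = None) -> float:
        """Return the reward for one step; checkpoint_hit is the index touched, if any."""
        reward = self.step_penalty
        if checkpoint_hit is not None and checkpoint_hit == self.next_checkpoint:
            reward += self.checkpoint_bonus
            # Wrap around so the same object can be used across multiple laps.
            self.next_checkpoint = (self.next_checkpoint + 1) % self.num_checkpoints
        return reward
```

In use, the environment would call `step()` once per simulation step and pass the checkpoint index only on the step where the car crosses it; repeated or out-of-order hits then earn just the step penalty.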

Thank you for the discussion. We are closing this issue due to inactivity. Feel free to reopen it if you’d like to continue the discussion though.