Hi, I recently used ml-agents to build a Flappy Bird agent and trained it at 20x speed. The agent successfully converged on a policy and achieved very high scores (1000+). But when I ran the agent at real-time speed for inference, it only reached a score of ~5 and performed very poorly. When I put the inference speed back up to 20x, the agent performed well again.
Is there something I can do to ensure the agent learns a policy that also holds at real-time speed, rather than one that depends on the sped-up environment it saw during training (while still training at a sped-up rate)?
Any help would be appreciated.
Thanks
ml-agents = 0.8.1
unity = 2019.1.0f2
Hi, this issue usually happens when the physics/logic of your game is timeScale-dependent. If your game behaves differently depending on the timeScale, then the agent will perform differently at training time and at inference time. Usually this is caused by a missing Time.deltaTime or Time.fixedDeltaTime.
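To make the missing-`Time.deltaTime` failure mode concrete, here is a minimal sketch (the `BirdMovement` component and its values are hypothetical, not from the original project). `Time.deltaTime` is already scaled by `Time.timeScale`, so movement expressed per unit of game time behaves identically at 1x and 20x:

```csharp
using UnityEngine;

public class BirdMovement : MonoBehaviour
{
    public float forwardSpeed = 3f;

    void Update()
    {
        // BAD: moves a fixed amount per rendered frame. The number of
        // rendered frames per unit of *game* time changes with timeScale,
        // so the agent sees a different world at 1x and 20x.
        // transform.Translate(Vector3.right * forwardSpeed * 0.02f);

        // GOOD: scale by Time.deltaTime. deltaTime includes timeScale,
        // so distance covered per unit of game time is the same at any speed.
        transform.Translate(Vector3.right * forwardSpeed * Time.deltaTime);
    }
}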
That makes sense thanks! I am currently just using the standard 2d physics provided by unity. Is there anything special I need to do?
I am not familiar with the standard 2D physics and how much code you need to write yourself to make it work. You need to be careful when calling Physics methods and make sure you scale values by delta time appropriately.
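As a sketch of what this means in practice (the `FlapController` component and its values are assumptions, not from the original project): read input in `Update`, but let the physics engine integrate forces in `FixedUpdate`, where everything is stepped with `Time.fixedDeltaTime` and is therefore timeScale-safe:

```csharp
using UnityEngine;

[RequireComponent(typeof(Rigidbody2D))]
public class FlapController : MonoBehaviour
{
    public float flapImpulse = 5f;
    private Rigidbody2D rb;
    private bool flapRequested;

    void Awake() { rb = GetComponent<Rigidbody2D>(); }

    void Update()
    {
        // Read input every rendered frame...
        if (Input.GetKeyDown(KeyCode.Space)) flapRequested = true;
    }

    void FixedUpdate()
    {
        // ...but apply physics here. Impulses and the built-in gravity are
        // integrated with Time.fixedDeltaTime, so they behave the same at
        // timeScale 1 and timeScale 20.
        if (flapRequested)
        {
            rb.velocity = Vector2.zero;
            rb.AddForce(Vector2.up * flapImpulse, ForceMode2D.Impulse);
            flapRequested = false;
        }
    }
}
```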
Another option would be to train with a time scale of 1 and use multiple environments at the same time.
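For reference, ml-agents 0.8 added support for running several environment instances in parallel from the command line via `--num-envs`. A sketch of the invocation, assuming a hypothetical config path and run id:

```shell
mlagents-learn config/trainer_config.yaml --run-id=flappy --num-envs=8 --train
```

This keeps each environment at timeScale 1 (so training matches inference) while recovering throughput by collecting experience from multiple copies of the game at once.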
Yes. For example, a bullet might not collide with the player if the game is running very fast: the collision is not properly detected at high speed.
So the game could achieve a very high score simply because collisions go undetected.
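One mitigation worth trying for this tunneling effect (an assumption on my part, the thread does not confirm it fixes this particular game) is to switch fast-moving `Rigidbody2D` objects to continuous collision detection, so contacts are swept along the movement path rather than sampled once per physics step:

```csharp
using UnityEngine;

[RequireComponent(typeof(Rigidbody2D))]
public class BulletSetup : MonoBehaviour
{
    void Awake()
    {
        var rb = GetComponent<Rigidbody2D>();
        // Discrete (the default) only checks for overlap at each physics
        // step, so a fast object can pass through a thin collider between
        // steps. Continuous sweeps the path and catches those contacts.
        rb.collisionDetectionMode = CollisionDetectionMode2D.Continuous;
    }
}
```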