Hi - My agent has to keep track of a ball on a playing field. I'm normalizing the ball's position to x/z coordinates between -1 and +1. Depending on the size of the field, small movements might cause only tiny changes in the observed values, in increments on the order of 0.0001, which seems to make learning precise behaviour quite difficult. What's a good practice for dealing with this issue? With regard to coordinates, I guess one could always partition an area into subgrids, but I'm wondering if there's a more general approach to handling very small observation changes, some strategy for making agents more sensitive to them. Thanks!
If you really want to, you can normalize the position into a direction, so (-100, -100) would become something like (-0.7, -0.7), and then send the distance (the original magnitude) as a separate input, scaled to something like 0-1.
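A minimal Python sketch of this direction-plus-distance split, for illustration only. The function name and the `max_distance` parameter (for example the field diagonal) are assumptions, not part of ML-Agents; in a Unity project the same arithmetic would live in the agent's C# observation code.

```python
import math

def direction_and_distance(x, z, max_distance):
    """Split a 2D position into a unit direction and a distance scaled to [0, 1].

    max_distance is an assumed upper bound on the magnitude (hypothetical
    parameter, e.g. the diagonal of the playing field).
    """
    magnitude = math.hypot(x, z)
    if magnitude == 0.0:
        # No meaningful direction at the origin; report zeros.
        return 0.0, 0.0, 0.0
    # Clamp so the distance input never exceeds 1 if the bound is too small.
    scaled_distance = min(magnitude / max_distance, 1.0)
    return x / magnitude, z / magnitude, scaled_distance
```

With `max_distance=200`, the position (-100, -100) gives a direction of roughly (-0.707, -0.707) and a scaled distance of roughly 0.707.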
A second way I can think of: let's assume the position of the ball is (423, 0), so the magnitude is 423. Again we send the normalized position, and then we pass each digit of the magnitude separately. 423 is 400 + 20 + 3, so we send the 400, the 20 and the 3 as separate inputs. This way the agent can pick up any tiny change in the position.
Is that really necessary? I don't know, but it's the best way I can think of. Let's say one input stores the first digit: 3 / 10 (or / 9), the second input stores 2 / 10, and the third stores 4 / 10. So instead of sending the agent (423, 0) / 1000 or something like that, you send it 1, 0, 0.3, 0.2, 0.4. Any tiny change in your ball position is then sent directly to the agent. I think that might be the best way.
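The digit splitting described above could be sketched in Python roughly as follows. The function name, `num_digits` and `scale` are hypothetical; `scale` is an added assumption so that fractional coordinates (such as values between -1 and +1) can be turned into whole numbers first. The sign is dropped here, on the assumption that the normalized direction already carries it.

```python
def digit_observations(value, num_digits=3, scale=1):
    """Return the decimal digits of |value| * scale, least significant first,
    each divided by 10 so every input lies in [0, 0.9].

    Digits above num_digits are discarded, so choose num_digits large enough.
    """
    whole = int(round(abs(value) * scale))
    digits = []
    for _ in range(num_digits):
        digits.append((whole % 10) / 10.0)
        whole //= 10
    return digits

# Ball at (423, 0): unit direction (1, 0) followed by the digits 3, 2, 4.
observation = [1.0, 0.0] + digit_observations(423)
print(observation)  # [1.0, 0.0, 0.3, 0.2, 0.4]
```

For a normalized coordinate such as 0.4237, calling `digit_observations(0.4237, num_digits=4, scale=10000)` would split it into four inputs, one per decimal place.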
Hi @mbaske,
As mentioned in the issue template, we don't have the resources to help debug issues with custom environments.
In general, I would say that if your model is that sensitive to the position, it's going to have trouble learning.
Some general suggestions would be to use the position of the ball relative to the agent (for example agent.position - ball.position) instead of the ball's absolute position, or consider using the velocity of the ball as observations. Hopefully these will make the model less sensitive to the small changes.
Hope that helps...
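As a rough illustration of the relative-position and velocity suggestion, here is a small Python sketch. The function name and arguments are hypothetical, positions are plain (x, z) tuples, and the velocity is estimated by a finite difference over an assumed time step `dt`; in Unity one would more likely read the velocity from the ball's physics body.

```python
def relative_observations(agent_pos, ball_pos, prev_ball_pos, dt):
    """Build observations from the agent-to-ball offset and the ball velocity.

    Positions are (x, z) tuples. The velocity is a finite-difference estimate
    from the previous ball position (an assumption made for this sketch).
    """
    # Offset in the order given in the thread: agent.position - ball.position.
    rel_x = agent_pos[0] - ball_pos[0]
    rel_z = agent_pos[1] - ball_pos[1]
    # Displacement since the last step divided by the step duration.
    vel_x = (ball_pos[0] - prev_ball_pos[0]) / dt
    vel_z = (ball_pos[1] - prev_ball_pos[1]) / dt
    return [rel_x, rel_z, vel_x, vel_z]
```

These values would still need to be scaled into a sensible range before being passed to the agent.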
Thanks for your feedback @Avoca-do and @chriselion.
I did actually get better training results after splitting and normalizing the decimal places as suggested above. Great idea!
Glad you got it working. Closing this issue...