Ml-agents: Agent property name "Decision Frequency" is confusing

Created on 19 May 2018 · 4 Comments · Source: Unity-Technologies/ml-agents

Going by its name, "Decision Frequency" seems to mean "step count per decision".

However, the brain's "stacked vector" property can delay the "step count per decision" by stacking observations until the next decision is made.

If "decision frequency" is 1 and "stacked vector" is 2, then a decision will be made every 2 steps.

"Decision Frequency" is really associated with the "step count per CollectObservations() call".

So "Decision Frequency" should be renamed to "Observation Frequency".

document

"brain property"
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Design-Brains.md#brain-properties

"agent property"
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Design-Agents.md#agent-properties

discussion

green4you

Most helpful comment

The way I understand decision frequency is that for a frequency of, let's say, 5, AgentAction is invoked 5 consecutive times with identical vectorAction values. You could implement the same behavior by enabling On Demand Decisions and calling RequestDecision every 5th time your agent's FixedUpdate method executes. You'd have to store the vectorAction values and apply them in FixedUpdate.
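
To make the repeated-action behavior concrete, here is a minimal Python simulation of it. It is not Unity or ML-Agents code: `run_with_decision_frequency` and `decide` are hypothetical names, `decide` stands in for RequestDecision, and the loop body stands in for FixedUpdate. It also assumes that a decision is requested on the very first step, which the thread does not confirm.

```python
from typing import Callable, List


def run_with_decision_frequency(
    num_steps: int,
    decision_frequency: int,
    decide: Callable[[int], float],
) -> List[float]:
    """Return the action applied at each simulated fixed update."""
    applied = []
    stored_action = 0.0  # stand-in for the stored vectorAction values
    for step in range(num_steps):
        # Assumption: a new decision is requested on step 0 and then
        # on every decision_frequency-th step after that.
        if step % decision_frequency == 0:
            stored_action = decide(step)  # stand-in for RequestDecision
        # The stored action is re-applied on every step, including the
        # steps on which no new decision was requested.
        applied.append(stored_action)
    return applied


# Toy policy: the "decision" is simply the step index it was made on.
actions = run_with_decision_frequency(10, 5, decide=lambda step: float(step))
print(actions)
```

With a frequency of 5, the same value is applied for 5 consecutive steps before a new decision replaces it.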

I think it's an interesting possibility to make the decision frequency itself part of the agent's action space this way.

- Add a vectorAction value n, round and clamp it to a range of e.g. 1 - 9
- Count down from n in FixedUpdate to call RequestDecision every nth time
- Divide rewards by n
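
The three steps above can be sketched as a small Python simulation. Again, this is not ML-Agents code: `interval_from_action`, `run_adaptive`, `policy` and `step_reward` are hypothetical names, and dividing each per-step reward by n is only one possible reading of "divide rewards by n".

```python
from typing import Callable, List, Tuple


def interval_from_action(raw: float, low: int = 1, high: int = 9) -> int:
    """Round a continuous action value and clamp it to [low, high]."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return max(low, min(high, round(raw)))


def run_adaptive(
    num_steps: int,
    policy: Callable[[int], float],
    step_reward: Callable[[int], float],
) -> Tuple[List[Tuple[int, int]], float]:
    """Simulate an agent that chooses its own decision interval n."""
    countdown = 0
    n = 1
    decisions = []  # (step, chosen n) for every simulated decision request
    total_reward = 0.0
    for step in range(num_steps):
        if countdown == 0:
            # Stand-in for RequestDecision: the policy output sets n.
            n = interval_from_action(policy(step))
            countdown = n
            decisions.append((step, n))
        countdown -= 1
        # Assumption: each per-step reward is scaled by 1/n.
        total_reward += step_reward(step) / n
    return decisions, total_reward


decisions, total = run_adaptive(
    10, policy=lambda step: 3.4, step_reward=lambda step: 1.0
)
print(decisions, total)
```

In this toy run the policy always outputs 3.4, so n is 3 and a decision is requested on every third step.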

I was playing around with this and sometimes saw n converging towards a value associated with the highest reward. Might be helpful in situations where you'd want an adaptive decision frequency.

mbaske on 22 May 2018

👍3

All 4 comments

Decision frequency means: "The number of steps between decision requests." Here, a step means a fixed update. This means that if the Decision Frequency is XXX, there will be XXX-1 fixed updates without a new decision between each fixed update with a new decision.
I think you are getting confused with Stacked Vectors because we did not specify that observations can be reused in decisions.
Imagine there is a decision at every step: at the first step, the observation is __a__, at the second it is __b__, at the third __c__, etc.
If stacked vectors is set to 3, then the observation set at the first step will be [0,0,__a__], then [0,__a__,__b__], then [__a__,__b__,__c__], then [__b__,__c__,__d__], etc.
The stacked vectors __DO NOT__ delay the decision making.
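
The sliding window described above can be reproduced with a short Python sketch. This is an illustration only, not the ML-Agents implementation: `simulate` is a hypothetical helper, steps are numbered from 0, and a decision is assumed to happen on step 0.

```python
from collections import deque
from typing import List, Sequence, Tuple


def simulate(
    observations: Sequence[str],
    stacked_vectors: int,
    decision_frequency: int = 1,
) -> List[Tuple[int, list]]:
    """Return (step, stacked observation) for every step with a decision."""
    # The window starts zero-padded and keeps only the newest entries.
    window = deque([0] * stacked_vectors, maxlen=stacked_vectors)
    decisions = []
    for step, obs in enumerate(observations):
        if step % decision_frequency != 0:
            continue  # no decision on this step, so nothing is collected
        window.append(obs)  # the oldest entry drops out automatically
        decisions.append((step, list(window)))
    return decisions


every_step = simulate(["a", "b", "c", "d"], stacked_vectors=3)
every_second_step = simulate(
    ["a", "b", "c", "d"], stacked_vectors=3, decision_frequency=2
)
print(every_step)
print(every_second_step)
```

In this sketch, changing `stacked_vectors` only changes how many past observations each decision sees. The steps on which decisions happen depend on `decision_frequency` alone.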
It is true that observations are only collected before a decision must be made. For this reason, it could be called "Observation Frequency" but we preferred decision frequency because it is more descriptive.
I hope this helps.

vincentpierre on 20 May 2018

👍2

Thank you for the discussion. We are closing this issue due to inactivity. Feel free to reopen it if you'd like to continue the discussion though.