"Decision Frequency" seems, by its name, to mean "step count per decision".
But the brain's "stacked vector" property can delay the "step count per decision" by stacking observations for the next decision.
If "decision frequency" is 1 and "stacked vector" is 2, then a decision will be made every 2 steps.
"Decision Frequency" is really associated with the "step count per CollectObservations() call",
so "Decision Frequency" should be renamed to "Observation Frequency".
"brain property"
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Design-Brains.md#brain-properties
"agent property"
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Design-Agents.md#agent-properties
Decision frequency means: "The number of steps between decision requests." Here, a step means a fixed update. This means that if the Decision Frequency is XXX, there will be XXX-1 fixed updates without a new decision between each fixed update with a new decision.
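To make the counting concrete, here is a minimal Python sketch (a plain simulation, not ML-Agents code). The function name `decision_steps` is hypothetical, and it assumes the first decision falls on step 0.

```python
from typing import List


def decision_steps(total_steps: int, decision_frequency: int) -> List[int]:
    """Return the fixed-update indices at which a new decision is requested."""
    if decision_frequency < 1:
        raise ValueError("decision_frequency must be at least 1")
    # Assumption: the first decision is requested on step 0.
    return [step for step in range(total_steps) if step % decision_frequency == 0]


steps = decision_steps(total_steps=12, decision_frequency=4)
# Number of fixed updates without a new decision between consecutive decisions.
gaps = [later - earlier - 1 for earlier, later in zip(steps, steps[1:])]
print(steps)  # [0, 4, 8]
print(gaps)   # [3, 3]
```

With a decision frequency of 4, each gap is 3, which matches the "XXX-1 fixed updates without a new decision" description.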
I think you are getting confused with Stacked Vectors because we did not specify that observations can be reused in decisions.
Imagine there is a decision at every step: at the first step the observation is __a__, at the second it is __b__, at the third __c__, and so on.
If stacked vectors is set to 3, then the observation set at the first step will be [0,0,__a__], then [0,__a__,__b__], then [__a__,__b__,__c__], then [__b__,__c__,__d__], and so on.
The stacked vectors __DO NOT__ delay the decision making.
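The sliding window described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how ML-Agents implements it; `stacked_observations` is a hypothetical helper, and zero padding for the missing early observations is assumed from the example above.

```python
from collections import deque
from typing import List, Sequence


def stacked_observations(observations: Sequence[str], stack_size: int) -> List[list]:
    """Return the stacked observation passed to the decision at each step."""
    # Start with zeros so the first steps are padded, as in [0, 0, a].
    window = deque([0] * stack_size, maxlen=stack_size)
    stacked = []
    for obs in observations:
        window.append(obs)            # the oldest entry drops out automatically
        stacked.append(list(window))  # one stacked observation per step
    return stacked


result = stacked_observations(["a", "b", "c", "d"], stack_size=3)
for step, stack in enumerate(result, start=1):
    print(step, stack)
```

There is one stacked observation per step, so four observations still produce four decisions. Stacking changes what each decision sees, not how often it happens.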
It is true that observations are only collected before a decision must be made. For this reason, it could be called "Observation Frequency" but we preferred decision frequency because it is more descriptive.
I hope this helps.
The way I understand decision frequency is that for a frequency of, let's say 5, AgentAction is invoked 5 consecutive times with identical vectorAction values. You could implement the same behavior by enabling on demand decisions and call RequestDecision every 5th time your agent's FixedUpdate method executes. You'd have to store the vectorAction values and apply them in FixedUpdate.
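A rough Python simulation of that store-and-reapply pattern might look like the following. It does not use the Unity API: `run_agent` and `policy` are hypothetical stand-ins for the FixedUpdate loop and for whatever RequestDecision would return, and the action is reduced to a single float.

```python
from typing import Callable, List


def run_agent(total_steps: int, interval: int, policy: Callable[[int], float]) -> List[float]:
    """Simulate FixedUpdate calls, requesting a new decision every `interval` steps."""
    applied = []
    stored_action = 0.0
    for step in range(total_steps):
        if step % interval == 0:
            # Stand-in for RequestDecision: fetch and store a fresh action.
            stored_action = policy(step)
        # Stand-in for applying the stored vectorAction in FixedUpdate.
        applied.append(stored_action)
    return applied


# Hypothetical policy that simply returns the step number as the action.
actions = run_agent(total_steps=10, interval=5, policy=lambda step: float(step))
print(actions)
```

Each requested action is applied on 5 consecutive steps, which is the same behavior as a decision frequency of 5.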
I think it's an interesting possibility to make the decision frequency itself part of the agent's action space this way.
- vectorAction value n, round and clamp it to a range of e.g. 1 - 9
- FixedUpdate to call RequestDecision every nth time

I was playing around with this and sometimes saw n converging towards a value associated with the highest reward. Might be helpful in situations where you'd want an adaptive decision frequency.
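As a sketch of the adaptive variant, the Python simulation below rounds and clamps an action value to the range 1 to 9 and uses it as the number of steps until the next request. The names `action_to_interval`, `run_adaptive` and `policy` are hypothetical, and the policy here is a fixed stand-in rather than a trained brain.

```python
from typing import Callable, List


def action_to_interval(raw_action: float, low: int = 1, high: int = 9) -> int:
    """Round the raw action value and clamp it to [low, high]."""
    return max(low, min(high, int(round(raw_action))))


def run_adaptive(total_steps: int, policy: Callable[[int], float]) -> List[int]:
    """Return the steps at which a decision is requested."""
    request_steps = []
    steps_until_request = 0
    for step in range(total_steps):
        if steps_until_request == 0:
            # Stand-in for RequestDecision: the action carries the next interval n.
            n = action_to_interval(policy(step))
            request_steps.append(step)
            steps_until_request = n
        steps_until_request -= 1
    return request_steps


# Hypothetical policy: asks for about 3 steps at first, then an out-of-range 12.
requests = run_adaptive(total_steps=20, policy=lambda step: 2.6 if step < 6 else 12.0)
print(requests)  # [0, 3, 6, 15]
```

The out-of-range value 12 is clamped to 9, so the request after step 6 happens at step 15.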
Thank you for the discussion. We are closing this issue due to inactivity. Feel free to reopen it if you'd like to continue the discussion though.