After discussion on Riot:
We should consider re-aligning max active cycle points to match the SoD implementation (and the configuration name).
Context:
The runahead limit and max active cycle points configurations both limit the span between the oldest and youngest active cycle points.
From the docs for max active cycle points:
It allows up to N (default 3) consecutive cycle points to be active at any time, adjusted up if necessary for any future triggering.
The key word here is consecutive.
So runahead limit and max active cycle points actually both do the same thing, just that one uses integer semantics and the other, date-time.
Issues:
max active cycle points doesn't gel well with SoD. [HO: triggering a reflow in past cycles will completely stop new cycles, depending on the limit; same goes for SoS really, if you insert-and-trigger an old task]max active cycle points is misleading since it's nothing to do with the number of active cycle points.Questions:
max active cycle points configuration in with runahead limit?runahead limit = P1Drunahead limit = P3 (equivalent to [current consecutive] max active cycle points = 3)max active cycle points? yesQuestions
The use case for non-consecutive max active cycle points is pretty general, it seems to me: we should be aiming for maximum concurrency within prescribed limits; and why would we stop two tasks from running concurrently when they are (say) 3 cycle points apart but allow them to do it if (say) 2 cycle points apart?
If we allow non-consecutive max active cycle points, should we sort active and waiting(runahead) tasks into cycle point order and prioritize older points? (Older would tend to allow a reflow to get done first, and if you trigger a reflow in the first place that is probably what you want?)
should we sort active and waiting(runahead) tasks into cycle point order and prioritize older points
I would have thought so.
There's also the question of defaults, personally I would prefer runahead limit and max active cycle points to default to None. Unfortunately that might not be advisable due to graphs like this:
c[-P1] => a
@clock => a
a & b => c => d & e
If both runahead limit and max active cycle points were set to None then b would attempt run in every cycle simultaneously? To me this makes sense, is easier to explain and is a simple case of "you got what you asked for", however, for historical reasons, the change might be dangerous at this stage.
Perhaps safe defaults might be:
runahead limit = P5max active cycle points = NoneAgreed in video call, for now lets go with the following safe defaults and perhaps reconsider later:
runahead limit = P5max active cycle points = NoneWe should document this "safety feature" and make it clear what the alternatives are.
Post a new issue to consider removing the default limit. (cons? xtriggers on fast spawn-ahead tasks?)
What should happen to max active cycle points in the meantime while the behaviour is moved to runahead limit? Should it be left in the config spec with a todo note?
@MetRonnie I don't think we decided to remove max active cycle points which seems to be what your question is assuming?
We just decided to distinguish it from runahead limit so they become (as originally intended) two different ways of limiting the runahead, instead of a single way with two different names.
The idea is you would use one or the other, in a workflow. Setting to None (if I understand our discussion above, in retropect) equates to "no limiting".
I guess we haven't decided what to do if a user sets both limits at the same time. Probably just fail validation.
I think Ronnie is asking what we should do with the max active cycle points configuration item until we've got a nonconsecutive implementation (since the current functionality will be configured by runahead limit).
I think it's fine to remove/comment it for now pending re-implementation but it doesn't really matter much until 8.0.0.
Ah, OK my bad :confounded: Also think we could just leave it as-is until the proper max active is implemented (which is probably much easier than runahead limit since you can simply count active cycles).
There is currently a backwards compatibility in runahead limit where if you set it to a number n, it takes that to mean PTnH. I'm guessing we should remove that and not allow numbers anymore? We could take the number to mean Pn instead, but then that's a bit of a subtle change that could confuse users?
There is currently a backwards compatibility in runahead limit where if you set it to a number n, it takes that to mean PTnH
Wow, the more you look into things, sounds like Cylc5 legacy to me 馃あ.
Looks like this may be undocumented, however, will have to check if it is used 馃槧. Might be worth checking that this functionality still works before worrying about it.
It should be pretty easy to continue to support this:
^P[\d]+$ - integer cycles^P.*[TYMDW-].*$ - time interval^\d+$ - time interval (translate to PT<x>H)We would document it as deprecated, however, if possible, yes, it would be best to remove this support.
It's just 2-3 lines of code https://github.com/cylc/cylc-flow/blob/1035c4e92f123db96e7130b64e89c7b13542a4fe/cylc/flow/config.py#L1151-L1157
I've just gone through MO suites and found ~a few~ a lot of cases of this old integer format, as it's such a simple translation I think we should probably continue to support it but document it as deprecated and log a deprecation warning when used.
[edit] I found a total of 145 suites using this format (without having to go to source repositories), quite a few were setting runahead limit = 0 which is curious, most of the remainder were set to either 1, 24 or 48.
I'm not sure if I've understood this correctly.
The use case for non-consecutive
max active cycle pointsis pretty general, it seems to me: we should be aiming for maximum concurrency within prescribed limits; and why would we stop two tasks from running concurrently when they are (say) 3 cycle points apart but allow them to do it if (say) 2 cycle points apart?
Say you have the workflow
[scheduling]
cycling mode = integer
max active cycle points = 2
[[graph]]
P3 = foo
Are foo.1 and foo.4, which are 3 cycle points apart, not consecutive? Running the workflow will spawn foo.1 and foo.4, wait for them to finish, then spawn foo.7 and foo.10, etc.
Are foo.1 and foo.4, which are 3 cycle points apart, not consecutive?
These would be considered "consecutive" cycles (because there is nothing in-between).
With the renaming of the current (consecutive) max active cycle points to runahead limit we aren't expecting a change in behaviour:
[scheduling]
cycling mode = gregorian
# before
runahead limit = PT1D
# after
runahead limit = PT1D
[scheduling]
cycling mode = <whatever>
# before
max active cycle points = 2
# after
runahead limit = P2
Though good point, P2 would be very confusing in this example as P2 refers to a duration of two. Perhaps this should be 2 rather than P2?
Though good point, P2 would be very confusing in this example as P2 refers to a duration of two. Perhaps this should be 2 rather than P2?
The trouble is that would break backwards compatibility for being able to specify a raw number of hours. However I'm starting to realise/agree that Pn isn't a good way to specify the limit.
I'm trying to understand the difference between consecutive and non-consecutive implementations of max active cycle points.
[scheduling]
cycling mode = integer
max active cycle points = 3
[[graph]]
P1 = foo & good => bar
[runtime]
[[foo]]
script = if [[ "$CYLC_TASK_JOB" == '1/foo/01' ]]; then false; else true; fi
(In this workflow foo fails on the first cycle point only)
With the current consecutive implementation, this workflow has foo.1 fail leaving bar.1 waiting, but 2 and 3 succeed. But then 4 or higher are not spawned and the workflow stalls.
I take it, with a non-consecutive implementation, after 2 and 3 succeed, 4 and 5 would spawn, then after they succeed 6 and 7 would spawn etc? Is that understanding correct?
I take it, with a non-consecutive implementation, after 2 and 3 succeed, 4 and 5 would spawn, then after they succeed 6 and 7 would spawn etc? Is that understanding correct?
Yep.
https://github.com/cylc/cylc-flow/pull/3848#discussion_r498217790
The new non-consecutive implementation (TBD) needs to have a new name (not max active cycle points) because many existing flows may depend on the current consecutive implementation.
Maybe maximum active cycle points?
active cycle point limit?
Most helpful comment
active cycle point limit?