[UPDATE: THIS EXAMPLE SUITE AND TRACEBACK IS WRONG - SEE BELOW]
[cylc]
cycle point format = %Y-%m-%dT%H:%MZ # ERROR
#cycle point format = %Y%m%dT%H%M # OK
[scheduling]
initial cycle point = 2020
[[dependencies]]
[[[T00]]]
graph = "foo"
Here's the error:
$ cylc run --no-detach five
._.
| | The Cylc Suite Engine [8.0a0]
._____._. ._| |_____. Copyright (C) 2008-2019 NIWA
| .___| | | | | .___| & British Crown (Met Office) & Contributors.
| !___| !_! | | !___. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
!_____!___. |_!_____! This program comes with ABSOLUTELY NO WARRANTY.
.___! | It is free software, you are welcome to
!_____! redistribute it under certain conditions;
see `COPYING' in the Cylc source distribution.
2019-09-16T11:26:14+12:00 WARNING - deprecated graph items were automatically upgraded in 'suite definition':
2019-09-16T11:26:14+12:00 WARNING - * (8.0.0) [scheduling][dependencies][X][graph] -> [scheduling][graph][X] - for X in:
T00
2019-09-16T11:26:14+12:00 INFO - Suite server: url=tcp://niwa-1007885.niwa.local:43082/ pid=13249
2019-09-16T11:26:14+12:00 INFO - Run: (re)start=0 log=1
2019-09-16T11:26:14+12:00 INFO - Cylc version: 8.0a0
2019-09-16T11:26:14+12:00 INFO - Run mode: live
2019-09-16T11:26:14+12:00 INFO - Initial point: 2020-01-01T00:00Z
2019-09-16T11:26:14+12:00 INFO - Final point: None
2019-09-16T11:26:14+12:00 INFO - Cold Start 2020-01-01T00:00Z
2019-09-16T11:26:14+12:00 ERROR - R/2020-01-01T12:00Z/P1D, point format %Y-%m-%dT%H:%MZ: equal adjacent points: 2020-01-01T12:00Z => 2020-01-01T12:00Z.
Traceback (most recent call last):
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/scheduler.py", line 253, in start
self.run()
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/scheduler.py", line 1523, in run
if self.pool.release_runahead_tasks():
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/task_pool.py", line 284, in release_runahead_tasks
point = sequence.get_next_point(point)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/cycling/iso8601.py", line 544, in get_next_point
self._check_and_cache_next_point(point, next_point)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/cycling/iso8601.py", line 554, in _check_and_cache_next_point
next_point, point
cylc.flow.exceptions.SequenceDegenerateError: R/2020-01-01T12:00Z/P1D, point format %Y-%m-%dT%H:%MZ: equal adjacent points: 2020-01-01T12:00Z => 2020-01-01T12:00Z.
2019-09-16T11:26:14+12:00 CRITICAL - Suite shutting down - R/2020-01-01T12:00Z/P1D, point format %Y-%m-%dT%H:%MZ: equal adjacent points: 2020-01-01T12:00Z => 2020-01-01T12:00Z.
2019-09-16T11:26:15+12:00 INFO - DONE
SequenceDegenerateError: R/2020-01-01T12:00Z/P1D, point format %Y-%m-%dT%H:%MZ: equal adjacent points: 2020-01-01T12:00Z => 2020-01-01T12:00Z.
@dwsutherland the following does not set sub_num if submit_num is None:
https://github.com/cylc/cylc-flow/blob/master/cylc/flow/job_pool.py#L186-L188
The fact that this only seems to be triggered by a (valid) cycle point format with a : between hours and minutes suggests another error somewhere in splitting an ID perhaps?
The ID containing date is parsed for state separated by :
https://github.com/cylc/cylc-flow/blob/5d4db47dfe437397a38bdecfe715110f1bdccadf/cylc/flow/job_pool.py#L170-L173
Also in the GraphQL schema:
https://github.com/cylc/cylc-flow/blob/5d4db47dfe437397a38bdecfe715110f1bdccadf/cylc/flow/network/schema.py#L82-L85
https://github.com/cylc/cylc-flow/blob/5d4db47dfe437397a38bdecfe715110f1bdccadf/cylc/flow/network/schema.py#L118-L121
Same issue with the old/original one I think:
https://github.com/cylc/cylc-flow/blob/5d4db47dfe437397a38bdecfe715110f1bdccadf/cylc/flow/task_pool.py#L1345-L1348
But maybe only used for external input? so would still be a bug for CLI
We may have to chose a different separator for the CLI version (in addition to GraphQL API related)... as name.point:state wouldn't work with that format also
Simplest problem first, this code https://github.com/cylc/cylc-flow/blob/master/cylc/flow/job_pool.py#L186-L188 has special handling to set sub_num for the "not None" case, but does not set a value at all for the "None" case - hence the traceback. So that could be fixed regardless of solving the separator issue. Or if it should never be possible to get None in this function (maybe not, as every job has a submit number?) then the code should reflect that (e.g. don't need the if not None test).
Reproduced here as well.
My opinion is that we should avoid : in cycle point, because the cycle point is part of the path to ~/cylc-run/SUITE/log/job/CYCLE/TASK/SUBMIT_NUM/job, etc. With : in CYCLE, it can affect interpretation of variables such as PATH, which uses : as separator (in Unix-like OSes).
@matthewrmshin - you're right about PATH, but a) YYYY-MM-DDThh:mm is a valid ISO date-time format that we accept; b) it works as far as cylc is concerned - except now on master; and c) I've been known to use it myself if only for demo purposes because it is a more readable and recognizable date-time format if you need to go all the way to minutes.
That said, I'm not entirely averse to banning it, but I suppose we had better be reasonably sure that no one is already using it in real suites.
I agree. Banning it would be a shame, as it is clear and legitimate ISO8601 syntax. In which case, we should definitely fix this, and perhaps open a follow-on issue to improve the system for the Unix PATH problem.
Simplest problem first, this code https://github.com/cylc/cylc-flow/blob/master/cylc/flow/job_pool.py#L186-L188 has special handling to set
sub_numfor the "not None" case, but does not set a value at all for the "None" case - hence the traceback. So that could be fixed regardless of solving the separator issue. Or if it should never be possible to get None in this function (maybe not, as every job has a submit number?) then the code should reflect that (e.g. don't need theif not Nonetest).
@hjoliver - Please read the above comment again, and consider what happens when passing:
2020-01-01T00:00Z/foo/01
into
if ":" in item:
head, state_str = item.rsplit(":", 1)
else:
head, state_str = (item, None)
You get an ID of 2020-01-01T00, a state of 00Z/foo/01, and [hence] a submit number None.
So the problem is bigger than job_pool.py ..The colon in the cycle point format breaks the GraphQL API, and the CLI (and perhaps PATH?)..
So if we want : in the cycle point format, then we need to at least consider changing it as a delimiter of ID and state.
@dwsutherland - sorry, I wrote the above in a bit of a hurry, and the initial description as it turns out (see below!).
I certainly agree on the wider implications (separator) and I commented to that effect. But my point just above was that the following code construct is inherently wrong regardless of that:
if foo is not None:
sub_num = whatever
return sub_num
if sub_num is not defined higher up. The test implies foo can be None (else no need to test) in which case sub_num is undefined.
I can now see (sorry, I had a crazy busy day!) that I cut-n-pasted the wrong traceback into the description. I actually originally got an error due to sub_num being undefined.
Here's the correct traceback:
$ cylc run --no-detach bar
._.
| | The Cylc Suite Engine [8.0a0]
._____._. ._| |_____. Copyright (C) 2008-2019 NIWA
| .___| | | | | .___| & British Crown (Met Office) & Contributors.
| !___| !_! | | !___. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
!_____!___. |_!_____! This program comes with ABSOLUTELY NO WARRANTY.
.___! | It is free software, you are welcome to
!_____! redistribute it under certain conditions;
see `COPYING' in the Cylc source distribution.
2019-09-16T22:02:13+12:00 INFO - Suite server: url=tcp://localhost:43034/ pid=18186
2019-09-16T22:02:13+12:00 INFO - Run: (re)start=0 log=1
2019-09-16T22:02:13+12:00 INFO - Cylc version: 8.0a0
2019-09-16T22:02:13+12:00 INFO - Run mode: live
2019-09-16T22:02:13+12:00 INFO - Initial point: 2020-01-01T00:00
2019-09-16T22:02:13+12:00 INFO - Final point: None
2019-09-16T22:02:13+12:00 INFO - Cold Start 2020-01-01T00:00
2019-09-16T22:02:13+12:00 INFO - [foo.2020-01-01T00:00] -submit-num=01, owner@host=localhost
2019-09-16T22:02:13+12:00 INFO - [foo.2020-01-01T00:00] -triggered off []
2019-09-16T22:02:14+12:00 ERROR - local variable 'sub_num' referenced before assignment
Traceback (most recent call last):
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/scheduler.py", line 253, in start
self.run()
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/scheduler.py", line 1526, in run
self.proc_pool.process()
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/subprocpool.py", line 182, in process
self._proc_exit(proc, "", ctx, callback, callback_args)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/subprocpool.py", line 173, in _proc_exit
self._run_command_exit(ctx, callback, callback_args)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/subprocpool.py", line 368, in _run_command_exit
callback(ctx, *callback_args)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/task_job_mgr.py", line 707, in _submit_task_jobs_callback
{self.batch_sys_mgr.OUT_PREFIX_COMMAND: self._job_cmd_out_callback}
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/task_job_mgr.py", line 521, in _manip_task_jobs_callback
callback(suite, itask, ctx, line)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/task_job_mgr.py", line 732, in _submit_task_job_callback
self.job_pool.set_job_attr(job_d, 'batch_sys_job_id', items[3])
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/job_pool.py", line 129, in set_job_attr
point, name, sub_num, _ = self.parse_job_item(job_d)
File "/home/oliverh/cylc/cylc-flow/venv/lib/python3.7/site-packages/cylc/flow/job_pool.py", line 188, in parse_job_item
return (point_str, name_str, sub_num, state_str)
UnboundLocalError: local variable 'sub_num' referenced before assignment
2019-09-16T22:02:14+12:00 CRITICAL - Suite shutting down - local variable 'sub_num' referenced before assignment
And the correct example suite (putting Z on the end of the format resulted in the degenerate cycle points error in the description, which is not what I was trying to highlight):
[cylc]
cycle point format = %Y-%m-%dT%H:%M # ERROR
#cycle point format = %Y%m%dT%H%M # OK
[scheduling]
initial cycle point = 2020
[[dependencies]]
[[[T00]]]
graph = "foo"
I understand that sub_num shouldn't ever be None, and the code:
if foo is not None:
sub_num = whatever
return sub_num
needs revised, but it is a mere symptom of the problem..
It results from the : in the cycle point which breaks the formats:
point/name/sub_num:state
point/name:state
name.point:state
Which is parsed in number of places:
CLI (via task_pool.py method)
GraphQL API (via schema.py)
and in the job_pool.py which may not even need to parse for state.. so to fix the job pool method we could probably just remove the block:
if ":" in item:
head, state_str = item.rsplit(":", 1)
else:
head, state_str = (item, None)
But the CLI and GraphQL API will still be broken with : in the cycle point format.
Well we entirely agree then (see my final sentence in the description, after all) - I think you may be somewhat over-interpreting my minor point about the sub_num code block :grin: I was just commenting on that because it was the direct cause of the original error I got (if I had posted the correct traceback in the first place, we might have saved ourselves this whole conversation :grin: )
To fix, I guess we'll need to have an extra if - so if state_str is not in the list of valid task states, we should use the head, state_str = (item, None) logic (like in the else clause)?
If we will need to revisit the separator, perhaps we can simply treat the information as query parameters and URL encode everything for the transport layer? (so following RFC 3986 and with native support in Python, avoiding the need to write any code for this? I
So instead of 2020-01-01T00:00Z/foo/01, have cyclepoint=2020-01-01T00:00Z&task=foo&submitNum=01. Which could be URL encoded ascyclepoint%3D2020-01-01T00%3A00Z%26id%3Dfoo%2F01`. Less human friendly, but avoids having to parse it on the client.
And then the separator is defined only on client layer? Not sure if that makes sense as I am not completely aware of how/where/why/etc this separator is used (except I see this on the GraphQL response). But if that's possible, at least we would be using something natively supported by Python, and avoiding the need to write code to handle this...
So far I think we haven't had the need to parse it, as the same information is present in the rest of the response. The only place I know - off the top of my head - that this is used, is as ID of each HTML element created by Cylc UI for the tree component I think.
To fix, I guess we'll need to have an extra
if- so ifstate_stris not in the list of valid task states, we should use thehead, state_str = (item, None)logic (like in theelseclause)?
Yes that would work, since it's a right split.. We can apply that to schema.py functions and task_pool.py method...
Again, for the job_pool.py method I don't think we need to parse for state (I think!)..
So instead of 2020-01-01T00:00Z/foo/01, have cyclepoint=2020-01-01T00:00Z&id=foo/01. Which could be URL encoded as cyclepoint%3D2020-01-01T00%3A00Z%26id%3Dfoo%2F01. Less human friendly, but avoids having to parse it on the client.
I think the parsing of this format is always server-side, but it can be provided in mutations & queries client side (owner|workflow|cycle|name|sub_num:status).
To fix, I guess we'll need to have an extra
if- so ifstate_stris not in the list of valid task states, we should use thehead, state_str = (item, None)logic (like in theelseclause)?
The other thing to watch out for .. If it's not a valid state and we set it to None, then for cylc trigger that would actually trigger all foo instead of the intended (i.e miss spelt) state.. (which is not a desirable behaviour)
Which actually means.. we may have to choose a different delimiter..
@kinow The separator is currently used by various CLI commands. E.g.: you can do:
cylc trigger mysuite 'FAM1.20190916T1200Z:failed'
which will trigger all instances of failed tasks in FAM1 family in the 20190916T1200Z cycle.
The colon was used to look a bit like the syntax in suite.rc graphs, e.g. FAM1:failed => task2. I guess we (probably me) must have forgotten about the potential usage of : in time strings.
Thanks @matthewrmshin and @dwsutherland for the background on the issue :+1: will keep lurking here in the discussion :eyes:
You should be able to do this:
$ cylc trigger baz foo.201909*T00:00Z<delim>failed
and it target those failed tasks, but this:
$ cylc trigger baz foo.201909*T00:00Z<delim>asdfg
or
$ cylc trigger baz foo.201909*T0000Z<delim>asdfg
and it target no tasks (not all), also this:
$ cylc trigger baz foo.201909*T00:00Z
and it target all foo in the month.
Hence can't use : for state delimiter, and it also be in datetime.
Moved to next release after discussion on Riot, as this issue needs more discussion and we won't have time to finish it for 8.0a1.
job_pool.py method.. This is a fix for only one aspect of the problem of : in the datetime.For curiosity I tried the suite (without the Z at the end of the cycle point format), on master, and it worked fine. Is this issue still a problem?
Started jupyterhub, and can successfully visualize the workflow with no issues (no tracebacks anywhere at least).


I problem may still arise if the ID arg is parsed on a root query:
if ':' in item:
head, state = item.rsplit(':', 1)
else:
head, state = (item, None)
But that problem already exists with CLI interactions with the task pool:
https://github.com/cylc/cylc-flow/blob/5d4db47dfe437397a38bdecfe715110f1bdccadf/cylc/flow/task_pool.py#L1345-L1348
But maybe the problem in the cyclepoint got fixed in some other PR/commit?
What I'm saying is; the use of : in the API/CLI args for delimiter of state is still an issue:
https://github.com/cylc/cylc-flow/issues/3369#issuecomment-532520080
Potentially resolved by a downstream PR of #3931
Update:
task.cycle:failed.workflow*glob:running (https://github.com/cylc/cylc-flow/issues/3592)Suggest creating a validator for the cycle point format:
(note subclasses are auto-documenting, use .. autoclass:: in cylc-doc)
Note that at least one test uses a colon in the cycle point format which will have to be changed.
@dwsutherland are you working on this one? We should be able to make progress before the universal ID stuff happens.
Most helpful comment
You should be able to do this:
$ cylc trigger baz foo.201909*T00:00Z<delim>failedand it target those failed tasks, but this:
$ cylc trigger baz foo.201909*T00:00Z<delim>asdfgor
$ cylc trigger baz foo.201909*T0000Z<delim>asdfgand it target no tasks (not all), also this:
$ cylc trigger baz foo.201909*T00:00Zand it target all
fooin the month.Hence can't use
:for state delimiter, and it also be in datetime.