An incredibly bizarre bug which has taken ages to reduce to a simple example!
Normally a white-space character placed after a line continuation character will cause cylc validate to fail for example:
[scheduling]
[[dependencies]]
graph = """ # note the whitespace after the \
foo \
=> bar
"""
Make sure your text editor doesn't automatically strip trailing white-space on save
However an utterly bizarre combination of factors can cause a malformed line continuation character to pass validation:
[scheduling]
initial cycle point = 20000101T06
final cycle point = 20010101T18
[[dependencies]]
[[[ T00 ]]]
# note the whitespace after the \
graph = """
foo | bar \
=> baz & qux
pub
"""
[[[ T12 ]]]
graph = """
qux
baz
"""
This suite will validate OK and run normally until the suite suddenly falls over with cryptic traceback:
File "cylc/lib/cylc/scheduler.py", line 234, in start
self.run()
File "cylc/lib/cylc/scheduler.py", line 1385, in run
if self.should_process_tasks():
File "cylc/lib/cylc/scheduler.py", line 1519, in should_process_tasks
if itask.is_ready(now):
File "cylc/lib/cylc/task_proxy.py", line 358, in is_ready
return not (self.is_waiting_clock(now) or self.is_waiting_prereqs())
File "cylc/lib/cylc/task_proxy.py", line 393, in is_waiting_prereqs
any(not pre.is_satisfied() for pre in self.state.prerequisites) or
File "cylc/lib/cylc/task_proxy.py", line 393, in <genexpr>
any(not pre.is_satisfied() for pre in self.state.prerequisites) or
File "cylc/lib/cylc/prerequisite.py", line 174, in is_satisfied
self._all_satisfied = self._conditional_is_satisfied()
File "cylc/lib/cylc/prerequisite.py", line 194, in _conditional_is_satisfied
'"%s"' % self.get_raw_conditional_expression())
TriggerExpressionError: '"foo.20000102T0000+01 succeeded|bar:succeed\\"'
The traceback occurs because the trigger expression is malformed in the graph parser, later at runtime when the expression is evaluated all hell breaks loose.
This is a nasty one on the user side as it's pretty much impossible to debug. What really confuses me is the role of cycling in this bug, try changing the time at the initial cycle point or combine the T12 and T00 recurrences, it should then fail validation.
Weird. :grimacing:
Most likely because parsec.fileparse._concatenate is not handling trailing spaces after continuation line properly. Should we:
I would suggest that raising an error is probably safer but would result in a lot of people having to modify previously valid suites in order to get them to run.
@oliver-sanders So which way should we go? I am leaning towards the forgiving solution - to cause the least trouble for users.
If we went the "forgiving" route where would we be stripping trailing whitespace from?
I think we shouldn't strip it from *script (and possibly also [environment]*) settings?
I suppose we can probably get rid of anything that matches \\*\s*\n somewhere in the graph parser?
Scope of this bug may extend beyond the graph, also for the sake of consistency it wouldn't be great if you can "get away" with sloppiness one place in the file but not in another...
Whitespace trailing a backslash is always a bug, isn't it? If so, I vote raise an error. It is an easy thing for users to fix. And if they made an error, they shouldn't complain to us about having to fix it!
Silently ignore trailing spaces after continuation lines?
The version that just lops off a single trailing whitespace char
Raise an error on lines with trailing spaces after continuation lines?
I suppose we can probably get rid of anything that matches
\\*\s*\nsomewhere in the graph parser?
\n is not actually being kept for very long after the files have been loaded: The lines are assigned to a list but rstrip('\n') is used immediately.
Most helpful comment
Whitespace trailing a backslash is always a bug, isn't it? If so, I vote raise an error. It is an easy thing for users to fix. And if they made an error, they shouldn't complain to us about having to fix it!