Cylc-flow: cli: play|pause|stop

Created on 27 Oct 2020 · 35 Comments · Source: cylc/cylc-flow

Rename the existing workflow lifecycle commands to match the rose suite-run migration proposal and implement the other CLI changes.

I.e. do what it says in the proposal under the following sections:

- cylc play
  - With the exception of the --re-run option (#3898)
- cylc pause
- cylc release

oliver-sanders

All 35 comments

Not sure why this didn't occur to us before, but cylc resume makes more sense than release, as a counterpart to cylc pause.

cylc hold and cylc release for tasks
cylc pause and cylc resume for the flow

hjoliver on 31 Oct 2020

Update: the proposal is now a little out of date. Here is the current proposal for cylc play:

Rename cylc run => cylc play
Remove the START_POINT optional positional argument.
Start the flow in "restart" mode if the database is present & delete the cylc restart command.
Unpause the flow if already running and exit(0).

There are two loose ends to tie up:

1) The proposal suggests erroring if a flow is started that has already run to completion.

It is tricky to determine whether a flow has run to completion or not. The simplest solution is just to restart the flow anyway, if it has already run to completion it will just shut down.
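
As a rough illustration of the proposed `cylc play` behaviour listed above, here is a minimal sketch. The names `PlayAction` and `decide_play_action` are hypothetical and not part of cylc-flow; the two boolean inputs stand in for however the real code would detect a running scheduler and an existing run database.

```python
from enum import Enum


class PlayAction(Enum):
    RESUME = "resume"      # flow already running: un-pause it and exit 0
    RESTART = "restart"    # run database present: start in restart mode
    COLD_START = "start"   # no database: start from scratch


def decide_play_action(is_running: bool, db_exists: bool) -> PlayAction:
    """Pick what `cylc play` should do (hypothetical helper)."""
    if is_running:
        return PlayAction.RESUME
    if db_exists:
        # Restart even if the flow already ran to completion: it will
        # simply shut down again, so no completion check is made here.
        return PlayAction.RESTART
    return PlayAction.COLD_START
```

Checking for a running flow first means the database test only matters for stopped flows, which matches the order of the bullet points above.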

2) --ignore-*-cycle-point

`cylc run` has these four options:

```
--initial-cycle-point
--final-cycle-point
--start-cycle-point
--stop-cycle-point
```

And `cylc restart` has these corresponding options:

```
--ignore-initial-cycle-point
--ignore-final-cycle-point
--ignore-start-cycle-point
--ignore-stop-cycle-point
```

Suggest removing the `--ignore*` opts and instead using `--*-cycle-point=ignore` (this is safe as it will fail for a cold-start as it is an invalid cycle point). Setting the `--final-cycle-point` on a restart should implicitly ignore the old one.

Would also suggest rationalising the many aliases for the initial and final cycle point:

```
--initial-cycle-point, --icp
--final-cycle-point, --fcp
--start-cycle-point, --startcp
--stop-cycle-point, --stopcp
```

oliver-sanders on 6 Jan 2021
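
One way to picture the rationalised option set from the comment above is one long name plus one short alias per option. The sketch below uses Python's standard `argparse` purely for illustration; it is an assumption that this resembles the real parser, and `build_parser` and the `dest` names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with exactly two spellings per cycle point option."""
    # allow_abbrev=False so that only the listed spellings are accepted.
    parser = argparse.ArgumentParser(prog="cylc play", allow_abbrev=False)
    parser.add_argument(
        "--initial-cycle-point", "--icp", dest="icp", metavar="POINT")
    parser.add_argument(
        "--final-cycle-point", "--fcp", dest="fcp", metavar="POINT")
    parser.add_argument(
        "--start-cycle-point", "--startcp", dest="startcp", metavar="POINT")
    parser.add_argument(
        "--stop-cycle-point", "--stopcp", dest="stopcp", metavar="POINT")
    return parser
```

With this layout, older spellings such as `--until` or `--final-point` are rejected as unknown options rather than silently accepted.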

Do we want to keep an alias run -> play or restart -> play?

MetRonnie on 14 Jan 2021

Nope, but we should add a "dead end" (see cylc.flow.scripts.cylc)

oliver-sanders on 14 Jan 2021
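
A "dead end" in the sense of the comment above could be as simple as a lookup table that turns a removed command into a helpful error. This is a minimal sketch, not the code in cylc.flow.scripts.cylc; `REMOVED_COMMANDS`, `check_dead_end` and the message wording are all hypothetical.

```python
import sys

# Hypothetical table: removed command name -> hint shown to the user.
REMOVED_COMMANDS = {
    "run": "'cylc run' has been removed, use 'cylc play' instead.",
    "restart": "'cylc restart' has been removed, use 'cylc play' instead.",
}


def check_dead_end(command: str) -> None:
    """Exit with a hint if `command` is a removed command, else return."""
    if command in REMOVED_COMMANDS:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(REMOVED_COMMANDS[command])
```

Unlike an alias, this keeps the old names from working while still telling users where to go.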

What happens to warm/cold starts? Am I right in thinking the concept of warm start disappears?

Also, should the alias start -> run be updated to start -> play or removed?

MetRonnie on 14 Jan 2021

The desire is to get rid of warm starts, however, I'm not sure that we can do that right now.

I think we should get rid of the aliases.

oliver-sanders on 14 Jan 2021

Just to note down some further info Oliver sent me:

The hold/release commands do two things: they can hold/release a flow OR tasks within a flow. These seem similar but are actually very different concepts. Holding/releasing a flow sets the state of the task pool. Holding/releasing a task adds/removes xtriggers on tasks (currently implemented a little differently).

The idea is to pull the flow and task features apart.

play/pause (flow state)

hold/release (task state)

MetRonnie on 15 Jan 2021

What's the difference between

cylc run --warm SUITE [START_CYCLE_POINT] and
cylc run SUITE --startcp=START_CYCLE_POINT?

From looking at the code & docs, they seem to be identical, in which case the --warm option can be dropped?

MetRonnie on 15 Jan 2021

Yep. In fact the command help says --startcp "implies --warm". The term "warm start" comes from the modeling terminology for warm starting a cycling model, of course. But in terms of Cylc and the graph, "start cycle point" probably is a better term - more generic, and descriptive of what Cylc is actually doing. (Note also spawn-on-demand will make warm start obsolete, except perhaps as a convenience where multiple tasks have to be triggered at once to start the flow).

hjoliver on 17 Jan 2021

It's a much nicer interface, should we drop the --warm option and move the warm documentation onto the --startcp argument?

oliver-sanders on 18 Jan 2021

Ah ok, so the concept of warm start stays but the cli option goes

MetRonnie on 18 Jan 2021

> Would also suggest rationalising the many aliases for the initial and final cycle point

Does that mean I should get rid of:

--ict (icp)
--initial-point (icp)
--until (fcp)
--final-point (fcp)
--start-point (startcp)
--stop-point (stopcp)

MetRonnie on 22 Jan 2021

(Although from the changelog, --until was only just added in 8.0a1!)

MetRonnie on 22 Jan 2021

--until goes back a while, the changelog says:

--until=POINT option is now an alias for --final-cycle-point=POINT option.

oliver-sanders on 22 Jan 2021

Should it be possible to hold tasks after a certain point, like it is possible to hold/pause a workflow after a certain point?

Also, what are some good concrete example(s) of cylc hold to put in the docstring, rather than cylc hold REG TASK_GLOBS ... which is identical to the ARGS?

MetRonnie on 25 Jan 2021

> Should it be possible to hold tasks after a certain point

This is the --after option?

Yep we want to keep that although it should be an argument of cylc pause rather than cylc hold because it is fiddling with task_pool state rather than that of individual tasks.

I've just noticed that the documentation for cylc hold and the hold mutation do not match. It is definitely a cycle point not a wallclock time (implemented in cylc.flow.scheduler.Scheduler.hold_suite).

> Also, what are some good concrete example(s) of cylc hold to put in the docstring

How about this:

# hold mytask at cycle 1234 in myflow
$ cylc hold myflow mytask.1234

# hold all tasks in cycle 1234 in myflow
$ cylc hold myflow '*.1234'

# hold all active instances of mytask in myflow
$ cylc hold myflow 'mytask.*'

oliver-sanders on 25 Jan 2021
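
To make the glob examples in the comment above concrete, here is a minimal sketch of how `NAME.POINT` task IDs could be matched against such patterns. It uses shell-style matching from Python's standard `fnmatch` module as a stand-in; whether Cylc matches task globs this way is an assumption, and `match_tasks` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import List


def match_tasks(pattern: str, task_ids: List[str]) -> List[str]:
    """Return the task IDs that match a shell-style glob, in input order."""
    return [task_id for task_id in task_ids if fnmatchcase(task_id, pattern)]


# Hypothetical pool of active tasks in "myflow".
pool = ["mytask.1234", "other.1234", "mytask.1235"]

one_task = match_tasks("mytask.1234", pool)   # a single task
one_cycle = match_tasks("*.1234", pool)       # every task in cycle 1234
one_name = match_tasks("mytask.*", pool)      # every instance of mytask
```

The quotes around `'*.1234'` and `'mytask.*'` in the shell examples stop the shell from expanding the glob before the command sees it.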

If a workflow has some held tasks (but not all, i.e. it is not paused), should running cylc play release the held tasks or do nothing?

MetRonnie on 25 Jan 2021

Do nothing, cylc play shouldn't go messing with task states.

oliver-sanders on 25 Jan 2021

For cylc play, what should happen to --hold and --hold-point/--hold-after? Should they be replaced by --pause and --pause-point/--pause-after?

Also, in line with the other aliases, should --hold-point/--pause-point change to --hold-cycle-point/--pause-cycle-point and --holdcp/--pausecp?

MetRonnie on 26 Jan 2021

> If a workflow has some held tasks (but not all, i.e. it is not paused), should running cylc play release the held tasks or do nothing?

Also, if a workflow is paused, what should cylc release task.n do? Should it go ahead and release the task(s)? If so, wouldn't that imply the task pool is not held anymore, therefore the workflow is no longer paused?

MetRonnie on 26 Jan 2021

> if a workflow is paused, what should cylc release task.n do? Should it go ahead and release the task(s)?

It should release the task which shouldn't run because the scheduler is "held".

So the behaviour for the logical combinations of play/pause and hold/release would be:

$ cylc install x

# start a Scheduler pre-paused
# (a.1 won't run because Scheduler paused)
$ cylc play --pause x

# hold task a.1
# (a.1 won't run because Scheduler paused and task held)
$ cylc hold x a.1

# un-pause the Scheduler
# (a.1 won't run because task held)
$ cylc play x

# release task a.1
# (a.1 will run)
$ cylc release x a.1

However, as you've noticed, that's not how things work at the moment. For now it's a good start just to separate the interfaces; we can fix the internals later. However, if you fancy taking a deeper dive, it should be fairly straightforward to change the behaviour of the task_pool hold mechanism.

The tasks are actually submitted from somewhere in the Scheduler, that's the bit you will need to skip if Scheduler.pool.held is True. You may need to do something to get the workflow state set correctly, not sure how that gets done nowadays.

oliver-sanders on 26 Jan 2021
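
The intended separation in the comment above (pause is flow state, hold is task state) can be modelled with two independent pieces of state. This is a toy sketch only: the `Scheduler` class here is hypothetical and far simpler than cylc.flow.scheduler.Scheduler, and its method names just mirror the CLI commands.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Scheduler:
    """Toy model: one flow-level flag plus a set of held task IDs."""
    paused: bool = False
    held: Set[str] = field(default_factory=set)

    def play(self) -> None:
        self.paused = False          # never touches held tasks

    def pause(self) -> None:
        self.paused = True           # never touches held tasks

    def hold(self, task_id: str) -> None:
        self.held.add(task_id)

    def release(self, task_id: str) -> None:
        self.held.discard(task_id)   # never un-pauses the flow

    def can_run(self, task_id: str) -> bool:
        # A task runs only if the flow is not paused AND the task is not held.
        return not self.paused and task_id not in self.held


sched = Scheduler(paused=True)       # cylc play --pause x
blocked_by_pause = not sched.can_run("a.1")
sched.hold("a.1")                    # cylc hold x a.1
sched.play()                         # cylc play x
blocked_by_hold = not sched.can_run("a.1")
sched.release("a.1")                 # cylc release x a.1
runs_at_last = sched.can_run("a.1")
```

Because neither operation touches the other's state, releasing a task in a paused flow does not un-pause the flow, and playing a flow does not release held tasks.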

Another thing: I suppose flow.cylc[scheduling]hold after cycle point will have to be deprecated in favour of pause after cycle point

MetRonnie on 26 Jan 2021

> However, as you've noticed, that's not how things work at the moment.

Yeah, it's all done by holding tasks at the moment, needs to be fixed.

hjoliver on 27 Jan 2021

Question about final cycle point: say you have the workflow

[scheduling]
    cycling mode = integer
    final cycle point = 5
    [[graph]]
        P1 = foo

Then you run:

(1)
$ cylc run temp3 -n --fcp=2
INFO - Cold Start 1
...
INFO - Initial point: 1
INFO - Final point: 2
...
INFO - [foo.1] -triggered off []
INFO - [foo.2] -triggered off []
...
INFO - Suite shutting down - AUTOMATIC

(2)
$ cylc restart temp3 -n --ignore-final-cycle-point
INFO - LOADING suite parameters
...
INFO - Initial point: 1
INFO - Final point: 5
INFO - Suite shutting down - AUTOMATIC
INFO - DONE

In (2), nothing runs. Is this meant to happen?

Tested on master, d6a09baa93d386a41a7e781d9f372e1347f4d52a

P.S., in (2), in the snipped bit (...), I get this:

INFO - LOADING task action timers
INFO - + foo.1 poll_timer
WARNING - foo.1: task not found, skip
INFO - + foo.1 ['try_timers', 'submission-retry']
WARNING - foo.1: task not found, skip
INFO - + foo.1 ['try_timers', 'execution-retry']
WARNING - foo.1: task not found, skip
INFO - + foo.2 poll_timer
WARNING - foo.2: task not found, skip
INFO - + foo.2 ['try_timers', 'submission-retry']
WARNING - foo.2: task not found, skip
INFO - + foo.2 ['try_timers', 'execution-retry']
WARNING - foo.2: task not found, skip

Why so many warnings?

MetRonnie on 2 Feb 2021

In Cylc7 the workflow would pick up where it left off and continue to the configured FCP (i.e. 5). I think this is an SoD quirk perhaps caused by restarting with an empty task pool?

Think that should go up as a bug.

oliver-sanders on 2 Feb 2021

Ok, as discussed in our meeting, if that's a bug it can be fixed independently of #4040

MetRonnie on 2 Feb 2021

> In Cylc7 the workflow would pick up where it left off and continue to the configured FCP (i.e. 5). I think this is an SoD quirk perhaps caused by restarting with an empty task pool?
>
> Think that should go up as a bug.

I'm not so sure that's a bug! The workflow ran to completion at FCP=2. A restart starts from the previous recorded state, which was already "completed", so nothing will happen.

It makes some sense to restart with an extended FCP if you haven't already run to completion, but not if you have.

Also, what if you'd already run special tasks in the last cycle(s)? The only sensible way to do this kind of a "restart" would be to do warm start part-way through the new (extended) graph.

hjoliver on 3 Feb 2021

Ah sorry, I misread; I thought the flow had been stopped before the fcp. After the cylc play work is finished we would expect the "restart" to be rejected, since the flow has already run to completion, so I don't think this circumstance should be possible going forward?

oliver-sanders on 3 Feb 2021

I see, the real test of whether --fcp=ignore works is:

[scheduling]
    runahead limit = P1
    cycling mode = integer
    final cycle point = 5
    stop after cycle point = 2
    [[graph]]
        P1 = foo

Then

`$ cylc play temp3 -n --fcp=3` runs to cycle point 2 and shuts down
`$ cylc play temp3 -n --fcp=ignore` restarts at cycle point 3 and runs to 5

MetRonnie on 3 Feb 2021

> `$ cylc play temp3 -n --fcp=ignore` restarts at cycle point 3 and runs to 5

It wouldn't stop at point 5, because you told it to ignore the fcp.

hjoliver on 4 Feb 2021

But the ignore means ignore the fcp stored in the DB only, reverting to whatever is in flow.cylc?

From cylc restart --help:

                        Ignore the final cycle point in the suite run
                        database. If one is specified in the suite definition
                        it will be used, however.

MetRonnie on 4 Feb 2021
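
Reading the help text quoted above together with the earlier `--*-cycle-point=ignore` suggestion, the final cycle point on a restart could be resolved roughly as follows. This is a sketch of one plausible precedence, not the actual cylc-flow logic; `resolve_final_point` and its parameters are hypothetical, and rejecting `ignore` on a cold start is not modelled.

```python
from typing import Optional

IGNORE = "ignore"  # sentinel value proposed for --fcp and friends


def resolve_final_point(
    cli: Optional[str],
    db: Optional[str],
    config: Optional[str],
) -> Optional[str]:
    """Pick the final cycle point for a restart (hypothetical helper).

    cli:    value of --fcp on the command line, or None if not given
    db:     value stored in the run database by the previous run, if any
    config: value set in flow.cylc, if any
    """
    if cli == IGNORE:
        # Discard the database value only; fall back to flow.cylc.
        return config
    if cli is not None:
        # An explicit new value implicitly overrides the stored one.
        return cli
    # No option given: carry on with what the previous run recorded.
    return db if db is not None else config
```

Under this reading, `--fcp=ignore` after a run started with `--fcp=3` and `final cycle point = 5` in flow.cylc gives 5, not "no final cycle point".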

> if a workflow is paused, what should cylc release task.n do? Should it go ahead and release the task(s)?
>
> It should release the task which shouldn't run because the scheduler is "held".

(full comment here)

Just to seek some further clarity on that - the idea of "cylc pause pauses the workflow by holding all tasks" should become "it pauses the workflow"?

MetRonnie on 8 Feb 2021

> Not sure where the "pauses the workflow by holding all tasks" quote comes from?

We would like cylc pause to pause the workflow but not to pause any of the tasks within it.

oliver-sanders on 8 Feb 2021

> Not sure where the "pauses the workflow by holding all tasks" quote comes from?

That was my initial understanding of "pause" based on the pre-existing behaviour

MetRonnie on 8 Feb 2021

Actually, if pause does not hold tasks, how can we pause after a particular cycle point? (E.g. cylc play --pause-after=POINT, cylc pause --after=POINT, flow.cylc[scheduling]pause after cycle point.) Depending on the runahead limit, there would be nothing to stop tasks running beyond the pause point if any one task still hasn't reached it.

MetRonnie on 8 Feb 2021
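
The concern in the comment above can be illustrated with a deliberately simplified integer-cycling model. It assumes that the runahead limit lets tasks run up to `runahead_limit` cycles beyond the oldest active point, and that a flow-level pause would only take effect once every task has passed the pause point. Both are assumptions for illustration, and the function names are hypothetical.

```python
from typing import List


def points_allowed_to_run(oldest_active_point: int, runahead_limit: int) -> range:
    """Integer cycle points the runahead limit alone would permit."""
    return range(oldest_active_point, oldest_active_point + runahead_limit + 1)


def points_past_pause(
    oldest_active_point: int,
    runahead_limit: int,
    pause_after: int,
) -> List[int]:
    """Points beyond the pause point that could still run if nothing is held."""
    allowed = points_allowed_to_run(oldest_active_point, runahead_limit)
    return [point for point in allowed if point > pause_after]


# One slow task is still at point 2, the pause point is 2, the limit is 3.
overshoot = points_past_pause(
    oldest_active_point=2, runahead_limit=3, pause_after=2)
```

In this model `overshoot` is non-empty, so something other than the flow-level pause would have to stop tasks at points 3 to 5 from running.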
