Flux-core: flux-shell: implement output modes

Created on 18 Jul 2019 · 12 Comments · Source: flux-framework/flux-core

Implement stdio modes beyond simply capturing stdout + stderr to the KVS on job exit.

Options can be passed via jobspec attributes.system.shell.iomode or similar.

Some options from wreck for consideration in the new system are:

--label-io::
-l::
        Prepend stdout and stderr output lines with the task id to which
        the output belongs.

--output='FILENAME'::
-O 'FILENAME'::
        Duplicate stdout and stderr from tasks to a file or files. 'FILENAME'
        is optionally a mustache template which expands the keys 'id', 'cmd'
        and 'taskid'. (e.g. '--output=flux-{{id}}.out')

--error='FILENAME'::
-E 'FILENAME'::
        Send stderr to a different location than stdout.

--input='HOW'::
-i 'HOW'::
        Indicate how to deal with stdin for tasks. 'HOW' is a list of 'src:dst'
        pairs where 'src' is a source 'FILENAME' which may optionally be a
        mustache template as in `--output`, or the special term `stdin` to
        indicate the stdin from a front end program, and 'dst' is a list of
        taskids or `*` or `all` to indicate all tasks. The default is `stdin:*`.
        If only one of 'src' or 'dst' is specified, a heuristic is used to
        determine whether a list of tasks or an input file was meant. (e.g
        `--input=file.in` will use `file.in` as input for all tasks, and
        `--input=0` will send stdin to task 0 only.
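
The 'src:dst' handling described for `--input` can be sketched as follows. The wreck heuristic itself is not spelled out here, so the rule below is an assumption: a bare value that looks like a task list (digits, commas, ranges, `*` or `all`) is treated as 'dst', and anything else as a file name. The function name `parse_input_spec` is hypothetical and handles a single pair only.

```python
import re

# Assumption: a task list looks like "0", "0,2" or "0-3,7"; "*" and "all" mean every task.
_TASKS_RE = re.compile(r"^(\*|all|\d+(-\d+)?(,\d+(-\d+)?)*)$")


def parse_input_spec(how):
    """Split one 'src:dst' pair, filling in the default for a missing half."""
    if ":" in how:
        src, dst = how.split(":", 1)
    elif _TASKS_RE.match(how):
        src, dst = "stdin", how   # looks like a task list
    else:
        src, dst = how, "*"       # anything else is treated as a file name
    if dst == "all":
        dst = "*"                 # normalize the two spellings of "all tasks"
    return src or "stdin", dst or "*"
```

With this rule, `parse_input_spec("file.in")` yields `("file.in", "*")` and `parse_input_spec("0")` yields `("stdin", "0")`, matching the two examples in the option text.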

I think the mustache templates were thought to be pretty nice!
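
For reference, a minimal sketch of the kind of template expansion meant here, assuming only simple `{{key}}` substitution (no sections, escaping, or other mustache features). The helper name `expand_template` is hypothetical; unknown keys are left in place rather than guessed.

```python
import re


def expand_template(template, values):
    """Replace each {{key}} with str(values[key]); leave unknown keys untouched."""
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(expand_template("flux-{{id}}.out", {"id": 1234}))  # prints "flux-1234.out"
```

A full implementation would presumably use a real mustache library; this only shows how 'id', 'cmd' and 'taskid' could map into a file name.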

garlick

All 12 comments

Brainstormed thoughts on how to approach some of these modes.

First, I assumed that we do want to keep the basic premise of adding an output service (proposed in #2201, WIP PR #2208) in the job-shell.

- If we wanted to add additional output modes, such as writing output to files after the job is done, writing output to files locally without storing it permanently, storing output in the KVS, or dropping all output, adding these to the output service would be a good way to abstract the ultimate output destination away from the job-shell.
  - There would presumably be corner cases to handle in the output service that I have not yet thought through. For example, what if one task hangs or doesn't complete? How do we notify the service to output everything it currently has stored up? Or should the output service dump data periodically? This is TBD.
- If we wanted to add a mechanism to read past and future output, it would also be convenient to go through the output service. flux job attach could attach to the service directly to get all output from the job so far, or to see future output as it arrives. Showing just stdout/stderr, or just the output of a specific task, would be easy to implement as well.

Issues that we need to determine:

- Since the output service will use a service name based on the jobid, it's easy to connect to that service name. But how would the front end tool (flux job attach) know which broker the service is on? This currently isn't known. Perhaps the information could be stored in the job eventlog? Then, if the information isn't in the job eventlog, we know the job hasn't started, and flux job attach can either wait for the job to start or report an error.
- How would the front end tool know whether there is any output to be read at all? For example, if the user chose to drop all output, how could flux job attach learn this and appropriately wait, error, or otherwise handle the situation? Would the information be in the jobspec (retrieved via a KVS lookup)? Would it be in the job eventlog? Or would we simply have the output service return an appropriate errno?
- Assuming we write some information to the job eventlog, we need to determine which eventlog to write it to: the main job eventlog or the guest one. The attach tool would also have to determine which of the two to look at.
  - Presently, the job-info module only looks at the main job eventlog. Issue #2105 brings up updating the job-info module to read either eventlog. This adds the complexity of reading from the guest namespace, which may exist, may not yet exist, or could disappear while we attempt to read it. But these circumstances should be manageable by reading the current job state.
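
To make the eventlog idea concrete, here is a rough sketch of how a front end could look for the service location. It assumes the eventlog is newline-delimited JSON with `name` and optional `context` fields; the event name `output-service` and the `rank` context key are purely hypothetical, since no such event has been defined.

```python
import json


def find_output_service_rank(eventlog_text, event_name="output-service"):
    """Return the broker rank recorded in the eventlog, or None if absent."""
    for line in eventlog_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        # Hypothetical event: {"name": "output-service", "context": {"rank": N}}
        if entry.get("name") == event_name:
            return entry.get("context", {}).get("rank")
    return None  # not posted yet: the caller can wait or report an error
```

A `None` result corresponds to the "job hasn't started" case above, where flux job attach would either wait or fail.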

Thinking about these issues, I think the most important task is to determine how flux job attach would communicate with the service, by eventlog or otherwise. That may require us to solve #2105 first. Once that is done, we can probably do a minimal libiobuf service (#2201/#2208) that replicates what is currently there, before moving on to the more advanced I/O modes we care about. Perhaps --label-io is the trivial first one we do.

chu11 on 3 Aug 2019

> First, I assumed that we do want to keep basic premise of adding an output service (proposed in #2201, WIP PR #2208) in the job-shell

The _service_ may be too specialized to the shell to abstract in a library economically. For example, the shell can log to stderr and exit, while a library needs interfaces to let the caller make such decisions. The corner cases you mentioned would likely be more straightforward to handle in the shell rather than in a library. Given how simple shell/io.c turned out to be, it feels to me like we need to rethink the library approach.

That said, front end commands and the shell should probably share some code, for example the encoding/decoding of I/O (flux-framework/rfc#192).

There is a design card on the exec project board due 8/16. I suggest the flux-core team convene sometime soon at coffee time to discuss the important questions you are raising here and get a sketch of the design agreed upon.

Anyway, thanks for raising these questions as I think they are pertinent to the design either way!

garlick on 3 Aug 2019

👍1

Let's refocus this issue on _output_ modes, since that's somewhat separable from input modes. We can cover input in #2257.

garlick on 21 Aug 2019

In order for the shell to support several of these output modes, job submission information has to be passed from the user to the shell. The best place for this appears to be the jobspec.

As an initial implementation, let's pass such options in attributes.system.exec.shell.options.

It's also not clear to me how the average user will specify these options at the moment (via flux-srun? via flux-jobspec?). Perhaps via the eventual newer flux-srun (#2150) where the user can pass options through onto the shell?

chu11 on 16 Sep 2019

We have to decide if we want full command line compatibility with Slurm srun for flux jobspec srun. If we do, then we'll need a flux jobspec srun -o, --output option which supports the Slurm IO redirection "filename pattern". (See the IO Redirection section of the srun(1) manpage.)

My guess is that we don't want to support the full Slurm filename patterns, but do we want to save the -o, --output option for later, in case at some point flux jobspec srun does have to emulate srun?

If we're willing to break compatibility, one idea would be to add an -O, --output option which takes some sort of output filename template (it wouldn't have to be mustache).

As part of supporting shell MPI and other plugins, we'll need a way to set other generic options in the shell.options object, and I was hoping to have a -o, --option=SPEC option available to do that, but that does break the srun interface (-o is --output in srun.)
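
A rough sketch of what a generic --option=SPEC handler might do, assuming SPEC takes the dotted `key=value` form that shows up later in this thread (`-o stdout.type=per-task`). The function name `set_shell_option` and the "bare key means true" rule are assumptions, not an agreed interface.

```python
def set_shell_option(options, spec):
    """Apply one 'dotted.key=value' SPEC to a nested options dict."""
    key, sep, value = spec.partition("=")
    if not sep:
        value = True  # assumption: a bare key acts as a boolean flag
    *parents, leaf = key.split(".")
    node = options
    for name in parents:
        node = node.setdefault(name, {})  # create intermediate objects as needed
    node[leaf] = value
    return options


shell_options = {}
set_shell_option(shell_options, "stdout.type=per-task")
print(shell_options)  # prints {'stdout': {'type': 'per-task'}}
```

Values are kept as strings here; a real option parser would likely need to decide how numbers and booleans are represented.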

grondo on 16 Sep 2019

About to add output options to flux mini. The shell options proposed in #2395 and #2396 are:

attributes.system.shell.options.output.<stream>.type
attributes.system.shell.options.output.<stream>.path
attributes.system.shell.options.output.<stream>.label

Values for _type_: (string) "kvs", "file", or "per-task"
Values for _path_: (string) a UNIX path, w/ optional embedded "{{id}}" mustache
Values for _label_: (boolean) true or false

The plan was to create --output and --error options similar to wreckrun's. It's straightforward to set _type_="file" and _path_ based on these options.

How should the user select between _type_="file" and _type_="per-task"? Do we need --output-mode and --error-mode options?

For _label_, I assume that --label would set _label_=true for streams with _type_="file".
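
As a sketch of that mapping, assuming the stream names are `stdout` and `stderr` (the proposal above only says `<stream>`), the hypothetical helper below turns --output, --error and --label into the proposed shell options. Streams without a path are simply left out, so whatever default the shell applies stays in effect.

```python
import json


def output_options(output=None, error=None, label=False):
    """Build a jobspec fragment holding attributes.system.shell.options.output."""
    streams = {}
    for stream, path in (("stdout", output), ("stderr", error)):
        if path is not None:
            streams[stream] = {"type": "file", "path": path}
            if label:
                # Per the assumption above, --label only applies to file streams.
                streams[stream]["label"] = True
    return {"attributes": {"system": {"shell": {"options": {"output": streams}}}}}


print(json.dumps(output_options(output="flux-{{id}}.out", label=True), indent=2))
```

This deliberately never emits _type_="per-task", since how the user would select that is the open question here.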

garlick on 26 Sep 2019

I think the per-task "type" is going to go away once we have real templating. The shell will have to figure out if the output file is per-task, per-shell, or per-job based on the template. Therefore, my opinion is that we shouldn't add any new options to flux mini just yet so we won't have to change them later.
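
One way the shell could infer this from the template, sketched under the assumption that 'taskid' is the only key that varies per task. No per-shell key has been named in this thread, so `per_shell_keys` is a hypothetical parameter the caller would supply.

```python
import re


def output_scope(template, per_shell_keys=()):
    """Classify an output path template as per-task, per-shell, or per-job."""
    keys = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))
    if "taskid" in keys:
        return "per-task"
    if keys & set(per_shell_keys):  # hypothetical keys that differ between shells
        return "per-shell"
    return "per-job"
```

Under these assumptions, `flux-{{id}}.out` is classified as per-job and `flux-{{id}}-{{taskid}}.out` as per-task, so no separate _type_ option would be needed to tell them apart.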

grondo on 26 Sep 2019

> Values for type: (string) "kvs", "file", or "per-task"

Yup

> For label, I assume that --label would set label=true for streams with type="file".

Yup

Edit (I messed up cut & paste)

> Values for path: (string) a UNIX path, w/ optional embedded "{{id}}" mustache
> Values for label: (boolean) true or false

both yup

chu11 on 26 Sep 2019

> we shouldn't add any new options to flux mini just yet so we won't have to change them later

Are we ok with -o stdout.type=per-task being the only way to activate per-task I/O for now, or would it be better to have the output plugin support "{{taskid}}" now and drop the _type_ shell options?

garlick on 26 Sep 2019

It would be better if we could support arbitrary templates now and drop the type=per-task. For the current milestone, could we delay per-task support and just support per job output files, and rework the current per-task PR with a version that supports full mustache templates?

> -o stdout.type=per-task

Does this work now? That's cool!!

grondo on 26 Sep 2019

> For the current milestone, could we delay per-task support and just support per job output files, and rework the current per-task PR with a version that supports full mustache templates?

I was working under this understanding. That we were only trying to get single file stdout/stderr redirection working.

chu11 on 26 Sep 2019

👍2

I peeled off #2422 for per-task output so we can close this one.

garlick on 30 Sep 2019
