Argo: Make aggregated output parameters available by name

Created on 8 Mar 2020  路  14Comments  路  Source: argoproj/argo

Summary

I'd like to be able to access aggregated output parameters filtered by name. Right now you can get _all_ the output parameters as JSON using {{steps.STEP-NAME.outputs.parameters}}. I'd like to add .PARAM-NAME to get JSON for _just_ output params by that name.

Motivation

Someone asked on StackOverflow how to access aggregated output parameters. They were attempting to access by name like {{steps.STEP-NAME.outputs.parameters.PARAM-NAME}}.

That seems intuitive/nice. It would be easy enough to filter using jq, but getting the results straight from Argo would be a better UX.



Message from the maintainers:

If you wish to see this enhancement implemented please add a 馃憤 reaction to this issue! We often sort issues this way to know what to prioritize.

enhancement good first issue help wanted

Most helpful comment

  1. I agree with @simster7 that dot-notation for property access should be consistent with everyones expectations. This rules out {{steps.write.outputs.parameters.number}} as an aggregate accessor.

What do you think about something like {{steps.write.outputs.parameters.number.list}} or {{steps.write.aggregated.outputs.parameters.number}}?

All 14 comments

It seems like this is already supported? From Argo Variables:

Steps Templates:

| Variable | Description|
|----------|------------|
...
| steps.<STEPNAME>.outputs.parameters.<NAME> | Output parameter of any previous step |
...

DAG Templates:

| Variable | Description|
|----------|------------|
...
| tasks.<TASKNAME>.outputs.parameters.<NAME> | Output parameter of any previous task |
...

An example: https://github.com/argoproj/argo/blob/master/examples/nested-workflow.yaml

Am I missing something here?

That's true for retrieving output parameters for steps without parallelism. For a looped step, it seems like steps.<TASKNAME>.outputs.parameters.<NAME> isn't defined.

Example from the StackOverflow post:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: loops-param-result-
spec:
  entrypoint: loop-param-result-example
  templates:
  - name: loop-param-result-example
    steps:
    - - name: generate
        template: gen-number-list
    - - name: write
        template: output-number
        arguments:
          parameters:
          - name: number
            value: "{{item}}"
        withParam: "{{steps.generate.outputs.result}}"
    - - name: fan-in
        template: fan-in
        arguments:
          parameters:
          - name: numbers
            value: "{{steps.write.outputs.parameters.number}}"

  - name: gen-number-list
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        import json
        import sys
        json.dump([i for i in range(20, 31)], sys.stdout)

  - name: output-number
    inputs:
      parameters:
      - name: number
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo {{inputs.parameters.number}} > /tmp/number.txt"]
    outputs:
      parameters:
        - name: number
          valueFrom:
            path: /tmp/number.txt

  - name: fan-in
    inputs:
      parameters:
        - name: numbers
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo received {{inputs.parameters.numbers}}"]

Hmm... that seems like a bug then. Will investigate later today

So in this example, your desired output for {{steps.write.outputs.parameters.number}} would be ["20", "21", ..., "30"]? @mac9416

Yep, I think that would do nicely! As long as you can't think of a reason this would be misleading or otherwise problematic for users.

It is a slight deviation from typical JSON referencing, but hopefully the user will understand they're not working directly with a JSON object here.

As long as you can't think of a reason this would be misleading or otherwise problematic for users.

This is my main concern. I think the syntax A.B is pretty consistent to "B is a subset of A" and implementing the name as proposed would mean something different. Can you think of a different syntax that would clearly say "B is aggregated from all members of A"? Or at least some syntax that wouldn't conflict with A.B?

(chatted a bit offline)

JavaScript: A.map(params => params.B) - clear but verbose - maybe could shorten to A.map(B)?
Python: [item.B for item in range(A)] - still verbose
jq: .A[] | .B - short but introduces | - maybe shorten to .A[].B? but then we're inventing stuff

Or maybe invent some function like getall(A, B).

Simon plans to bring it up with Jesse.

My feelings about this are:

  1. I agree with @simster7 that dot-notation for property access should be consistent with everyones expectations. This rules out {{steps.write.outputs.parameters.number}} as an aggregate accessor.
  2. We should not invent any new syntax.
  3. Argo Rollouts has already chosen to use a different library https://github.com/antonmedv/expr than workflows for expression processing, which has more powerful list/map/reduce processing capabilities than the one that workflows is currently using. I think the capability most relevant to this issue is one() function combined with a filter.
  4. Proper escaping must be supported for inevitability that a user wants literal values and not the expression library's syntax.

Sounds like a major refactor to use expr. Maybe the solution is to throw a lint error if someone tries to directly access a param for a parallelized step?

This would be helpful as well - I am having to sprinkle in jq wish Bash to be a workaround, which isn't ideal.

  1. I agree with @simster7 that dot-notation for property access should be consistent with everyones expectations. This rules out {{steps.write.outputs.parameters.number}} as an aggregate accessor.

What do you think about something like {{steps.write.outputs.parameters.number.list}} or {{steps.write.aggregated.outputs.parameters.number}}?

Would JSONPath-like {{steps.write.outputs.parameters.number[*]}} work? With * being the only supported value to clearly indicate that the entire array will be returned.

I would like to point out that the syntax should work for a task/step with or without loop:
example: a function foo( x ... string) can be invoked as:
foo("a") or foo("a", "b", "c") via variadic argument.
Likewise, maybe we should return a collection(list) type.
Without loop, it would be a collection of one item.

Was this page helpful?
0 / 5 - 0 ratings