It would be eminently useful for Tasks to have an output type of 'param', which could then be passed to the next task in a PipelineRun as an input param.
While other people might envision other use cases, a single type of string should be fine. Other data could be serialized to a string as JSON, or Base64-encoded by the user. I'm also not sure what a reasonable size cap would be, although for the use cases I picture, 1 KB should be enough.
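For example, packing structured data into a single opaque string could look like this (a sketch; the payload and field names are made up, not from any Tekton API):

```shell
# Sketch: serializing structured data into a single string param.
# The payload and its fields are illustrative.
payload='{"endpoint":"https://example.com","region":"us-east-1"}'

# Base64-encode to get one opaque string that is safe to pass as a param
encoded="$(printf '%s' "$payload" | base64 | tr -d '\n')"

# The consuming task decodes it back to the original JSON
decoded="$(printf '%s' "$encoded" | base64 -d)"
echo "$decoded"
```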
In principle I agree that this could be useful but I can't think of concrete examples - could you describe a bit more the use case you have for this? What's the problem that you're trying to solve & in what way is it not served by, for example, PipelineResources?
The only PipelineResource type that's really suitable for this use case would be a bucket. However, I have issues with how using a bucket for caching data between steps actually works.
Also note that our current use case is to use tekton to create and destroy temporary infrastructure based mainly on terraform. We're not doing git-based builds, so making caching work with a git repository somehow would be, well, a lot of work.
I think this makes a lot of sense and I've run into some similar issues myself, for example in the nightly release pipeline (adding in https://github.com/tektoncd/pipeline/pull/1274) I need to generate a version tag, then pass that between steps. Between steps it isn't so bad because I can write it to a file on disk and read it into an environment variable, but:
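The between-steps workaround looks roughly like this (a sketch: steps in a Task share the `/workspace` volume; a temp directory stands in for it here so the snippet runs outside a cluster, and the tag value is made up):

```shell
# Sketch of the file-on-disk workaround: steps in a Task share the
# /workspace volume, so one step writes a value and a later step reads it.
# A temp directory stands in for /workspace so this runs anywhere.
WORKSPACE="$(mktemp -d)"

# Step 1: generate a version tag and persist it to the shared volume
echo "v0.9.0" > "${WORKSPACE}/version"

# Step 2 (a different container in the real Task, same volume):
# read it back into an environment variable
VERSION_TAG="$(cat "${WORKSPACE}/version")"
echo "releasing ${VERSION_TAG}"
```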
Another example where this might be useful is if your step/task posts a message to a forum and you get back the post URL, or if you upload a zip file somewhere and get back a generated URL.
Passing params between steps is tricky especially since the container is already started. I think the /workspace mechanism is probably as good as it gets.
Off the top of my head, Task output param values could come from either a step's stdout (actually not super easy to do) or again from a file's contents in the /workspace. Even then this might be best to encapsulate in an output ParamPipelineResource.
@kbruner looking in more detail at the use case you're describing, I'm wondering if the proposed FileSet resource might be a better match for some of your issues (e.g. easily sharing files between Tasks) than string based output parameters?
(I think we want output params anyway but wanted to point that out!)
I've put together a proposal for adding output params! :tada:
👍, I was linked here from a Slack question and I'd like to provide a use case I think is covered by this proposal.
We're looking to use a Tekton Task for creating webhooks: just as you'd get back a forum post URL, we're getting back a webhook ID, which I can parse out and return as an output.
Just being able to provide output as type string would be great, I don't want to have to deal with files for example, or PipelineResources of type bucket, git, image etc.
Specifically, I expect to be able to write something like the following; then, in our Go code where we create/use Tekton resources, we could inspect the outputs and act on them, like updating a ConfigMap so users can delete named webhooks (where the name is matched to an ID).
spec:
  inputs:
    params:
    - name: Mode
      ...
  outputs:
    params:  # (params here? Or am I on the wrong proposal?)
    - name: webhook-id
      description: "the created webhook ID"
    - name: api-response-code
      description: "the API response code returned from the POST used to create the webhook, e.g. 204 if successful"
    - name: api-response-message
      description: "the API response message returned from the POST used to create the webhook"
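A sketch of a step that could populate those outputs, assuming results are surfaced to steps as writable files under /workspace/results (one of the mechanisms floated above); the API URL, image, and response parsing are all illustrative:

```yaml
steps:
- name: create-webhook
  image: alpine/curl  # illustrative image
  script: |
    #!/bin/sh
    # POST to a hypothetical webhook API, capturing body and status code
    code=$(curl -s -o /tmp/resp.json -w '%{http_code}' -X POST "https://example.com/api/webhooks")
    printf '%s' "$code" > /workspace/results/api-response-code
    # Parse the created webhook's ID and message out of the response body
    sed -n 's/.*"id":"\([^"]*\)".*/\1/p' /tmp/resp.json > /workspace/results/webhook-id
    sed -n 's/.*"message":"\([^"]*\)".*/\1/p' /tmp/resp.json > /workspace/results/api-response-message
```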
I have another use-case. I have two Tasks where the first collects 3rd party service endpoints and credentials and the second uses them. Passing by volume made the second Task's contract really hard to understand whereas passing by params was much clearer.
It makes even more sense not to use a volume if these are credentials or API keys, to avoid writing sensitive data to persistent disk (at rest) and keep it in "memory" (i.e. apiserver, etcd) only (in transit).
On reflection, it might make more sense to store the sensitive data as Secrets and have the task pick them up as Secrets.
In that case the output param would just pass the location of the Secret.
Maybe secrets get their own type of inputs and outputs, or just an attribute on them.
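That could look roughly like this (hypothetical sketch, not proposed syntax; the Task names and result/param names are made up):

```yaml
# Hypothetical sketch: the first Task stores credentials in a Secret it
# creates, and emits only the Secret's *name* as a result, so sensitive
# values never land on a persistent volume.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: collect-credentials
spec:
  results:
  - name: credentials-secret
    description: name of the Secret holding the 3rd-party endpoints and keys
---
# The consuming Task receives only the Secret name as a param and can
# mount or read the Secret itself.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: use-credentials
spec:
  params:
  - name: credentials-secret
    description: name of the Secret to read endpoints and keys from
```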
Here's a quick syntax example I put together based on a variation I like from the above proposal. I've flattened inputs.params to params, outputs.params to results, and am using variable interpolation to extract results. I've also introduced pipeline results which I suspect we might also want to support.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: add
spec:
  params:
  - name: first
    description: the first operand
  - name: second
    description: the second operand
  results:
  - name: sum
    description: the sum of the first and second operand
  steps:
  - name: add
    image: alpine
    env:
    - name: OP1
      value: $(params.first)
    - name: OP2
      value: $(params.second)
    command: ["/bin/sh", "-c"]
    args:
    - echo $((${OP1}+${OP2})) | tee /workspace/results/sum;
---
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: sum-three
spec:
  params:
  - name: first
    description: the first operand
  - name: second
    description: the second operand
  - name: third
    description: the third operand
  tasks:
  - name: first-add
    taskRef:
      name: add
    params:
    - name: first
      value: $(params.first)
    - name: second
      value: $(params.second)
  - name: second-add
    runAfter: [first-add]
    taskRef:
      name: add
    params:
    - name: first
      value: $(tasks.first-add.results.sum)
    - name: second
      value: $(params.third)
  results:
  - name: sum
    description: the sum of all three operands
    value: $(tasks.second-add.results.sum)
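For completeness, a run of that pipeline might be triggered like this (a sketch following the same proposed syntax; the param values are arbitrary):

```yaml
apiVersion: tekton.dev/v1alpha1
kind: PipelineRun
metadata:
  name: sum-three-run
spec:
  pipelineRef:
    name: sum-three
  params:
  - name: first
    value: "1"
  - name: second
    value: "2"
  - name: third
    value: "3"
# Once complete, the pipeline-level result `sum` would surface on the
# PipelineRun's status, resolved from $(tasks.second-add.results.sum).
```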
I think @sbwsg and/or I are going to tackle this as part of #1673
Tasks now include a TaskResult list, declaring the results that the Task can output.
TaskRuns now include a TaskRunResult list in their Status field, which ends up containing the results that were actually written out during execution. https://github.com/tektoncd/pipeline/pull/1921
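Roughly, a completed TaskRun's status then carries the values that were actually produced, something like this (a sketch of the shape introduced by the linked PR; the result name and value are illustrative):

```yaml
# Illustrative fragment of a completed TaskRun's status; the result
# name and value are made up.
status:
  conditions:
  - type: Succeeded
    status: "True"
  taskResults:
  - name: sum
    value: "3"
```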
The remaining piece of work here is being able to use variables to reference the results of one task of a pipeline in another task of the same pipeline.
I'm going to create a separate issue for the suggested design of Pipeline Results just to help keep the issue size digestible. This will allow us to close this issue as soon as the variable work is ready.
Nice, thank you!
I think it would be great if we were able to point to results from within the Task that defines them, if that's not yet possible:
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: print-date
  annotations:
    description: |
      A simple task that prints the date to make sure your cluster / Tekton is working properly.
spec:
  results:
  - name: current-date-unix-timestamp
    description: The current date in unix timestamp format
  - name: current-date-human-readable
    description: The current date in human readable format
  steps:
  - name: print-date-unix-timestamp
    image: bash:latest
    script: |
      #!/usr/bin/env bash
      date +%s | tee $(results.current-date-unix-timestamp)
  - name: print-date-human-readable
    image: bash:latest
    script: |
      #!/usr/bin/env bash
      date | tee $(results.current-date-human-readable)
Ah, great idea @afrittoli ! How would you feel about results.current-date-human-readable.path to keep it in line with the way resources and workspaces work currently?
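With that variant, a step would reference the file backing its result via a `.path` suffix, e.g. (a sketch of the suggested syntax, applied to the print-date example above):

```yaml
steps:
- name: print-date-unix-timestamp
  image: bash:latest
  script: |
    #!/usr/bin/env bash
    # $(results.<name>.path) would expand to the file backing the result
    date +%s | tee $(results.current-date-unix-timestamp.path)
```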
OK, we've written up issues for the remaining features related to Task Results. I'm going to close this issue and push the rest of the work into those more granular tickets.
To expand on the problems with using the bucket PipelineResource this way:
a. Unlike the artifact bucket, it only supports absolute paths, so it requires some magic in the tasks to make sure concurrent runs of the same pipeline don't clobber or mix up files.
b. It doesn't clean up interim files between runs. The tasks would have to implement the same PipelineRun/Task file structure for inputs and outputs that the artifact bucket already provides.