Pipeline: Default workspace types for PipelineRuns and TaskRuns

Created on 11 May 2020 · 11 comments · Source: tektoncd/pipeline

Expected Behavior

Following up on https://github.com/tektoncd/pipeline/issues/2398

It would be useful to be able to specify default Workspace types. The idea would be that users don't need to specify workspaces they don't care about (or whose configuration should be entirely in the hands of admins) and Tekton could just inject a default bit of volume configuration when it launches a TaskRun or PipelineRun.

It seems like one approach to this would be to introduce a ConfigMap (maybe per-namespace for different tenants?) that allows a user to set the default WorkspaceBinding. It probably makes sense to have separate configuration for PipelineRun and TaskRun since TaskRuns can execute happily with an emptyDir but PipelineRuns would probably want the option to use something like a volumeClaimTemplate by default.

If we were to also introduce optional Workspaces then I'm not sure what the most desirable behaviour would be - would a default WorkspaceBinding be injected for optional Workspaces or would they be left empty?

Actual Behavior

At the moment users have to completely specify WorkspaceBindings in their Runs.

Also relates to the existing config-artifact-pvc ConfigMap, which is used by PipelineResources.
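For reference, that ConfigMap already gives admins a narrower default of this kind, roughly (the size shown is the shipped default; the storage class value is illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-artifact-pvc
  namespace: tekton-pipelines
data:
  # Size of each PVC created for PipelineResource artifact storage
  size: 5Gi
  # Illustrative storage class; if unset, the cluster default is used
  storageClassName: standard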

kind/feature

All 11 comments

Here's an example ConfigMap syntax that might suit. TaskRuns would receive emptyDir WorkspaceBindings by default and PipelineRuns would receive volumeClaimTemplates by default:

apiVersion: v1
kind: ConfigMap
metadata:
  name: workspace-defaults # Could be in our existing `config-defaults.yaml`
data:
  defaultTaskRunWorkspaceBinding: | # Field name TBD; ConfigMap values must be strings, hence the block scalar
    emptyDir: {}
  defaultPipelineRunWorkspaceBinding: |
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi

And then, given a Task that looks just like one of our existing examples:

spec:
  workspaces:
  - name: custom
  - name: custom2
    mountPath: /foo/bar/baz
  - name: custom3
  - name: custom4
    mountPath: /baz/bar/quux
  - name: custom5
    mountPath: /my/secret/volume
  steps:
  - name: write
    image: ubuntu
    script: echo $(workspaces.custom.volume) > $(workspaces.custom.path)/foo
  - name: read
    image: ubuntu
    script: cat $(workspaces.custom.path)/foo | grep $(workspaces.custom.volume)
  - name: write2
    image: ubuntu
    script: echo $(workspaces.custom2.path) > $(workspaces.custom2.path)/foo
  - name: read2
    image: ubuntu
    script: cat $(workspaces.custom2.path)/foo | grep $(workspaces.custom2.path)
  - name: write3
    image: ubuntu
    script: echo $(workspaces.custom3.path) > $(workspaces.custom3.path)/foo
  - name: read3
    image: ubuntu
    script: cat $(workspaces.custom3.path)/foo | grep $(workspaces.custom3.path)
  - name: readconfigmap
    image: ubuntu
    script: cat $(workspaces.custom4.path)/my-message.txt | grep "hello world"
  - name: readsecret
    image: ubuntu
    script: |
      #!/usr/bin/env bash
      set -xe
      cat $(workspaces.custom5.path)/username | grep "user"
      cat $(workspaces.custom5.path)/message | grep "hello secret"

A user could choose to only provide the secret and configuration workspaces:

kind: TaskRun
spec:
  workspaces:
    - name: custom4
      configMap:
        name: my-configmap
        items:
          - key: message
            path: my-message.txt
    - name: custom5
      secret:
        secretName: my-secret

All of the other workspaces would be automatically set to emptyDir volumes.
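In effect the controller would fill in the remaining bindings from the default, as if the user had submitted the following (a sketch; the field name matches the hypothetical ConfigMap above):

kind: TaskRun
spec:
  workspaces:
    - name: custom4
      configMap:
        name: my-configmap
        items:
          - key: message
            path: my-message.txt
    - name: custom5
      secret:
        secretName: my-secret
    # Injected from defaultTaskRunWorkspaceBinding:
    - name: custom
      emptyDir: {}
    - name: custom2
      emptyDir: {}
    - name: custom3
      emptyDir: {}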

This is also related to https://github.com/tektoncd/pipeline/issues/2586
since you currently have to think _carefully_ about how you use PVCs in a Pipeline, at least when you have any _parallelism_ of tasks or a _regional_ cluster.

Remember https://github.com/tektoncd/pipeline/pull/2546

fan-out: one task followed by multiple parallel tasks - is difficult with PVC workspaces
fan-in: multiple parallel tasks converging into a single task - is difficult with PVC workspaces
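The core problem is the volume access mode: with a ReadWriteOnce PVC, every Pod mounting the volume must land on the node (and, in a regional cluster, the zone) that holds it. A minimal fan-out sketch, with hypothetical task names, showing where that bites:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: fan-out-example
spec:
  workspaces:
  - name: shared
  tasks:
  - name: build
    taskRef:
      name: build-task   # hypothetical
    workspaces:
    - name: output
      workspace: shared
  # test-a and test-b run in parallel, but with a ReadWriteOnce PVC bound
  # to 'shared' they must both be scheduled onto the node holding the volume
  - name: test-a
    runAfter: ["build"]
    taskRef:
      name: test-task    # hypothetical
    workspaces:
    - name: input
      workspace: shared
  - name: test-b
    runAfter: ["build"]
    taskRef:
      name: test-task
    workspaces:
    - name: input
      workspace: shared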

Ah yeah, good point; it may be that default workspaces for PipelineRuns are much trickier to use correctly.

The parallelism issues are related to the use of a PVC in parallel tasks, regardless of whether it was provided via a default workspace mechanism or the user provided the PVC in the PipelineRun.

Default workspace is useful IMO for removing the burden of defining a workspace PVC for each PipelineRun. The parallelism issues need to be addressed separately from this and are not really what default workspaces are meant to resolve.

removing the _burden_ of defining a workspace PVC for each PipelineRun

A realistic and proper PipelineRun will most likely be extensive (see example below), where "defining a PVC workspace" is a _small_ but _declarative_ piece. This should be done _once_, e.g. in a TriggerTemplate, and stored in _version control_ with a _code-reviewed_ PR for every change, as recommended in Accelerate and Continuous Delivery.

Kubernetes declarative config management, similar to the _declarative_ format of Terraform, is a _success factor_. _Immutable infrastructure_ and _declarative config_ are also reasons for the interest in GitOps.

We should focus on _user experience_ but not give up _declarative config_ - it is a strength compared to previous CI systems.


That said, I think the intent with this issue is to improve the user experience when doing "explorative" work, e.g. as a pipeline author. This is usually done using an _imperative workflow_, e.g. with kubectl imperative commands. But that doesn't necessarily mean that we need to change the declarative types - I actually think we have a good declarative API for workspaces - almost identical to StatefulSet, which has an almost identical problem - and the Kubernetes community has probably thought much about it.
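For comparison, the analogous StatefulSet API also keeps the storage declaration explicit in every manifest; a minimal sketch:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example
spec:
  serviceName: example
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: ubuntu
        volumeMounts:
        - name: data
          mountPath: /data
  # Declared per-manifest, much like a PipelineRun's volumeClaimTemplate
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi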

However, there are other ways to improve the _imperative workflow_. E.g. in the last API WG, there were discussions about using "generators" - that could be one way.

Also, an _imperative workflow_ should not be used in a _production environment_.


This is one example of a more realistic PipelineRun:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: continuous-delivery-pipeline-
spec:
  params:
  - name: git-url
    value: git@github.com:jlpettersson/myapp.git
  - name: branch-name
    value: master
  - name: committer
    value: jlpettersson
  pipelineRef:
    name: pipeline-with-git-clone
  taskRunSpecs:
  - pipelineTaskName: git-clone
    taskPodTemplate:
      volumes:
      - name: ssh-auth
        projected:
          defaultMode: 0400
          sources:
          - secret:
              name: github-known-hosts
          - secret:
              name: github-private-key
  - pipelineTaskName: build-and-push-image
    taskPodTemplate:
      serviceAccountName: image-builder
      volumes:
      - name: service-account-token
        projected:
          sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
              audience: image-registry
  - pipelineTaskName: integration-test
    taskPodTemplate:
      serviceAccountName: integration-tester
      volumes:
      - name: service-account-token
        projected:
          sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
  - pipelineTaskName: deploy-to-staging
    taskPodTemplate:
      serviceAccountName: deploy-staging
      volumes:
      - name: service-account-token
        projected:
          sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
              audience: staging-cluster
  - pipelineTaskName: deploy-to-prod
    taskPodTemplate:
      serviceAccountName: deploy-prod
      imagePullSecrets:
      - name: prod-image-pull-secret
      volumes:
      - name: service-account-token
        projected:
          sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
              audience: prod-cluster
  workspaces:
  - name: ws
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi

The intention of default workspaces is to give admins a way to configure defaults. How is providing sensible defaults contrary to declarative config management?

The intention of default workspaces is to give admins a way to configure defaults. How is providing sensible defaults contrary to declarative config management?

My opinion is that this is the wrong problem to solve.

The intention of default workspaces is to give admins a way to configure defaults.

For storage, the common Kubernetes way to do this is by configuring a default StorageClass.
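That mechanism is a single annotation on a StorageClass object; any PVC that omits spec.storageClassName is then bound using it (the provisioner and parameters below are just an example):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    # Marks this class as the cluster-wide default for PVCs
    # that don't set spec.storageClassName
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd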

How is providing sensible defaults contrary to declarative config management?

That is very _implicit_, where declarative config tends to be more _explicit_. In this case, whether you try to hide the PersistentVolumeClaim or not, one will be created in the user's _namespace_ - which also counts against the namespace's storage ResourceQuota and may be limited by cluster admins with LimitRanges (see Limiting Storage Consumption).

I feel that trying to hide from the user the creation of a resource that lands in the user's namespace is not a good thing. And declaring those few lines is definitely not a _burden_ - in the context of declarative config management.

I don't think the quota and limits play a role here given that it's pretty much how PipelineRun and TaskRun already work :) When users create PipelineRuns and TaskRuns, a number of pods are created in the user namespace which count against the quota and limits. PVCs won't be an exception.

Reopening because we never did quite land on a decision wrt default workspaces in PipelineRuns

@sbwsg @jlpettersson How would you feel about a ConfigMap-provided "default" storage class for PipelineRun workspaces?

I largely agree that we don't want to, or even have to, hide the details of PVC creation. When running in my team's service, the user runs in a namespace that uses quota. What we struggle with is the "default" storage class, as it's typically not at all suitable for Tekton use (slow to provision, poor IOPS, not free, etc.). We end up forcibly overriding PVC creation to use something better suited that we provide (e.g. a local topology-aware provisioner).

Concretely... the default-pipeline-run-storage-class would inject its value as a new storage-class field in any PipelineRun workspace volumeClaimTemplates that do not already provide one.
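A sketch of that mutation, with the config key and class name hypothetical:

# Hypothetical entry in the defaults ConfigMap:
#   default-pipeline-run-storage-class: tekton-fast
#
# A user-submitted workspace whose volumeClaimTemplate omits storageClassName...
workspaces:
- name: ws
  volumeClaimTemplate:
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
# ...would get the default injected before the PVC is created:
#       storageClassName: tekton-fast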

Concretely... the default-pipeline-run-storage-class would inject its value as a new storage-class field in any PipelineRun workspace volumeClaimTemplates that do not already provide one.

I like this idea and I'm now thinking we should do both this and a default-pipeline-run-workspace-binding.
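Put together, the defaults ConfigMap could then carry both knobs; a sketch with hypothetical key names (ConfigMap values must be strings, hence the block scalar for the binding):

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-defaults
  namespace: tekton-pipelines
data:
  default-pipeline-run-storage-class: tekton-fast
  default-pipeline-run-workspace-binding: |
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi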
