Pipeline: Feature Request: Pooled PersistentVolumeClaims

Created on 20 Oct 2020 · 6 comments · Source: tektoncd/pipeline

Feature request

There should be a way to select a PersistentVolumeClaim from a pool as the workspace binding when creating PipelineRuns and TaskRuns.

Ideally, there should be a way to dynamically grow the pool: if there are no PVCs available in the pool, a new one gets created and added to the pool. This implies that there should be some way to expire these PVCs as well.

Use case

There are a couple of use cases that I can think of:

  1. Running multiple PipelineRuns in parallel for the same Pipeline, where each PipelineRun receives a volume that persists across runs (like a cache). Today, this is not possible unless the volume is mounted as ReadWriteMany or ReadOnlyMany (assuming the storage backend supports it), or unless a separate entity outside of Tekton manages the pooling.
  2. Reducing the number of volumes being created and destroyed. Today, a common way to use volumes is a volumeClaimTemplate, which creates a PVC at the start of the run and deletes it when the run gets deleted (this pattern is sketched below). Using pooled PersistentVolumeClaims solves the following problems:

    • I might want to keep around my runs for historical purposes, but I don't want to keep the PVC around because it takes up space.

    • Creating/deleting a PVC incurs extra load on the Kubernetes API and the storage backend. I can reduce this load by simply re-attaching an existing PVC.
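
For reference, a minimal example of today's volumeClaimTemplate pattern (the run, pipeline, and workspace names are illustrative):

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: my-cached-build-run
spec:
  pipelineRef:
    name: my-pipeline
  workspaces:
    - name: my-workspace
      # A fresh PVC is created from this template for each run and
      # deleted along with the run, which is exactly the churn that
      # pooling would avoid.
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 8Gi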

kind/feature

All 6 comments

Worth noting that this feature request was also recently opened against Argo: https://github.com/argoproj/argo/issues/4130

I've tried searching for "kubernetes pvc pool" and "kubernetes storage pool" but haven't found anything. I wonder if this would also be worth looking at as a platform feature and raising with the k8s team.

I've tried searching for "kubernetes pvc pool" and "kubernetes storage pool" but haven't found anything. I wonder if this would also be worth looking at as a platform feature and raising with the k8s team.

That makes sense. Perhaps the ideal solution would be for Kubernetes to have a new resource representing a pooled PVC, plus a corresponding workspace binding on the PipelineRun. Something like the following:

---
apiVersion: v1
kind: PooledPersistentVolumeClaim
metadata:
  name: my-pvc-pool
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: my-pipeline-run
spec:
  pipelineRef:
    name: my-pipeline
  workspaces:
    - name: my-workspace
      pooledPersistentVolumeClaim:
        claimName: my-pvc-pool

Without native k8s support for this, there are perhaps two other approaches:

  1. Introduce a new CRD managed by Tekton representing a pooled PVC
  2. Support custom workspace bindings (similar to the custom task design) so Tekton users can bring their own pooled PVC provider
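
For illustration, a rough sketch of what option 2 could look like, mirroring how custom tasks reference an arbitrary apiVersion/kind. The custom binding field, the example.dev/v1alpha1 API group, and the PVCPool kind are all hypothetical:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: my-pipeline-run
spec:
  pipelineRef:
    name: my-pipeline
  workspaces:
    - name: my-workspace
      # Hypothetical: Tekton would delegate resolving this binding to
      # whatever controller owns the referenced kind; that controller
      # leases a PVC from its pool and reclaims it after the run.
      custom:
        apiVersion: example.dev/v1alpha1
        kind: PVCPool
        name: my-pvc-pool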

Curious about @skaegi's thoughts per our Slack conversation in https://tektoncd.slack.com/archives/CLCCEBUMU/p1603199756169700?thread_ts=1603139560.165600&cid=CLCCEBUMU

Support custom workspace bindings (similar to the custom task design) so Tekton users can bring their own pooled PVC provider

This would be desirable for other use cases as well. We've had requests to support more Workspace types, for example, and this could be one way to do that. It could also open the door to Workspace types that aren't Volume-backed, such as GCS or S3 buckets.

You generally would pool the backing storage for PVs and then write a storage provisioner to create the PVs dynamically.

I agree with this (from the Slack thread).

Storage pooling for PVCs is what cloud providers already do, in the layer under PVC/PV.

Storage pooling for PVCs is what cloud providers already do, in the layer under PVC/PV.

I'm definitely no expert on PVs. By "the layer under PVC/PV", are you referring to storage classes? And if so, how would this work? Are they able to provision stateful volumes? E.g. can I request a new PV that retains its file system from the last time I used it?
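
For what it's worth, one existing mechanism at that layer (retention rather than pooling as such): a StorageClass's reclaimPolicy controls whether a PV and its data outlive the PVC. The class name and provisioner below are just examples:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retained-storage
provisioner: kubernetes.io/gce-pd  # example in-tree provisioner
# Retain keeps the PV and its file system after the PVC is deleted,
# but the PV ends up in the Released phase; rebinding it to a new PVC
# means clearing spec.claimRef manually or via a controller, which is
# roughly what a pooling provisioner would automate.
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer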

I guess you could do some really clever re-use of PVCs, but that is essentially re-doing what a storage provisioner does. It might even be possible to write a storage provisioner that re-uses another storage provisioner's PVs (or underlying storage); I do not know of active work in that area, but that could be cool ;)

In our world we solved the problem a little differently. We found that in our provider-managed clusters, the PVs allocated when using the "default" storage class (and all the other storage classes) were ridiculously slow and expensive. They're generally designed for 500G+ of storage and double-digit IOPS, and can take minutes to allocate. Being cheap and wanting good performance, we wrote a "local" provisioner that does pseudo-dynamic provisioning. Our integration to use it as the backing storage for workspaces is a bit messy, but some of the work that @jlpettersson did really helps. Maybe this would too -- https://github.com/tektoncd/pipeline/issues/2595#issuecomment-702331103

A few weeks back I wondered aloud whether Tekton could optionally package a basic storage provisioner like ours (e.g. as a new experimental project), since otherwise using "workspaces" is painful/expensive, but I never took it further.
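
For context, a minimal sketch of the kind of statically provisioned local PersistentVolume a provisioner like that typically manages (the path, node name, and storage class name are illustrative):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  # Retain lets the on-disk data survive claim deletion for re-use.
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1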
