There should be a way to select a PersistentVolumeClaim from a pool as the workspace binding when creating PipelineRuns and TaskRuns.
Ideally, there should be a way to dynamically grow the pool: if no PVCs are available in the pool, a new one gets created dynamically and added to it. This implies that there should be some way to expire these PVCs as well.
There are a couple of use cases I can think of:
Worth noting that this feature request was also recently opened against Argo: https://github.com/argoproj/argo/issues/4130
I've tried searching for "kubernetes pvc pool" and "kubernetes storage pool" but haven't found anything. I wonder if this would also be worth looking at as a platform feature and raising with the k8s team.
That makes sense. Perhaps the ideal solution would look something like Kubernetes having a new Resource representing a pooled PVC, and there being a workspace binding to the PipelineRun. Something like the following:
```yaml
---
apiVersion: v1
kind: PooledPersistentVolumeClaim
metadata:
  name: my-pvc-pool
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
spec:
  pipelineRef:
    name: my-pipeline-run
  workspaces:
  - name: my-workspace
    pooledPersistentVolumeClaim:
      claimName: my-pvc-pool
```
Without native k8s support for this, perhaps the other approaches here are:
Curious about @skaegi's thoughts per our Slack conversation in https://tektoncd.slack.com/archives/CLCCEBUMU/p1603199756169700?thread_ts=1603139560.165600&cid=CLCCEBUMU
- Support custom workspace bindings (similar to the custom task design) so Tekton users can bring their own pooled PVC provider
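A custom workspace binding might look something like the sketch below. To be clear, the `customWorkspaceBinding` field, the `example.dev` API group, and the `PVCPool` kind are all hypothetical; the shape mirrors how custom tasks reference an arbitrary resource by `apiVersion`/`kind`/`name` and let a third-party controller do the work:

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: my-pipeline-run
spec:
  pipelineRef:
    name: my-pipeline
  workspaces:
  - name: my-workspace
    # Hypothetical field: delegate the binding to an out-of-tree
    # controller, analogous to the custom task design.
    customWorkspaceBinding:
      apiVersion: example.dev/v1alpha1
      kind: PVCPool
      name: my-pvc-pool
```

Tekton itself would only need to resolve the referenced resource and wait for the controller to report a concrete volume (or other storage) to mount.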
This would be desirable for other use cases as well. We've had requests to support more Workspace types for example, and this could be one way to do that. It could also open the door to Workspace types that aren't Volume-backed, such as using GCS / S3 buckets instead.
You generally would pool the backing storage for PVs and then write a storage provisioner to create the PVs dynamically.
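As a sketch of what that looks like at the Kubernetes API level: a StorageClass names a provisioner, and the provisioner is the layer where pooling of backing storage would live. The provisioner name below is illustrative, not a real driver:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pooled-ssd
# The provisioner creates PVs on demand; a custom one could hand out
# volumes carved from pre-allocated backing storage instead.
provisioner: example.dev/pooled-provisioner  # illustrative name
reclaimPolicy: Retain              # keep the PV (and its data) after the PVC is released
volumeBindingMode: WaitForFirstConsumer
```

PVCs that set `storageClassName: pooled-ssd` would then be satisfied from the pool without Tekton knowing anything about it.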
I agree with this. (from the slack thread).
Storage pooling for PVCs is what cloud providers already do for this, in the layer under PVC/PV.
I'm definitely no expert on PVs. By "the layer under PVC/PV", are you referring to storage classes? And if so, how would this work? Are they able to provision stateful volumes? E.g. can I request a new PV that retains its file system from the last time I used it?
I guess you could do some really clever re-use of PVCs, but that is essentially re-doing what a storage provisioner does. It might even be possible to write a storage provisioner that re-uses another storage provisioner's PVs (or underlying storage). I don't know of any active work in that area, but it could be cool ;)
In our world we solved the problem a little differently. We found that in our provider-managed clusters, the PVs allocated when using the "default" storage class (and all the other storage classes) were ridiculously slow and expensive: they're generally designed for 500G+ of storage and double-digit IOPS, and can take minutes to allocate. Being cheap and wanting good performance, we wrote a "local" provisioner that does pseudo-dynamic provisioning. Our integration to use it as the backing storage for workspaces is a bit messy, but some of the work that @jlpettersson did really helps. Maybe this would too -- https://github.com/tektoncd/pipeline/issues/2595#issuecomment-702331103
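For reference, the statically provisioned local PVs such a setup ends up creating look roughly like this; the node name and disk path are placeholders, and a pseudo-dynamic provisioner is essentially a controller that stamps out these objects on demand:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-local
provisioner: kubernetes.io/no-provisioner  # static local PVs only
volumeBindingMode: WaitForFirstConsumer    # bind once the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-local
  local:
    path: /mnt/disks/ssd0            # placeholder path
  nodeAffinity:                      # local volumes must be pinned to a node
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1                   # placeholder node name
```

Because the volume is a directory on a node's local disk, it is fast and cheap, but the pod is tied to that node, which is part of why integrating this with workspaces gets messy.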
A few weeks back I wondered aloud whether Tekton could optionally package a basic storage provisioner like ours (e.g. as a new experimental project), since otherwise using workspaces is painful/expensive, but I never took it further.