Zero-to-jupyterhub-k8s: Experiment and add pointers to storage options (NFS Ganesha, Rook)

Created on 15 May 2018 · 10 comments · Source: jupyterhub/zero-to-jupyterhub-k8s

It would be great to integrate a storage solution in the z2jh Helm chart that does not lock us into a particular cloud provider. The Rook project could allow us to do this!

Note that a very common challenge is providing user storage that can be read and written by multiple users at the same time (ReadWriteMany). NFS could support that, but perhaps CephFS, implemented efficiently on the cluster by Rook without cloud vendor lock-in, would be better?

Relevant presentations from KubeCon 2018

Recent relevant presentation from KubeCon 2018 (December)

/cc: @yuvipanda :heart: for introducing me to NFS options like Rook on Gitter before.

architecture


All 10 comments

Thanks for the links, @consideRatio!

I think we should definitely expand our storage section to include options like Rook / NFS. I would prefer links to guides for Rook rather than having that content inline - supporting and running any Storage Solution is Extremely Serious Business that you shouldn't do unless you absolutely have to.

CephFS is Ceph + user-space NFS (NFS Ganesha), so it isn't that much different from running NFS Ganesha on top of cloud provider storage - which I would recommend for its simplicity.

@yuvipanda ah excellent input, I'm really happy about being able to draw from your experience! I'm aiming to put in a lot of learning effort on this.


I am +1 to adding links to Z2JH!

@consideRatio Rook is definitely an interesting project, which I plan to eventually try out. Having said that, I would suggest being careful with the idea of using Rook as a backend for Z2JH's persistent storage. Here is my rationale and some related thoughts (take them with a grain of salt):

  • Unlike Kubernetes, which is clearly the _de facto_ standard, Rook is not a mainstream project
  • Even when/if Rook becomes mainstream (or, at least, popular enough), focusing on a single persistence project would, in my opinion, introduce a vendor-lock-in-like project and deployment dependence
  • NFS support in Rook is a work in progress, still in an early phase and with no clear target time frame
  • Rook's current focus on Ceph as a storage backend introduces dependence on Ceph, which would increase the minimal and recommended resource requirements (that is, multiple nodes are needed) for Z2JH deployments; a similar concern applies to CephFS- and GlusterFS-based K8s solutions
  • TL;DR: I suggest using K8s' NFS ReadWriteMany solution as the standard Z2JH approach, while providing instructions on using other relevant solutions (CephFS, GlusterFS) for enabling Z2JH persistent storage (a sketch of the plain-NFS approach follows below)
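
To make the TL;DR concrete, here is a minimal sketch of the plain-Kubernetes NFS approach: a pre-existing NFS export wrapped in a ReadWriteMany PV/PVC pair. The server address, export path, and sizes are placeholders, not something from this thread:

```yaml
# Sketch only: expose an existing NFS export as a ReadWriteMany volume.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-home
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.2        # placeholder: the NFS server's address
    path: /export/home      # placeholder: the exported directory
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-home
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""       # bind to the static PV above, not a provisioner
  volumeName: nfs-home
  resources:
    requests:
      storage: 100Gi
```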

I think this is why it's a good idea to link out to those docs / projects, rather than to "officially" support them with our own instructions. We can give people general tips, but leave it at that.

I've learned some more now. I still lack a lot of knowledge about NFS solutions.

About storage types

  1. User storage should not be provided by _object storage_, but could be provided by _block storage_ (ReadWriteOnce) for private user storage or _file system storage_ (ReadWriteMany) for shared user storage (see the access-mode sketch below).
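
For concreteness, the difference mostly shows up in the accessModes a PVC requests; a minimal sketch (names and sizes are arbitrary):

```yaml
# Private per-user storage: block storage, mounted by one pod at a time.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-private
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Shared storage: file system storage, mountable by many pods at once.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-shared
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
```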

About Rook

  1. Rook can provide us with a block storage _storageclass_, but does not yet allow us to use PVs referenced by PVCs as its underlying storage (see the sketch after this list for how such a storageclass would be consumed).
  2. Rook can provide us with a file system storage storageclass supporting the access mode ReadWriteMany, but it isn't convenient.
  3. Rook's use of _erasure coded storage_ can allow you to reduce the required amount of storage to 1.5x or 2x of the actually used storage (from 3x, I think), but it would also degrade performance and add complexity to the setup.
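
If Rook does hand you a block storage storageclass (point 1 above), the z2jh side of it would just be pointing user storage at that class. A sketch, assuming a storageclass named `rook-ceph-block` - the actual name depends entirely on how Rook was set up:

```yaml
# z2jh values sketch: request per-user block volumes from a Rook storageclass.
singleuser:
  storage:
    type: dynamic
    capacity: 10Gi
    dynamic:
      storageClass: rook-ceph-block   # placeholder name from the Rook setup
```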

About NFS

  1. NFS is older but well-tested tech; most people seem to move away from it if possible.
  2. The following Helm chart exists to set up NFS from scratch (see the sketch after this list for how it could be wired into z2jh).

    • NFS provisioner Helm chart: https://github.com/helm/charts/tree/master/stable/nfs-server-provisioner

    • WARNING: I don't think it supports running multiple replicas: as far as I can tell, that would create new, decoupled NFS servers that are still reached through the same Kubernetes Service resource. My theory is that you could end up with different storage across restarts of a pod consuming it. This means this solution relies on a single pod staying up, as compared to Rook's solution, as far as I understand it.

  3. GCP has managed NFS (Cloud Filestore).
  4. What to utilize?
  5. How to do it?
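
A sketch of how the nfs-server-provisioner chart (item 2) could be wired into z2jh. The values below follow that chart's README as I read it, and the `nfs` storageclass name is its usual default - verify both against the chart before relying on this:

```yaml
# Values sketch for stable/nfs-server-provisioner: back the single NFS server
# pod with a persistent disk and expose an "nfs" storageclass.
persistence:
  enabled: true
  size: 200Gi
storageClass:
  name: nfs
---
# Corresponding z2jh values sketch: per-user volumes come from that class,
# carved out of the one NFS-backed disk.
singleuser:
  storage:
    type: dynamic
    dynamic:
      storageClass: nfs
```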

In reality, I'm investigating all this because I'm struggling to grasp whether the following pain point...

  1. When your utilization % (% of total users active at any time) is very low, you end up spending more on storage than on compute.

... could be resolved by:

  • Rook's block storage...
  • Rook's shared file system storage...
  • Google's Filestore + nfs-client-provisioner...
  • @yuvipanda's nfs-flex-volume along with something more...
  • Google Filestore, ubuntu boot image, init containers - and this solution here by @yuvipanda: https://github.com/pangeo-data/dev.pangeo.io-deploy/issues/25

My single requirement is to consume only 1.23 GB of persistent storage for a user who has written only 1.23 GB of stuff to storage.

The advantage of NFS is that it's relatively easy to set up, it's good enough for user storage, and you only need a single volume.
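
Roughly, the single-volume setup could look like this in z2jh values - a sketch, where `nfs-home` is a placeholder for the claim on that one NFS-backed volume:

```yaml
# Sketch: all users share one NFS-backed PVC, each getting a subdirectory.
singleuser:
  storage:
    type: static
    static:
      pvcName: nfs-home            # placeholder: the shared NFS-backed claim
      subPath: "home/{username}"   # per-user directory inside the volume
```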

The main disadvantage of dynamic NFS provisioners is that NFS is not recommended for sqlite, so the Hub needs some other storage.
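
In other words, you'd keep the hub's sqlite database on ordinary block storage; a sketch, with `standard` standing in for whatever non-NFS storageclass the cluster provides:

```yaml
# Sketch: hub database on a regular (non-NFS) dynamically provisioned PVC.
hub:
  db:
    type: sqlite-pvc
    pvc:
      storageClassName: standard   # placeholder non-NFS storageclass
```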

Another factor is whether you care about backups and data recovery. If you do, you'll need to know which filesystems need to be backed up. For NFS it's relatively easy since everything is in one volume, as long as you've got a copy of the Kubernetes PV/PVC objects. For distributed filesystems it's complicated.

Rook has progressed since this issue was first opened; I think it is now possible to use it with PVCs, dynamic provisioning, etc., which was one hurdle for me when I considered using Rook.

https://github.com/rook/rook/issues/2107

Closing this in favor of a summary referencing this issue.
