It would be great to integrate a storage solution in the z2jh helm chart that does not lock us in to a certain cloud provider. The Rook project could allow us to do this!
Note that a very common challenge is providing user storage that multiple users can read and write at the same time (ReadWriteMany). NFS could support that, but perhaps CephFS, implemented efficiently by Rook on the cluster without cloud vendor lock-in, would be better?
/cc: @yuvipanda :heart: for introducing me to NFS options like Rook on Gitter before.
Thanks for the links, @consideRatio!
I think we should definitely expand our storage section to include options like Rook / NFS. I would prefer links to guides for Rook rather than having that content inline - supporting and running any Storage Solution is Extremely Serious Business that you shouldn't do unless you absolutely have to.
CephFS is Ceph + user space NFS (NFS Ganesha) so it isn't that much different than running NFS Ganesha on top of cloud provider storage - which I would recommend for its simplicity.
@yuvipanda ah excellent input, I'm really happy about being able to draw from your experience! I'm aiming to put in a lot of learning effort on this.

I am +1 to adding links to Z2JH!
@consideRatio Rook is definitely an interesting project, which I plan to eventually try out. Having said that, I would suggest being careful with the idea of using Rook as a backend for Z2JH's persistent storage. Here is my rationale, with some related thoughts (take them with a grain of salt):
I think this is why it's a good idea to link out to those docs / projects, rather than to "officially" support them with our own instructions. We can give people general tips, but leave it at that.
I've learned some more now. I still lack a lot of knowledge about NFS solutions.
ReadWriteOnce) for private user storage or _file system storage_ (ReadWriteMany) for shared user storage.
ReadWriteMany, but it isn't convenient.
In reality, I investigate all this because I'm struggling to grasp if the following pain point...
- When your utilization % (% of total users active at any time) is very low, causing you to spend more on storage than compute.
... could be resolved by:
My single requirement is to only consume 1.23 GB of persistent storage for a user that has only written 1.23 GB of stuff to storage.
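To put rough numbers on that requirement, here is a minimal sketch comparing storage that is provisioned up front per user with storage that is only consumed as users write to one shared volume. The user count, the 10 GB volume size, and the per-user usage figures are made-up assumptions for illustration, not measurements from any deployment.

```python
from typing import Sequence


def provisioned_gb_per_user_volumes(num_users: int, volume_size_gb: float) -> float:
    """Storage paid for when every user gets a fixed-size volume up front."""
    return num_users * volume_size_gb


def consumed_gb_shared_volume(written_gb: Sequence[float]) -> float:
    """Storage actually used when all users write into one shared volume."""
    return sum(written_gb)


# Hypothetical figures: four users, one of whom wrote exactly 1.23 GB.
written = [1.23, 0.0, 4.5, 0.27]
per_user_total = provisioned_gb_per_user_volumes(len(written), volume_size_gb=10.0)
shared_total = consumed_gb_shared_volume(written)

print(f"per-user volumes: {per_user_total:.2f} GB provisioned")
print(f"shared volume:    {shared_total:.2f} GB consumed")
```

The shared volume itself still has to be sized with some headroom, so the real saving depends on how that single volume is provisioned and billed.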
The advantage of NFS is that it's relatively easy to set up, it's good enough for user storage, and you only need a single volume:
The main disadvantage of dynamic NFS provisioners is that they're not recommended for SQLite, so the Hub needs some other storage.
Another factor is whether you care about backups and data recovery. If you do, you'll need to know which filesystems need to be backed up. For NFS it's relatively easy, since everything is in one volume, as long as you've got a copy of the Kubernetes PV/PVC objects. For distributed filesystems it's complicated.
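For readers unfamiliar with how a shared volume is requested, a minimal sketch of a PersistentVolumeClaim with the ReadWriteMany access mode, built as a plain Python dict and printed as JSON (which `kubectl apply -f` also accepts). The claim name `shared-user-storage`, the storage class `nfs-client`, and the `10Gi` size are hypothetical placeholders; the storage class in particular depends on which provisioner is installed in your cluster.

```python
import json


def build_rwx_claim(name: str, storage_class: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest that requests ReadWriteMany access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets pods on several nodes mount the volume read/write.
            "accessModes": ["ReadWriteMany"],
            # Assumption: a storage class backed by an NFS-style provisioner exists.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size, for illustration only.
manifest = build_rwx_claim("shared-user-storage", "nfs-client", "10Gi")
print(json.dumps(manifest, indent=2))
```

Whether the claim binds depends on the backing storage actually supporting ReadWriteMany; many block-storage classes only offer ReadWriteOnce.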
Rook has progressed since this issue was first opened. I think it is now possible to use it with PVCs and dynamic provisioning etc., which was one hurdle for me when I considered the use of Rook.
Closing this in favor of a summary referencing this issue.