I was doing some research and found that Tsuru has a way to do persistent storage, but there's some manual work involved. You can configure Tsuru to create directories for each provisioned app (e.g. /apps-data/drupal), but those directories only exist on the host that the container is running on. Some users have worked around this by pointing Tsuru at /mnt/my-storage-service instead (usually NFS or GlusterFS), so that all of their Tsuru nodes see the same files.
If Flynn could be configured to do the same thing and not worry about doing HA with uploaded files (just leave it to the cluster administrator), I'd be pretty excited. That way, I could use NFS or EFS or Ceph or Gluster or whatever; not being bound to a specific technology means I'd be able to use Flynn in more places. Some orgs have their own distributed FS that they'd want to use instead of whatever Flynn would provide, and chances are, that FS would already be mounted on the Flynn nodes.
If we want to go really crazy with it, give me some way to execute a shell script when apps are created or destroyed. One of the organizations that I've done some work with before has a really big AFS deployment. They'd be able to use that functionality to create an AFS volume with a given FS quota, mount it in the right place on the Flynn nodes, and use that for the app, and then be able to clean it up after the app is destroyed. That's a bit outside the scope of my personal needs, but I could see it being useful.
@cweagans Flynn provides persistent storage via ZFS volumes, and ZFS also supports sharing via NFS, so there is an opportunity there for Flynn to natively support sharing persistent storage across the cluster without user intervention.
That being said, we are aware that a lot of users will want to manage the sharing outside of Flynn, so we should also think of ways to make it easy to mount directories from external shares inside Flynn jobs.
I wasn't aware of that. How would I go about setting up a ZFS volume for my app?
:+1: This would be fantastic for a number of my use-cases.
In the same vein, it would be good to support EBS and other persistent block storage systems. They differ slightly from NFS in that they usually can't support concurrent access, but they have similar qualities: they are generally accessible from all hosts in the cluster, and they would need a tracking mechanism that is distinctly separate from how ZFS volumes are currently handled.
Are you able to advise if and when this might be implemented? I have an app that will need to use NFS, and I need to know if it will be possible anytime soon.
@dottodot Work on this is currently unscheduled.
OK, so normally within Node.js you can save a file anywhere on the system. For example, I can do:
fs.writeFile("/users/marcus/documents/test.txt", "Hey there!", function(err) {
if(err) {
return console.log(err);
}
console.log("The file was saved!");
});
and with Express I use express.static so I can then access the files, e.g.
app.use('/documents', express.static('/users/marcus/documents'));
However, when trying the same thing on the cluster with a path such as /mnt/uploads, I get "no such file or directory", so I'm assuming it's not possible to access the filesystem outside of the app.
You've mentioned that ZFS supports sharing via NFS, so is this something I can set up so my app can access the mounted drive, or will it only be possible if you integrate it into Flynn? I can't really get my head around ZFS enough to understand what is possible.
@dottodot ZFS volumes are local to the host, so sharing files between hosts would require quite a bit of work. Our solution for shared cluster storage will be NFS mounts, but the best option is to use a blobstore like S3 to store your files, instead of relying on the local filesystem.
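For example, instead of the fs.writeFile call above, the upload could go straight to a bucket. A minimal sketch, assuming the aws-sdk (v2) npm package, credentials supplied via the environment, and a hypothetical bucket name:

var AWS = require('aws-sdk');
var s3 = new AWS.S3(); // picks up credentials and region from the environment

// Upload the file contents to a bucket instead of writing to the local disk.
s3.upload({
  Bucket: 'my-app-uploads',   // hypothetical bucket name
  Key: 'documents/test.txt',
  Body: 'Hey there!'
}, function(err, data) {
  if (err) {
    return console.log(err);
  }
  console.log('The file was saved to ' + data.Location);
});

Since the object lives in S3 rather than on any one node's filesystem, any instance of the app on any host can read it back, regardless of where the container is scheduled.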
Yeah, I would normally go down the S3 type of route, but I have an old app that uses the local filesystem, so it would have been much easier to migrate to Flynn had I been able to continue using the local filesystem.
Going to have to have a rethink.
I'm in the same situation and I'm waiting for https://github.com/flynn/flynn/issues/2438 which solves this as far as I can see.
@philiplb That project only covers a best effort attempt to reuse volumes for database appliances. It will provide no guarantees and does not do any cross-host replication.
Maybe one of these is any good?
http://www.xtreemfs.org/
http://www.gluster.org/
http://openafs.org/
http://ceph.com/
http://lustre.org/
As someone who has worked extensively with AFS, I can safely say that we do _not_ want that. Gluster or Ceph are reasonably good, but this is not the issue for discussing general Flynn-hosted distributed filesystems. This is specifically about adding NFS support, which would allow you to use Amazon EFS and the like.
Hi, what is the status of this feature? Is there a WIP branch? I'd like to help with some testing/coding.
That would be very helpful for me in a current deployment.
Instead of just NFS support, would using Portworx not be an idea? It would also give us cool features like snapshots, backups & encryption. I've used it before with Rancher and k8s, and it requires etcd. It provides an easy-to-use Docker volume driver (pxd), and adding the shared=true driver option even allows the volume to be mounted multiple times.
I'm new to this project and love the simplicity. I too am missing the feature to have shared volumes between all the instances, and I'd love to help. Reading through this repo and its issues, am I correct in understanding that we already run an etcd cluster? Could anyone point me to the code where we start running the Docker containers and where a volume driver could be specified?
@iain17 Flynn does not run etcd as a system application, but you can deploy it using a Docker image. As for portworx, I suspect work would be needed on their end to integrate with the Flynn scheduler.
@lmars reading their docs, I see they also document how this would be done without a scheduler (https://docs.portworx.com/scheduler/docker/install-standalone.html). Maybe I'm naive in thinking it would be as easy as telling the Flynn scheduler to run a Portworx container as described in the link, like how it is done right now with the system apps? I mean, all they do on their end is mostly document how it could be done.
Anyway, it was just an idea. Could anyone point me to the code where we start running the Docker containers and where a volume driver could be specified?
@iain17 Flynn does not run containers using the Docker daemon, it has its own container runtime and volume API so work would be needed to integrate portworx.
We ended up solving this for our use case by handling remote storage in our app; specifically, we used scp. I would imagine that a FUSE-based mount would also be possible to get working?
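Roughly, the app writes the upload locally and then copies it to a storage host over SSH. A minimal sketch of that idea, assuming the ssh2 npm package (using SFTP rather than a literal scp invocation) and a hypothetical storage host, paths, and key:

var fs = require('fs');
var Client = require('ssh2').Client;

var conn = new Client();
conn.on('ready', function() {
  // Open an SFTP session and copy the locally-written file to the storage host.
  conn.sftp(function(err, sftp) {
    if (err) throw err;
    sftp.fastPut('/tmp/test.txt', '/srv/uploads/test.txt', function(err) {
      if (err) throw err;
      console.log('The file was copied to the storage host');
      conn.end();
    });
  });
}).connect({
  host: 'storage.example.com',   // hypothetical storage host
  username: 'app',
  privateKey: fs.readFileSync('/path/to/deploy_key')
});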
@lmars this project might spark your interest as a solution for this issue: https://github.com/openebs/openebs
As a workaround, would it be possible to use something like https://github.com/s3fs-fuse/s3fs-fuse, i.e. mount a shared S3 bucket to a given path on all instances of an application within the cluster?