Proposal: machine share

Created on 30 Dec 2014 · 68 comments · Source: docker/machine

machine share

Abstract

machine is shaping up to be an excellent tool for quickly and painlessly creating and managing Docker hosts across a variety of virtualization platforms and hosting providers. For a variety of reasons, syncing files between the computer where machine commands are executed and the machines themselves is desirable.

Motivation

Containers are a great tool and unit of deployment and machine makes creating a place to run your containers easier than ever. There are, however, many reasons why one would want to share files which are not pre-baked into the Docker image between their machine client computer and a machine host. Some examples are:

  1. The machine host is a VM on the user's laptop and they are developing a web application inside of a container. They want to develop with the source code bind-mounted inside of a container, but still edit on their computer using Sublime etc.
  2. The user wants to spin up 10 hosts in the cloud and do some scientific computing on them using Docker. They have a container with, say, the Python libraries they need, but they also need to push their .csv files up to the hosts from their laptop. This user does not know about the implementation details of machine or how to find the SSH keys for the hosts etc.
  3. There is some artifact of a build run on a machine (e.g. one or many compiled binaries) and the user wants to retrieve that artifact from the remote.

Like the rest of machine, it would be preferable to have 80% of use cases where this sort of thing happens integrated seamlessly into the machine workflow, while still providing enough flexibility to users who fall into the 20% not covered by the most common use cases.

Interface

After thinking about the mechanics and UX of this, I think we should favor explicitness over implicitness and err on the side of _not_ creating accidental or implicit shares.

There are a few aspects to the story that should be considered.

Command Line

The syntax would be something like this:

$ pwd
/Users/nathanleclaire/website

$ machine ls
NAME           ACTIVE   DRIVER         STATE     URL
shareexample   *        digitalocean   Running   tcp://104.236.115.220:2376

$ machine ssh -c pwd
/root

$ machine share --driver rsync . /root/code
Sharing /Users/nathanleclaire/website from this computer to /root/code on host "shareexample"....

$ machine share ls
MACHINE      DRIVER SRC                           DEST
shareexample rsync  /Users/nathanleclaire/website /root/code

$ ls

$ echo foo >foo.txt

$ ls 
foo.txt

$ machine ssh -c "cat foo.txt"
cat: foo.txt: No such file or directory
FATA[0001] exit status 1

$ machine share push
[INFO] Pushing to remote...

$ machine ssh -c "cat foo.txt"
foo

$ machine share --driver scp / /root/client_home_dir 
ERR[0001] Sharing the home directory or folders outside of it is not allowed.  To override use --i-know-what-i-am-doing-i-swear

IMO we should forbid users from creating shares to or from outside of the home directory of the client _or_ the remote host. There's a strong argument that the home directory itself should be banned from sharing as well, to prevent accidental sharing of files which should be moved around carefully such as ~/.ssh and, of course, ~/.docker. Also, clients could share directories to multiple locations, but any shares which point to the same destination on the remote would be disallowed.

Totally open to feedback and changes on the UI, this is just what I've come up with so far.

Drivers

There is a huge variety of ways to get files from point A to point B and back, so I'd propose a standard interface that drivers have to implement to be recognized as an option for sharing (just like we have done with virtualization / cloud platforms). The default would be something like scp (since it is simple and pretty ubiquitous), and users would be able to specify a different driver to suit their needs. Additionally, this would allow the machine team to start with a simpler core and move forward later, e.g. just rsync, scp, and vboxsf could be the options in v1.0 and other drivers could be added later.

Some possible drivers: scp, vboxsf, fusionsf, rsync, sshfs, nfs, samba

This would be useful because different use cases call for different ways of moving files around. NFS might work well for development in a VM, but you might want rsync if you are pushing large files to the server frequently, and so on.

Part of the interface for a share driver would be some sort of IsContractFulfilled() method which returns a boolean indicating whether the "contract" necessary for the share to work is fulfilled by both the client and the remote host. This would allow us to, for instance, check if rsync is installed on the client machine and the remote host, and refuse to create a share if that is not the case. Likewise, it would prevent users from trying to do something silly like using the vboxsf driver on a non-VirtualBox host.
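
To make the contract check concrete, here is a hypothetical session (the error text and failure modes shown are illustrative sketches, not an implemented interface):

$ machine share --driver rsync . /root/code
ERR[0000] rsync is not installed on host "shareexample"; share contract not fulfilled

$ machine share --driver vboxsf . /root/code
ERR[0000] driver "vboxsf" requires a VirtualBox host, but "shareexample" uses the digitalocean driver

The check would run against both the client and the remote host before the share is created, so failures surface up front rather than on the first push or pull.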

Possible Issues

  • If machine moves to a client-server model, which seems favored at the moment, it introduces additional complications to the implementation of machine share. Namely, machine share would be all client-side logic, and would not be able to make the same assumptions it may make today e.g. that the SSH keys for the hosts are all lying around ready to be used on the client computer.
  • machine share push and machine share pull make a ton of sense for some drivers, such as rsync, but not so much sense for vboxsf or nfs which update automatically. What happens when a user does machine share push on such a driver? "ERR: This driver is bidirectional"?
  • This is likely to have a pretty big scope, so we might want to consider making it a separate tool or introducing some sort of model for extensions that could keep the core really lean.
  • How are symlinks handled? Permissions? What happens if you move, rename, or delete the shared directory on the client, or on the remote?

Other Notes

I am not attached to the proposal; I am just looking for feedback and to see if anyone else thinks this would be useful. I have put together some code which defines a tentative ShareDriver interface, but it is not really to my liking (and no _real_ sync functionality is implemented) so I haven't shared it.

kind/enhancement

All 68 comments

@nathanleclaire thanks for the proposal -- well written :)

A couple of initial thoughts: the idea of sharing I think makes sense for vbox (similar to what is recommended for fig) and we already do that for the vbox driver. However, outside of that, IMO they should be Docker images. I think it would be handy to be able to copy files (perhaps a machine scp or something) but the idea of pushing/syncing starts to enter into the system management area and I'm not sure we want to go there.

Are there other use cases you could think of besides app code?

To be clear, I'm not arguing against it, just trying to better understand the usage :)

+1 for machine scp

The basic reasoning that many users have given is that when they are developing, they want the source code to be on their local machine; that way, when they blow away the remote instance, they don't have to worry about the files going away.

I reckon this concern will exist for those of us that will use ephemeral cloud Docker daemons just the same as for b2d-style ones.

so I'm very +1 to this - it's similar to what I started exploring in https://github.com/boot2docker/boot2docker-cli/pull/247

Perhaps local drivers could implement an arg for this. But cloud providers shouldn't have this.

scp and/or rsync would be good, same as current machine ssh.

@sthulb can you elaborate on why not? This will be a common question, and so will need to be in the docs (it's not that hard to envision an opportunistic rsync that could allow the server to happily run while the clients are gone)

Local machines are geared for dev environments. I feel like remote (cloud) machines are for production purposes and thus should have their data provisioned with data containers, or via chef/puppet/ansible/etc.

I can understand the requirement to upload a single file to a remote host though (various secrets).

I'm willing to sway if others feel like this is a thing we should have.

I have to agree with @sthulb. I think we should stay away from syncing/sharing in remote environments. I can see it being extremely powerful for local.

I use DigitalOcean as a dev environment. I also have 2 non-VM physical boot2docker servers that I use for development and testing, where my sshfs share driver for b2d-cli is very useful.

Perhaps this is a case for plugins.

@sthulb i think that's a brilliant idea

+1 - if you look at the PR I made in the b2d-cli repo, I copied the driver model, but starting machine with plugins would be nicer.

+1 for scp, and I agree with the comments above; this proposal looks useful for development machines (VirtualBox, Fusion), but on the other providers I think this could make quite a mess real fast.

+1 for scp (perhaps sftp?)

+1 for local VM shares, NFS is a portable option if driver-specific shares are awkward. Vagrant does this kind of thing well :) I'd love to see it in this project.

+1 for scp

To give interested parties an update on where my head is at with this one: I'd like to make a machine scp command which is simply for moving files from point A to point B one way via scp (I think I want a --delta flag for this too to optionally shell out to rsync instead of scp).

Then, for more complicated use cases e.g. NFS/Samba, there would be a separate (still docker-machine-centric) tool entirely, as the scope of managing this could get quite large.
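
As a rough sketch of how that could look (the --delta flag and the machine:path addressing are proposed here, not a final interface):

$ machine scp -r . dev:/home/docker/website
$ machine scp --delta -r . dev:/home/docker/website

The first form is a plain one-way copy via scp; the second would shell out to rsync so that repeated pushes only transfer changed files.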

I'm starting to think we should have a git-style interface, with executables named docker-machine-foo; in this case foo would be share.

This would be great for extending machine.

Just tried docker-machine v0.1.0 for the first time today. I have a workflow involving live updates to a host directory being propagated to a mounted volume within a container. I'm using the standard docker CLI arguments to achieve this.

  • this works fine if I use docker-machine create -d virtualbox dev (unsurprising since this builds upon the boot2docker work)
  • docker-machine create -d vmwarefusion dev results in an empty volume mount

I'm very impressed with docker-machine so far. My project's unit tests (ones unrelated to the mounted volume) passed just as they used to with boot2docker. Besides this issue, this was an extremely seamless transition from boot2docker to docker-machine.

@jokeyrhyme the issue with fusion is known and is being worked on. Thanks for the great feedback!

+1 for scp

@jcheroske It's in master now!! https://github.com/docker/machine/pull/1140

That's incredible. Noob question: Is there an OS X binary of the latest and greatest? My binary doesn't have it yet. I did figure out, digging around in the .docker dir, how the certs are stored. Then I configured my .ssh/config so that I could use regular ssh and scp. But having it built-in is much better.


@jcheroske You can always get the latest (master) binaries from here: https://docker-machine-builds.evanhazlett.com/latest/ :)

Are there any updates on this?

Windows, rsync and the fswatch problems: in order to reduce hopping through GitHub issues, I'll reference it explicitly:
https://github.com/boot2docker/boot2docker-cli/issues/66#issuecomment-120675231

In a nutshell, for serious development you need inotify (or corresponding) functionality to work. fswatch implements an interesting library, but it does not support Windows' FileSystemWatcher API. A good starting point on this issue is https://github.com/thekid/inotify-win/, but: thekid/inotify-win#7

I've been tearing my hair out trying to get volumes to work with docker-compose (and docker-machine).

My goal is to have an automated way to spin up website servers (for use with horizontal scaling), and Docker seems a good tool for this. At first I just cloned the web code from GitHub in the Dockerfile, so copies of the web code lived in each Docker image / container. This seemed to work fine. The trouble was how to pull in web code updates so that all the web servers would update. I could rebuild the Docker image and then re-push that, but as pointed out in this post, that's very slow. It seems much better to just mount a volume with the web code so that changes to the web code don't require rebuilding the entire Docker image.

http://blog.ionic.io/docker-hot-code-deploys/

So my question is: Is this even possible with docker-compose? Is docker-machine required as well?
I seem to be running into this problem: I simply cannot seem to successfully mount a volume.

http://stackoverflow.com/questions/30040708/how-to-mount-local-volumens-in-docker-machine

If this is the wrong issue / git repo for this question please point me to the proper location.

Thanks so much!

The trouble was how to pull in web code updates so that all the web servers would update. I could rebuild the Docker image and then re-push that, but as pointed out in this post, that's very slow.

Well, for production, mounting your code in a volume is really not good practice. Re-building an image from cache and pushing to / pulling from a colocated private registry, while not as fast as rsync, should not be too bad. Mounting a source-code volume into the container is dangerous for a variety of reasons: you no longer have reproducibility of your filesystem, which is one of Docker's strongest guarantees, and you open yourself up to a whole class of headaches such as permissions issues (files are owned by different users in the container and on the host), versioning issues (which version of the app is deployed again?), and security issues (if someone were to sneak a file into your volume, it would appear on the host in addition to the container).
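
As a sketch of that workflow (the registry address and tags are placeholders):

$ docker build -t registry.internal:5000/mysite:v2 .   # unchanged layers come from the build cache
$ docker push registry.internal:5000/mysite:v2

Then on each web server:

$ docker pull registry.internal:5000/mysite:v2
$ docker run -d registry.internal:5000/mysite:v2

Only the layers that actually changed move over the network, which keeps both the push and the pulls reasonably fast.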

I see, thanks so much for the prompt and detailed reply @nathanleclaire
I guess my question is thus: what is the point of volumes? For development (non-production) only?
Your points make sense and I had the same thoughts myself when transitioning from the working non-volume solution to trying to add a volume - in some ways adding the volume seemed to defeat the purpose of Docker in the first place. It seems maybe that is the case?

in some ways adding the volume seemed to defeat the purpose of Docker in the first place.

In a lot of cases, yes.

There are three main use cases for volumes:

  1. Reading/writing to AUFS / filesystem of choice is too slow for e.g. databases, so you can bypass the CoW filesystem by specifying -v /containerpath.
  2. You want to share files between the container and host, e.g. to build an artifact in a container and spit it out into the host filesystem, or to share a database backup/dump.
  3. Sharing directories between containers using --volumes-from.

A lot of people use volumes in development for "hot reloading" of code, since the build-restart cycle is too long, and arguably it's less bad there as long as you bake the code straight in on the way from CI => staging => production.
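
To make the three cases above concrete (image and volume names are placeholders):

# 1. Bypass the CoW filesystem for write-heavy data:
$ docker run -d -v /var/lib/postgresql/data postgres

# 2. Share files between host and container, e.g. collect a build artifact:
$ docker run -v $(pwd)/dist:/build/dist my-builder

# 3. Share directories between containers:
$ docker run -v /data --name datastore busybox true
$ docker run --volumes-from datastore my-app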

Got it, thanks @nathanleclaire
I switched back to no longer using volumes and things seem to be working!

:+1:

+1

+1

+1

About this use-case:

  1. The machine host is a VM on the user's laptop and they are developing a web application inside of a container. They want to develop with the source code bind-mounted inside of a container, but still edit on their computer using Sublime etc.

I think it will be better (from a UX point of view) to just fix Machine to work with volumes like users expect - use volumes as if there were no Machine, just the host (Win or Mac) and containers. A good reference for what people expect is here: boot2docker/boot2docker#581

+1 to this. Has any thought been given to a bidirectional tool like unison?

Any plans for releasing this???

+2

IMHO a simple volume driver as discussed in the linked issue would be just enough. No need for an extra command from a functional perspective: https://github.com/docker/docker/issues/18246

Maybe some sugar around Docker volume drivers (say, choosing the mechanism which fits the OS) is on another page and could be an interesting feature for 0-time dev kickstart buzzword1234.

I definitely think the future looks a lot more optimistic for managing this outside of Machine given that Docker volume plugins are now available.
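
For example, with a volume plugin installed on the Docker host, a share becomes a plain docker invocation and Machine never has to touch the sync mechanism itself (the plugin name here is hypothetical):

$ docker volume create -d some-nfs-plugin --name shared-code
$ docker run -v shared-code:/var/www/html php:7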

@nathanleclaire and which plugins, for example?

I (simply?) need to mount the folder where the Dockerfile is into my docker-machine VM or container. Just at development time. Just on my development machines (Windows, Mac OS X, Linux). How to?

In production Docker is perfect.

How to accomplish this?

@ginolon I use Docker Compose to define the whole stack. It also has a nice feature - you can put relative paths in volume bindings (relative to docker-compose.yml).

@iBobik Wow! Like Vagrant? Really?

@ginolon I don’t know how it works in Vagrant, but in Compose it is like this:

web:
  image: php:7
  ports:
    - "8012:80"
  volumes:
    - ./web-sources:/var/www/html

It will bind the web-sources directory that is in the same location as this docker-compose.yml.
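
One caveat worth spelling out (standard Compose behavior): the binding above only applies when the service is started through Compose, e.g.

$ docker-compose up -d web

Starting the image directly with docker run does not read docker-compose.yml, so no volume gets mounted.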


@iBobik I'm trying what you say. It works. But, there is a big BUT.

If I put in my "web-sources" folder a file (let's say "test.rb") with some text in it (like this: "duigyjkgsdkg") and then I run

docker run -itP docker_web bash

I find the web-sources folder, then in it I run

ruby test.rb

and it says to me:

test.rb:1:in `<main>': undefined local variable or method `duigyjkgsdkg' for main:Object (NameError)

Now I open my SublimeText in Windows, modify the test.rb file with this:

puts "Hello!"

but in bash in Docker it doesn't update the file content. Still the same error as before:

test.rb:1:in `<main>': undefined local variable or method `duigyjkgsdkg' for main:Object (NameError)

How do I do this?

I need to use my host machine to work in development. Like this.

@ginolon You have to use the Docker Compose utility to start it. The Docker client by itself doesn't work with docker-compose.yml.


@iBobik how do I start it with docker-compose?

If I write:

docker-compose run web bash

it says to me:

docker-compose run web bash
ERROR: Interactive mode is not yet supported on Windows.
Please pass the -d flag when using `docker-compose run`.

Docker is such a mess. I think it is not as good as it seems.

We are not in a Docker Compose support forum, so please don't discuss it here. People are subscribed to this issue because of another topic.

Read the tutorial about Docker Compose on docker.com. It is a great utility, you are just using it wrong.



+1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1

Current VirtualBox sharing is performance-limited and has permissions issues with some containers (PostgreSQL, for example).

+1

+1

So the solution is what? What should I do on my Mac? ヽ(゜Q。)ノ

Please write only constructive comments here. No "+1", because it gets sent to all subscribers and they will unsubscribe if there is too much noise.

If you want to support this issue, just subscribe to it.


Running this command works fine for copying docker-machine configurations between development computers on the same network:

rsync -chavzP --stats other-computer.local:~/.docker/machine/ ~/.docker/machine/ --exclude machines/default --exclude cache

+1

Finally I found the right issue. @nathanleclaire, I would bet CIFS/NFS Docker volumes, as implemented by https://github.com/gondor/docker-volume-netshare, would solve the problem immediately. However, those would need to be integrated seamlessly into boot2docker. Would that be possible?
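
For reference, a netshare-style mount looks roughly like this (recalled from that project's documentation, so check its README for the exact syntax):

# with the netshare daemon running on the Docker host:
$ docker run -it --volume-driver=nfs -v nfshost/exported/path:/mount ubuntu

The sync mechanism then lives entirely in the volume driver; boot2docker would only need to ship the plugin daemon.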

I think https://github.com/docker/docker/pull/20262 could be a step toward the final answer to this. With this PR, in 1.11, sharing would become a question of exposing the right nfs/samba shares on the hosts. Or doesn't github.com/docker/docker/pkg/mount support nfs/samba yet?

I use something along these lines to keep in sync:

# on every change event, push the working directory to the machine over rsync
fswatch -o -e '.git' . | while read num; do
    rsync -avzh -e "ssh -i /path/to/home/.docker/machine/machines/machine-name/id_rsa" \
        --progress --exclude '.git' . ubuntu@$(docker-machine ip machine-name):/home/ubuntu
done

edit: typo

Very interested in this feature as well, signing up for notifications.

IMO - based on the thread, it sounds like the easiest and most useful implementation would be to use standard drivers in the image to create local-only volume mounts, and not implement the feature remotely (if you are - you're sort of doing it wrong, right?).

Value for me would come from local, on-the-fly development against the contents of the container without any sftp / scp / rsync steps involved (or scripts like the one that MBuffenoir has set up)

Same here. This is a must have.

@in10se sorry to say, but Docker Machine seems to be a dead product. Development and innovation have stagnated, and I wouldn't count on it for production environments. I think HashiCorp Terraform or similar tools are the way to go.

@erkie, it does look like activity has slowed down in the past few months. Hopefully it will resume. Thanks for the suggestion.

What is the state of this proposal?

Not much going on here.
