Current PR #600 as initial details. This discussion is to consider future enhancements.
@stepro Yes, you are right that having access to the Docker daemon (on the same machine) effectively makes you root, so all those little security tricks are basically for nothing. With this in mind, perhaps we should not obsess about this here when trying to dockerise the agent.
The "more serious subject" was meant to be some use case and design ideas, both for the agent and the dockerised version of it. Some of these may make life easier and better for both MSFT and customers.
These are just ideas, not specifically directed at this PR.
In two steps:
_It would be valuable for the main use-case of the dockerised agent to kick off build tasks in customer-provided images._
This is what we do at the moment for some builds, and we are going to move all our builds to this approach (at least those not requiring Windows).
All our agent does is run `docker run -i OUR_BUILD_IMAGE our_script_in_the_container`. The agent maps the source directory into our build container as a volume.
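To make the volume mapping concrete, here is a minimal sketch of how such a command line might be assembled. The function name `build_docker_run`, the `/src` mount point, and the `--rm` and `-w` flags are assumptions added for illustration; only `docker run -i IMAGE script` and the volume mapping come from the description above.

```python
import shlex
from pathlib import Path


def build_docker_run(image, script, source_dir, mount_point="/src"):
    """Return the argv for running `script` inside `image` with the sources mounted."""
    # Resolve to an absolute host path, which is what a bind mount expects.
    source = Path(source_dir).resolve()
    return [
        "docker", "run",
        "-i",                             # keep stdin open, as in the command above
        "--rm",                           # assumption: discard the container after the build
        "-v", f"{source}:{mount_point}",  # map the source directory in as a volume
        "-w", mount_point,                # assumption: run the script from the mounted sources
        image,
        script,
    ]


if __name__ == "__main__":
    # Placeholder image and script names, taken from the comment above.
    argv = build_docker_run("OUR_BUILD_IMAGE", "our_script_in_the_container", ".")
    print(shlex.join(argv))
```

The sketch only prints the command; handing the same list to `subprocess.run(argv, check=True)` would execute it on a machine with Docker installed.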
Security is not something which is solved by this :(
I don't know if this would be in scope for the "dockerise the agent" piece of work. It does not need to be, as long as the result is "friendly" to this kind of scenario. It would require more than a bare Docker image anyway -- you'd need a bunch of scripts etc. I'm not sure it's possible to make it completely generic so that it caters for all customers' needs.
Also note that the pre-built VSTS tasks as they currently stand would not run in this scenario. For that -- see Grand Design further.
Not really to do with this PR but nevertheless if anyone is interested.
Many use cases are hard with the existing agent because of these three things:
1. Worker.
2. Listener and Worker are rolled into one.
3. Worker is allowed to execute anything using the same UID as the Listener on the same "machine".

§1 presents the problem of "but do we need all combinations of different workers for all combinations of build tasks and target OSes that we have?". This can be addressed now with the current agent by running the actual builds in customer-provided Docker images. The agent needs to support this, and customers would need to create their own Docker images.
§2 makes it very hard to scale, make things elastic and respond to load, or run on clusters, and it causes issues with updates, de-registration, etc.
§3 causes all sorts of security issues like "how do we hide the secret VSTS tokens?".
Assume we run everything on a cluster which spans our whole data centre (say Mesos, ACS, Docker Swarm, whatever), and we use Docker images for everything.
1. The Listener is standalone.
It is provided as a Docker image by MSFT.
It's the only thing which needs to register with VSTS using any secret tokens etc.
Its only job is to listen for requests from VSTS and kick off workers.
I understand there was an intention to move to this model with the VSTS agent.
2. The Worker is standalone.
Workers are provided as Docker images by MSFT.
When the Listener gets a request to run a job, it does `docker pull Microsoft/vsts-worker:required_version` and starts it on the cluster _somewhere_.
The Worker container is given one-off keys to pull down source code and communicate with VSTS on build progress. This is pretty much what is happening in the current implementation.
The only job of the Worker is to kick off build tasks and report progress to VSTS. Note that the Worker does not run the tasks in its own container.
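As a rough sketch of the Listener side of this, the two commands for one job could be built like this. Everything here is hypothetical: `microsoft/vsts-worker` is the imagined image from this proposal (lowercased, because Docker repository names must be lowercase), and the environment variable names are made up. On a real cluster the scheduler's own API would take the place of a plain `docker run`.

```python
def worker_launch_commands(version, job_env, image="microsoft/vsts-worker"):
    """Return the pull and run commands the Listener would issue for one job."""
    ref = f"{image}:{version}"
    run = ["docker", "run", "-d"]  # -d: start the Worker detached and return immediately
    # Hand the one-off job credentials to the Worker as environment variables.
    for name, value in sorted(job_env.items()):
        run += ["-e", f"{name}={value}"]
    run.append(ref)
    return [["docker", "pull", ref], run]


if __name__ == "__main__":
    # Hypothetical variable names and values, for illustration only.
    job = {"JOB_TOKEN": "one-off-token", "JOB_URL": "https://example.invalid/job/1"}
    for cmd in worker_launch_commands("required_version", job):
        print(" ".join(cmd))
```

Because the Listener only passes short-lived, per-job values, its own registration token never has to leave the Listener container.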
3. The customer-provided Task containers.
These days many people (us included) run (all) their builds in Docker containers: we build an image with all the required dependencies and tools on the required OS distro, map the source code into the running container as a volume, and kick off the build. Simple!
So the Task containers are the same _customer-provided Docker images_. The images to run are specified in the build definition in VSTS. The Worker uses the build definition to kick off the containers as required.
The Worker maps the source code into the task container (just like we would do manually) and starts whatever script or program in that container is specified in the build definition.
Again, each Task container is executed _somewhere in the cluster_.
The Worker captures the stdout and stderr of the task container to report progress to VSTS, and uses its exit status to report success/failure of the task.
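The capture-and-report step could look roughly like the sketch below. The names `run_task` and `report` are hypothetical; in a real Worker, `report` would forward each line to VSTS. The demo uses a small Python child process as a stand-in for the `docker run` command so it runs without Docker. `docker run` exits with the container's exit status, so the same check would apply to it.

```python
import subprocess
import sys


def run_task(argv, report):
    """Run one task, pass each output line to `report`, return True on exit status 0."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so a single reader sees both
        text=True,
    )
    for line in proc.stdout:
        report(line.rstrip("\n"))
    # The exit status decides whether the task is reported as a success or a failure.
    return proc.wait() == 0


if __name__ == "__main__":
    # Stand-in for a `docker run ... IMAGE script` argv.
    demo = [sys.executable, "-c", "import sys; print('compiling'); sys.exit(3)"]
    ok = run_task(demo, report=print)
    print("succeeded" if ok else "failed")
```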
If the task containers are not given access to the Docker daemon, they can be completely sealed off and all tokens would be safe from the arbitrary build tasks.
Many existing tasks in VSTS (like kicking off MSBuild) can probably be pre-canned and provided by MSFT as Docker images too. So if all you use is pre-built tasks, you would rarely need to build any images of your own.
What this design solves: everything! :)
OK, there are _many, many_ things which this approach would address/solve/make possible. Some are:
This was meant to be a brief comment... Thanks for reading! All this is very different from the current design and implementation -- but with containers and clusters everywhere these days, might be of interest as a longer-term direction.
For reference, we are running a Docker-based scheme in one of our clusters, with developertown/vsts-agent as a base and technology-specific descendants such as developertown/vsts-agent-nodejs.
This has worked OK for us, although it's been rocky across agent updates -- the auto-update feature in the vsts-agent causes some issues for the current design of those images. Otherwise, this is working fine for us. I'd love to see first-class support for working this way.
Also note that the most recent 2.107.x+ agents don't auto-update on restart, only when you invoke the update from the web UI. So you shouldn't have a problem going forward.
@jasonvasquez , just yesterday we published an official set of images to Docker Hub at https://hub.docker.com/r/microsoft/vsts-agent, backed by a new GitHub repo at https://github.com/microsoft/vsts-agent-docker. We would love to hear any comments you have on the approaches we have taken there. Here's a brief summary.
There are three dimensions to the images we are providing:
@stepro This is awesome. I will give this a try for our VS Code Linux builds.
@stepro This is working beautifully for us. We now base our Dockerfile on yours and run the agents with the help of a few bash scripts.
Cool, glad it worked out for you!
A dockerized vsts-agent on Windows would be handy too.
Guys, it seems that you have removed the ubuntu-14.04-standard based images. We heavily depend on this for our VSCode build agents. Any chance you can keep publishing them?
Never mind, I can pull it now; it seems to have been an auth problem.
Interesting. I specifically removed those tags so there's no guarantee they will continue to work. Any chance you can upgrade to Ubuntu-16.04? We removed these because there were simply too many combinations of images and it would take an extremely long time to build them all each time any update was made to the repo.
I'm happy to assist in helping you to upgrade if that would be useful.
We still depend on it... last time we tried to move to 16.04, users who had 14.04 installed as their OS couldn't run Code anymore. That might have something to do with the libraries Code is linked against at build time of its native modules, but I'm not an expert...