After reviewing the various ideas brought up here (#948, #828, #805), we came up with a slightly different approach for a slightly different problem (ECS containers) and would love feedback on this idea.
When a container is started on a host, a process talks to Vault, validates against AWS that the container was just started by ECS, and then feeds the secrets it needs into the container. All of this is done over the wire, so nothing is stored on disk.
Every ECS host has 2 containers that always run: ecs-agent and SAM. SAM is linked to the docker socket and reads the Docker events stream. Whenever a container starts, it jumps into action to help manage secrets.
WIS is an entry point for all containers. When a container starts, WIS starts a webserver that waits for exactly one HTTP POST. If it doesn't get this POST within 30 seconds, it exits (and the container goes with it). If it gets any other HTTP request, it exits. When it receives the HTTP POST it's waiting for, with a payload that is a JSON object of key/value pairs, it turns the k/v pairs into environment variables and then runs the CMD directive with that environment set.
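The core of that WIS behavior can be sketched roughly as below. This is an illustrative sketch, not the actual tool: the function names (`payload_to_env`, `run_cmd!`) are made up, and the HTTP/timeout handling is omitted to focus on the payload-to-environment step.

```ruby
require "json"

# Parse the single expected POST body -- a flat JSON object of
# key/value pairs -- into a hash suitable for use as a process
# environment. Anything that isn't a JSON object is rejected.
def payload_to_env(body)
  pairs = JSON.parse(body)
  raise ArgumentError, "expected a JSON object" unless pairs.is_a?(Hash)
  pairs.each_with_object({}) { |(k, v), env| env[k.to_s] = v.to_s }
end

# exec replaces the WIS process with CMD, inheriting the env hash --
# so the secrets exist only in that process's environment, never on disk.
def run_cmd!(cmd, env)
  exec(env, *cmd)
end
```

For example, `payload_to_env('{"DB_PASS":"s3cret"}')` yields `{"DB_PASS"=>"s3cret"}`, which `run_cmd!` would hand to the container's CMD.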
VB validates that a given ECS task ARN has launched in the last X seconds and is reported as state=="RUNNING". VB should really be a Vault backend, but since we're a bunch of Ruby guys it's currently a simple Ruby script that acts as an HTTP proxy to Vault.
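The validation VB performs could look something like the following sketch. It assumes a task description shaped like an ECS `DescribeTasks` result (`lastStatus`, `createdAt`); the function name and the 30-second window are illustrative, not taken from the actual script.

```ruby
require "time"

MAX_AGE_SECONDS = 30 # illustrative value for "launched in the last X seconds"

# Returns true only if the task is reported RUNNING and was created
# within the freshness window -- the two checks VB makes before it
# will hand anything to a container.
def task_freshly_running?(task, now: Time.now)
  return false unless task["lastStatus"] == "RUNNING"
  started = Time.parse(task["createdAt"])
  (now - started) <= MAX_AGE_SECONDS
end
```

In the real flow, `task` would come from a `DescribeTasks` call against the ECS API using the task ARN that SAM reported.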
docker inspect of this new container (it's in the Docker labels). We don't expect this to be the most secure method of using Vault, but it gets the job done for now. A couple of problem areas:
Each tool is under 150 lines of code (with heavy comments) as we tried to keep it very maintainable.
We'd love feedback from Hashicorp and others about this approach.
Hi @natefox ,
You have some very interesting timing, because we were just preparing to post the initial PR for our AWS auth backend; it's feature complete and we're just working on testing (and in fact it's now up in #1300). There are differences between this and #1300, since that's basically designed to address the asks in #805, #828, and #948, and its methodology wouldn't work for ECS -- mostly because Amazon can't be relied upon as a trusted third party, since it doesn't sign the ECS metadata. It may be interesting for you to look it over though, keeping in mind that it's not yet fully tested and documented, and we have integration testing to do as well.
I noticed you linked to the cubbyhole document and while you don't use cubbyhole I can see similar thinking here -- limited use/limited-ttl tokens to fetch secrets, and using coprocesses to manage the secure introduction. What you came up with actually rather closely matches something I designed at my previous job with our security architect, with an important difference (detailed below).
I do know that [some company with major AWS experience] is planning on posting a blog about the method they designed for performing secure introduction with ECS and Vault. I don't know when that will land but when I see it I will try to ensure that I link to it from here. It's very different in that it's much more AWS-specific using more AWS-specific technologies, for better or for worse. I unfortunately can't divulge details ahead of time.
Regarding your specific implementation, I have one large overall comment:
As you noted, you have a trust on first use problem. In #1300 this is mitigated by the fact that you need to actually be able to fetch the instance metadata from that instance but here you can get the information on the new container from a different container -- meaning, _any_ container can get that information. My suggestion is to flip the logic a bit: rather than have SAM act as the intermediary and retrieve the token and then submit to WIS, I think you'd be better off having SAM simply inform VB that a new container has been started, and give it the relevant info. Then, have VB connect to a service (a listening port in SAM, a modified WIS, or something else) and give the token to a service _in the container_.
The really nice thing about doing it this way is that rather than trusting that SAM is really SAM and giving it a token, it doesn't really matter if SAM isn't who it says it is -- if the info you're given checks out (by verifying the container is running and was started < 30 seconds ago), then that container should be given credentials anyway. The worst thing a fake SAM can do is poke you to send a Vault token to a valid container. You can also keep a whitelist of containers that you've already sent a token to, to make sure it only happens once, no matter how many times SAM (or a fake SAM) pokes you.
You can still have SAM be the service that actually grabs the token and then use it to grab secrets and inject via WIS (or combine the two), but with the key being that VB is what actually initiates this connection. Bonus points for signing the message carrying the token so that the service in the container can validate it -- although if it connects to the right Vault server and the token is invalid it'll figure that part out soon enough.
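The whitelist idea above is simple to implement on the VB side. A minimal sketch (the class name and method are made up for illustration): no matter how many times SAM, real or fake, pokes VB about a given container, a token is delivered at most once.

```ruby
require "set"

# Guards against repeated "new container" pokes: callers should only
# validate the container and push a token in when first_delivery?
# returns true, which happens exactly once per container ID.
class TokenDeliveryGuard
  def initialize
    @delivered = Set.new
  end

  def first_delivery?(container_id)
    return false if @delivered.include?(container_id)
    @delivered.add(container_id)
    true
  end
end
```

A production version would also want to expire old IDs so the set doesn't grow without bound, but the once-only property is the part that matters here.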
If you wanted, this service that receives the connection (SAM or something else) could in fact use cubbyhole, or something similar, to keep a permanent token in memory to inject new credentials into applications as needed.
Another comment is regarding one of your drawbacks:
VaultBuddy basically needs a root token - or at least enough to read any/all secrets and generate tokens for these secrets. This would be mitigated if it were written as a Vault backend.
You may want to look into token roles, which are new in 0.5.2 (docs at https://www.vaultproject.io/docs/auth/token.html). This lets VaultBuddy have access to create tokens with policies that are _not_ subsets of its own token's policies. It's designed for exactly these types of situations.
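Concretely, a token role is configured via Vault's HTTP API at `auth/token/roles/<role>`, and tokens are then minted against it at `auth/token/create/<role>`. A small sketch of building that configuration request (the role and policy names are made up, and the actual HTTP call is left out):

```ruby
require "json"

# Builds the path and JSON body for configuring a Vault token role:
# POST {VAULT_ADDR}/v1/auth/token/roles/<role>. allowed_policies lets
# VaultBuddy mint tokens whose policies are NOT subsets of its own.
def token_role_request(role, allowed_policies)
  {
    path: "v1/auth/token/roles/#{role}",
    body: JSON.generate("allowed_policies" => allowed_policies.join(","))
  }
end

# Per-container tokens would then be created with an authenticated
# POST to v1/auth/token/create/<role>.
```

With a role like this in place, VaultBuddy's own token only needs permission on the role endpoints rather than read access to every secret.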
As far as a built-in Vault backend goes, we've had some internal discussions around ECS but I don't think we're quite there yet. We've seen various approaches to ECS auth workflows (including this one, and very similar ones like the one I worked on at my previous job and made suggestions about above), but they all require more coordination than would be necessary if ECS supported signed metadata, and they have other associated drawbacks.
At the moment we and some customers are talking to AWS about possibilities to enhance the ECS metadata and API to hopefully overcome some of these issues. Based on how those go and the likelihood/timeline of any changes we'll continually be reevaluating our plans, so you never know what the future holds!
hey folks,
I'm looking at this and wondering what it would take to make it generic for Docker, rather than tied to ECS.
I'm spitballing here a bit:
at that point you can use a WIS like process to retrieve the token, call vault to retrieve secrets, and do whatever you want.
Taking into account @jefferai's comments, the main difference in this flow is:
Notable issues:
content trust: this is the branded name of the Docker OSS Notary service, but I'm not sure Notary is being used by any of the other registry providers such as quay.io, Google, AWS, etc. There are workarounds (e.g. manually creating and managing image signatures), but they all require a bit of work and aren't generic.
While how we use the token probably depends on our needs, I think the need to provide each container instance with its own scoped token, in as secure a manner as possible, is a common one.
Hi @skippy ,
Indeed, something like #1300 that works for Docker (generically) is on our roadmap!
thanks @jefferai
@natefox
I've created a Docker/Vault bridge that I'm running on AWS ECS. It requires starting a docker-vault container with a (wrapped) token on the ECS hosts.
Hope it helps:
https://hub.docker.com/r/eskey/dockervault/
@jefferai, sounds like you guys are working on this; do you have a timeframe for when we can expect it?
We'd really like docker level authentication as well (unfortunately, the dockervault solution above has some security concerns - see https://raesene.github.io/blog/2016/03/06/The-Dangers-Of-Docker.sock/ and other resources - mounting the docker socket is just not a viable option).
Currently, we need to give all of our containers access to the union of all the secrets that any of our containers need, which is also pretty sub-optimal.
@jefferai Hi! Do you have any updates regarding this issue?
Maybe someone knows of a guide that describes ECS & Vault integration?
@StyleT This will end up being handled by the IAM support being merged into the aws (aws-ec2) auth backend.
@jefferai Am I right that I can track #2441 PR?
Yes.
Closing since that's the right place to watch.