Linkerd2: Simplify certificate distribution for webhooks

Created on 16 Jan 2019 · 13 Comments · Source: linkerd/linkerd2

Background

Proxy auto-inject is done by a mutating webhook and it is likely that Linkerd will add more webhooks in the future (for example, a validating webhook for service profile validation). The configuration and initialization of these webhooks is complicated because of the way that TLS credentials are distributed.

When the webhook is created, the webhook container posts a MutatingWebhookConfiguration object to the Kubernetes API. The Linkerd CA notices this, creates TLS credentials for the webhook service, and distributes them to the webhook as a secret. The webhook then waits until the secret is mounted and uses the given credentials to serve over TLS.

Problem

This approach is complex and difficult to follow. We would like for the webhook to be able to serve TLS without all of this credential management. Ideally:

  • There would be little to no special case code in the CA for handling webhooks
  • There would be little to no code in the webhook for managing certificates

Labels: area/controller, area/inject, priority/P1

All 13 comments

Would it make sense for the Linkerd proxy to be used in the webhook pod for terminating TLS?

Would it make sense for the CA itself to be a mutating webhook so that it can synchronously provision credentials for pods as they are created but before they are started?

I have another proposal for this: the linkerd CA should not be used _at all_ for provisioning webhook certs. Instead, webhooks should be responsible for generating their own self-signed certificates during installation or, preferably, at runtime.

My understanding is that webhooks post configuration to the Kubernetes API as the process starts, including the trust chain that Kubernetes should use to validate the webhook's certificate. If this is the case, then there's basically no value in using linkerd's CA, which is intended to generate specialized certificates for the service mesh (and communication between Kubernetes and these webhooks is explicitly _not_ meshed communication).

If this is the case, then the container that initializes the webhook's configuration can simply generate its own certificates using step(1) or openssl(1) directly. There's no need to persist any of the key material outside of the running container's filesystem, since it can just regenerate new credentials when it is restarted.
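
As a rough sketch of what that could look like with openssl(1): the script below creates a throwaway root and a leaf certificate inside a temporary directory. The service name linkerd-proxy-injector.linkerd.svc comes from later in this thread; the file names, the P-256 key type, the one-year validity, and the CN webhook-root are illustrative assumptions, not what Linkerd actually ships.

```shell
#!/bin/sh
set -eu

# Scratch directory: the key material is regenerated on every restart,
# so nothing here needs to be persisted.
workdir=$(mktemp -d)
cd "$workdir"

# DNS name the Kubernetes API server uses to reach the webhook (assumed).
svc=linkerd-proxy-injector.linkerd.svc

# Minimal config so the result does not depend on the system openssl.cnf.
cat > openssl.cnf <<EOF
[req]
distinguished_name = dn
prompt = no
[dn]
CN = webhook-root
[v3_ca]
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
[v3_leaf]
basicConstraints = CA:FALSE
extendedKeyUsage = serverAuth
subjectAltName = DNS:$svc
EOF

# Self-signed root certificate (the trust anchor).
openssl ecparam -name prime256v1 -genkey -noout -out ca.key
openssl req -x509 -new -key ca.key -sha256 -days 365 \
    -config openssl.cnf -extensions v3_ca -out ca.crt

# Leaf key and signing request for the webhook service.
openssl ecparam -name prime256v1 -genkey -noout -out tls.key
openssl req -new -key tls.key -subj "/CN=$svc" \
    -config openssl.cnf -out tls.csr

# Sign the leaf with the root, adding the service name as a SAN.
openssl x509 -req -in tls.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -sha256 -days 365 -extfile openssl.cnf -extensions v3_leaf -out tls.crt

# Confirm that the leaf chains to the root.
openssl verify -CAfile ca.crt tls.crt
```

Here ca.crt would be the trust anchor handed to Kubernetes, while tls.crt and tls.key would be loaded by the webhook's TLS server. Since everything lives in a temporary directory, a restarted container simply runs the script again.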

Does anyone see any obvious problems with this suggestion?

Based on my understanding, that sounds like a great solution!

@adleong @olix0r Having the proxy injector generate its own self-signed root and leaf certificates worked in my tests.
I used this to generate the root cert:

step certificate create linkerd-root-ca linkerd-root-ca.crt linkerd-root-ca.key --insecure --no-password --profile root-ca

And this to generate the leaf cert and private key:

step certificate create linkerd-proxy-injector.linkerd.svc linkerd.crt linkerd.key --insecure --no-password --ca linkerd-root-ca.crt --ca-key linkerd-root-ca.key --profile leaf

I then copied the contents of linkerd-root-ca.crt into webhook_config.go (the trust anchor that is sent in the MutatingWebhookConfiguration) and the contents of linkerd.crt and linkerd.key into the proxy injector's TLS server, as seen in this diff:
https://gist.github.com/alpeb/fae1b5053a5b0e6cdc70ae2a516315b0
So with this approach we'd shell out to step to generate those files (not done yet, 'cause we just wanted to verify the approach).
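
Before wiring files like these into the webhook, a quick sanity check can catch a mismatched pair. The helpers below are a hypothetical sketch (the function names are made up for illustration and are not part of Linkerd or step); they assume openssl(1) is installed. The last helper prints the root certificate base64-encoded on one line, which is the form the caBundle field of a webhook's clientConfig takes when a manifest is written by hand; Go code that builds the object would normally pass the PEM bytes directly instead.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# leaf_chains_to_root ROOT_CRT LEAF_CRT
# Succeeds only if the leaf certificate verifies against the given root.
leaf_chains_to_root() {
    openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

# key_matches_cert CRT KEY
# Succeeds only if the certificate and the private key carry the same public key.
key_matches_cert() {
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
    key_pub=$(openssl pkey -in "$2" -pubout) || return 1
    [ "$cert_pub" = "$key_pub" ]
}

# ca_bundle ROOT_CRT
# Prints the PEM file base64-encoded on a single line.
ca_bundle() {
    openssl base64 -A -in "$1"
}
```

With the file names used above, the checks would be invoked as leaf_chains_to_root linkerd-root-ca.crt linkerd.crt, key_matches_cert linkerd.crt linkerd.key, and ca_bundle linkerd-root-ca.crt.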

As an alternative, I pushed https://github.com/linkerd/linkerd2/pull/2163, which simply reuses our own ca.NewCA() instead of having to rely on step.

Let me know what you think.

Awesome! Re-using ca.NewCA() is a huge improvement/simplification over what we have today. That said, I believe that the plan will be to remove (or at least heavily change) that ca code. So I think that we'll want to depend on step here instead of the existing ca code. But it's great to have validated this approach.

@adleong as I commented in https://github.com/linkerd/linkerd2/pull/2163 , ca.go is pretty generic. The guy that's gonna need heavy refactoring is /controller/ca/controller.go in my opinion.

Ah, nice, I understand now. Are you leaning one way or the other as to whether we should use step or the existing generic code in ca.go?

@adleong I’d suggest using ca.go (and putting it somewhere more generic) for the time being, unless you’re doing shell scripting.

@alenkacz you mentioned you wanted to work on https://github.com/linkerd/linkerd2/issues/2075 after this one got addressed.
The admission webhooks should be much simpler to work on now. Feel free to ping me if you need any clarification.

great news @alpeb thanks! I am leaving for a vacation tomorrow and I'll be back in a week but I'll try to start today and finish this when I come back :)

I'll probably pick up the validation work. Enjoy your vacation, @alenkacz! 🍹

@adleong 😭😭 ... ok :)
