Test-infra: Redesign Prow auth strategy

Created on 19 Aug 2019 · 13 comments · Source: kubernetes/test-infra

What would you like to be added:
I would like an audit of the current x509 client certificate authentication strategy used to authenticate/authorize Prow deployments. Additionally, I would like to encourage a proposal for improvements/alternatives to maximize the security of multi-cluster Prow instances.

Possible alternatives/improvements:

Why is this needed:
Currently, Prow relies on an x509 client certificate with super-user privileges for cluster authentication/authorization. There are several drawbacks to this approach:

  • Certificates cannot be revoked (kubernetes/kubernetes#60917)
  • Authorization roles are essentially global and thus cannot be tweaked at the service/node level.
  • Unless set up with near-term expiry and explicit rotation, certificates are long-lived and increase the risk of exposure.

Additional reading:

https://kubernetes.io/docs/reference/access-authn-authz/authentication/
https://kubernetes.io/docs/reference/access-authn-authz/authorization/

Acceptance Criteria

  • [x] halt efforts to facilitate the current certificate-based process.
  • [x] finish the unfinished work to remove the legacy client, #13980.
  • [x] implement a more secure auth strategy for multicluster (e.g. service account token), #14163.
  • [x] ~mkbuild-cluster use SAR to verify that credentials can be used to run Prow #11011~ Generate credentials that can be used to run Prow, #14721
area/prow kind/feature

All 13 comments

/area prow

/cc @fejta

As soon as we clean up our messy Kubernetes client situation and all clients know how to read a kubeconfig, we should 100% always use scoped service account tokens for multi-cluster deployments.

/cc @alvaroaleman @cjwagner @fejta @stevekuznetsov
If Kubernetes service account tokens are the recommended approach, it would be great to devise a checklist of what is needed to get there so we can all work concurrently on the implementation.

The only component still using the legacy client is plank. The original migration to the upstream clients was in these two PRs: https://github.com/kubernetes/test-infra/pull/11068 https://github.com/kubernetes/test-infra/pull/10987

Those were reverted for unrelated reasons in https://github.com/kubernetes/test-infra/pull/11436

The sum total of work involved as I understand it is:

  1. re-revert those changes to introduce them again
  2. ensure that the kubeconfig loading logic appropriately loads service account tokens

2. ensure that the kubeconfig loading logic appropriately loads service account tokens

How do we do this in a cluster provider agnostic way? (e.g. without using something like a gcloud config config-helper --format=json command)

How do we do this in a cluster provider agnostic way? (e.g. without using something like a gcloud config config-helper --format=json command)

Via $anything that can read a Kubernetes secret on the one side and by just using a standard kubeconfig with a token in it on the other side.

See this as a sample for generating the SA and using its token via bash
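The linked bash sample is not reproduced here, but a minimal sketch of the same idea might look like the following; the namespace (`test-pods`), service account name (`prow-deployer`), kubectl context, and role binding are all placeholders, not the actual Prow configuration:

```sh
# Sketch only: create a scoped ServiceAccount in the build cluster and read the
# token that the token controller mints for it. All names are placeholders.
kubectl --context=build-cluster create namespace test-pods
kubectl --context=build-cluster -n test-pods create serviceaccount prow-deployer

# Bind only the permissions the Prow components need; "edit" is used here for
# brevity, a purpose-built Role/ClusterRole would be tighter in practice.
kubectl --context=build-cluster -n test-pods create rolebinding prow-deployer \
  --clusterrole=edit --serviceaccount=test-pods:prow-deployer

# Read the token out of the secret attached to the service account.
secret="$(kubectl --context=build-cluster -n test-pods get serviceaccount prow-deployer \
  -o jsonpath='{.secrets[0].name}')"
token="$(kubectl --context=build-cluster -n test-pods get secret "${secret}" \
  -o jsonpath='{.data.token}' | base64 --decode)"
```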

How do we do this in a cluster provider agnostic way? (e.g. without using something like a gcloud config config-helper --format=json command)

Via $anything that can read a Kubernetes secret on the one side and by just using a standard kubeconfig with a token in it on the other side.

See this as a sample for generating the SA and using its token via bash

I don't follow the example. Why would the service account secret for a service account in cluster A be respected by cluster B? How does cluster B know anything about the service account in A? (Side question: is the service account secret truly a temporary access token?)

How do we do this in a cluster provider agnostic way?

My comment for 2 is more a reflection of the ... tenuous nature ... of the code that handles the "default cluster" / "service cluster" distinction in Prow, not a comment about service accounts or loading kubeconfigs in general. We must ensure that the code we have written and the flags we have chosen to expose are reasonable and working as expected.

(e.g. without using something like a gcloud config config-helper --format=json command)

If prow.k8s.io wants to tie Prow deployment credentials to Google authentication, you must build images with the tools necessary to use those Google credentials. I don't suggest that. We should just use a service account token. There's no use in tying it to some external identity.

Via $anything that can read a Kubernetes secret on the one side and by just using a standard kubeconfig with a token in it on the other side.

OpenShift has built (guess who the author was) an oc service-account get-token command to automate what Alvaro pasted, adding watches to make sure it works across the token provisioning period of the service account minter.

I don't follow the example. Why would the service account secret for a service account in cluster A be respected by cluster B? How does cluster B know anything about the service account in A? (Side question: is the service account secret truly a temporary access token?)

I think there is a misunderstanding here. The process is:

  1. in a build cluster:
    a. create a service account
    b. create a set of roles and bindings to scope the access for that service account (can re-use half of the plank ones, for instance)
    c. read the secret that the minter has given to the service account
  2. in the service cluster:
    a. create a kubeconfig with the cluster/context/token for the service account above (see the sketch after this list)
    b. mount that into the service pods
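To make step 2 concrete, here is one possible way to assemble and mount such a kubeconfig with kubectl; this is a sketch, not the actual prow.k8s.io setup, and the server URL, CA file, secret name, and contexts are all placeholders (it reuses the token read in the earlier sketch):

```sh
# Sketch: build a kubeconfig that points at the build cluster and authenticates
# with the service account token read earlier (placeholder names throughout).
kubectl config set-cluster build-cluster \
  --kubeconfig=build-cluster.kubeconfig \
  --server=https://build-cluster.example.com \
  --certificate-authority=build-cluster-ca.crt \
  --embed-certs=true
kubectl config set-credentials prow-deployer \
  --kubeconfig=build-cluster.kubeconfig \
  --token="${token}"
kubectl config set-context build-cluster \
  --kubeconfig=build-cluster.kubeconfig \
  --cluster=build-cluster --user=prow-deployer

# Store the kubeconfig as a secret in the service cluster so it can be mounted
# into the Prow component pods.
kubectl --context=service-cluster create secret generic build-cluster-kubeconfig \
  --from-file=kubeconfig=build-cluster.kubeconfig
```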

Side question: is the service account secret truly a temporary access token?

To the extent that the token can be revoked and re-minted at any point, it is not permanent. However, it is not temporary in that a service account token does not have a built-in TTL. Please see the documentation for more information: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#token-controller
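As a rough illustration of "revoked and re-minted", again using the placeholder namespace and service account name from the sketches above: deleting the token secret invalidates the old token, and the token controller mints a replacement shortly afterwards, after which any kubeconfig built from the old token must be regenerated.

```sh
# Sketch: revoke the current token by deleting its secret; the token controller
# then creates a new secret for the service account (placeholder names).
secret="$(kubectl -n test-pods get serviceaccount prow-deployer \
  -o jsonpath='{.secrets[0].name}')"
kubectl -n test-pods delete secret "${secret}"

# A replacement secret appears shortly; kubeconfigs that embedded the old
# token need to be rebuilt from the new one.
kubectl -n test-pods get serviceaccount prow-deployer -o jsonpath='{.secrets[0].name}'
```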

Just to comment that before anyone starts working on this, IMHO it would be best if we avoided the generated clientset for Plank and instead directly used the controller-runtime client. It has the huge advantage that there is a lister-backed implementation of it, which allows us to make Plank use listers instead of "list the world every 30s" without any further code change beyond swapping the client.

This is already being used by sinker: https://github.com/kubernetes/test-infra/pull/13647/commits/5b9dbde0568b63266e5b395f6f38f485139cd35e

@alvaroaleman - I just put up the PR based on @stevekuznetsov's original PR(s) for using the k8s client in plank, #13980. From a performance and latency standpoint I like the added benefit of using the controller-runtime client. It should be trivial to swap out the clients and update their usage after #13980, using your reference implementation in sinker.

/assign
