Linkerd2: Fail to inject when mTLS is enabled and there is no service account

Created on 23 May 2019 · 17 Comments · Source: linkerd/linkerd2

Bug Report

What is the issue?

When Linkerd injects its proxy into a pod where service account automount is disabled, the pod fails to start and goes into a restart loop. The root cause is that linkerd-proxy looks for the mounted service account token but does not find it.

How can it be reproduced?

Inject the proxy into a pod that has the following attribute set in its spec:

automountServiceAccountToken: false

I encountered this issue with the official argocd manifest: https://argoproj.github.io/argo-cd/getting_started/#1-install-argo-cd
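
For a smaller reproduction than the full Argo CD manifest, a sketch like the following should trigger the same failure, assuming it is applied with auto-injection enabled (or piped through linkerd inject). The Deployment name, labels, and image are hypothetical placeholders; only the automountServiceAccountToken field and the linkerd.io/inject annotation come from this thread.

```yaml
# Hypothetical minimal reproduction: names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: no-sa-token-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: no-sa-token-demo
  template:
    metadata:
      labels:
        app: no-sa-token-demo
      annotations:
        linkerd.io/inject: "enabled"   # ask the proxy injector to add linkerd-proxy
    spec:
      # The setting from this report: no token is mounted at
      # /var/run/secrets/kubernetes.io/serviceaccount/token
      automountServiceAccountToken: false
      containers:
      - name: app
        image: nginx:1.17   # any long-running image should do
```

If the issue reproduces, the linkerd-proxy container logs should show the InvalidTokenSource error quoted below.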

Logs, error output, etc

On pod startup the linkerd-proxy container outputs this:

time="2019-05-23T15:47:08Z" level=info msg="running version dev-undefined"
time="2019-05-23T15:47:08Z" level=info msg="Using with pre-existing key: /var/run/linkerd/identity/end-entity/key.p8"
time="2019-05-23T15:47:08Z" level=info msg="Using with pre-existing CSR: /var/run/linkerd/identity/end-entity/key.p8"
ERR! [     0.000310s] linkerd2_proxy::app::config Could not read LINKERD2_PROXY_IDENTITY_TOKEN_FILE: No such file or directory (os error 2)
ERR! [     0.000354s] linkerd2_proxy::app::config LINKERD2_PROXY_IDENTITY_TOKEN_FILE="/var/run/secrets/kubernetes.io/serviceaccount/token" is not valid: InvalidTokenSource
configuration error: InvalidEnvVar

linkerd check output

➜ linkerd check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ control plane namespace exists
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

Environment

  • Kubernetes Version:
➜ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:14:56Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cluster Environment: minikube in HyperKit
  • Host OS: MacOS
  • Linkerd version:
➜ linkerd version
Client version: stable-2.3.0
Server version: stable-2.3.0

Possible solution

Allow the mTLS to work without a service-account.

Labels: area/inject, good first issue, help wanted, priority/P1

All 17 comments

@csreegn The mTLS feature uses the service account to verify identity. If you feel that it's absolutely necessary for the argocd-repo-server workload to have automountServiceAccountToken: false, then you can use linkerd inject --disable-identity to disable mTLS for that particular workload. That way, you will still get all the other good stuff that comes with Linkerd, minus the mTLS identity feature.

Note that the disable-identity option isn't supported with auto-injection in stable-2.3.0. If you are using an auto-inject set-up, you can try one of the newer edge versions.
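
For auto-injection, the counterpart of the --disable-identity flag is, to my knowledge, the config.linkerd.io/disable-identity annotation on the pod template. Treat the sketch below as an assumption to verify against the documentation for your Linkerd version: per the note above it is not available in stable-2.3.0, and the workload name and image are placeholders.

```yaml
# Sketch only: verify that your Linkerd version supports this annotation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-workload              # hypothetical name
spec:
  selector:
    matchLabels:
      app: example-workload
  template:
    metadata:
      labels:
        app: example-workload
      annotations:
        linkerd.io/inject: "enabled"
        # Assumed annotation: opts this workload out of mTLS identity,
        # so the proxy should not need the service account token.
        config.linkerd.io/disable-identity: "true"
    spec:
      automountServiceAccountToken: false
      containers:
      - name: app
        image: nginx:1.17             # placeholder image
```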

I honestly don't know if automountServiceAccountToken: false is needed or not, but I would imagine that from a security perspective there are always the more paranoid who don't want to mount the SA token if they don't use it. (I am not one of them, but it makes sense to limit the attack surface of a pod).

Also, if I deploy a service that has automountServiceAccountToken: false set, I think it would be acceptable for the proxy injector to flip it to true if Linkerd definitely needs the service account. Otherwise, using Linkerd is not transparent (a leaky abstraction). Right now, if I don't explicitly set this flag to true, my pod won't even start, and I think that violates Linkerd's promise of being transparent. Maybe graceful degradation would be acceptable: don't provide mTLS for the pod if no SA token is present, but don't fail the startup.

My temporary workaround is to set the flag to true. My worry is that for commercial off-the-shelf (COTS) applications installed with Helm, or with a manifest taken verbatim and applied (with kubectl apply -f), you can never be sure the app will start, because you don't know whether the flag is set (and with Helm, you don't know whether it is even configurable).

By the way, if the Linkerd proxy is outputting errors in its logs, the linkerd check --proxy command should fail. What does the command output on your end?

I think there is a trade-off here between the amount of auto-mutation we want to introduce into the proxy injection process and the user's experience. Like you said, there are people who explicitly don't want to mount the SA token for security reasons. Personally, I would have a problem with some intermediary flipping on a security flag (in this case, automountServiceAccountToken) that I explicitly disabled. The same goes for automatically disabling Linkerd mTLS. My preference is that if a new workload I'm about to deploy doesn't comply with my cluster security settings, the deployment should fail loudly.

linkerd check --proxy command does not finish, it idles waiting for proxies to be ready:

➜ linkerd check --proxy
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ control plane namespace exists
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

linkerd-data-plane
------------------
√ data plane namespace exists
\ data plane proxies are ready -- waiting for check to complete

That doesn't really communicate what is wrong, or how to resolve it.

Yeah, with the service account automount being a crucial precondition for mTLS to work, we can be a bit smarter with this check. @olix0r proposed that we update the proxy injector to fail the proxy injection if mTLS is enabled, but the pod spec has automountServiceAccountToken: false.

Would you be interested in submitting a PR? :wink: I'd be happy to show you where the webhook code is. It should be an easy fix, but no pressure.

Sure, I'll try to squeeze it in within a few weeks. Never coded in golang before, but there's a first time for everything!

What does the linkerd sidecar do with the service account token?

Linkerd sidecar injector should inject an audience bound token into the linkerd proxy container. This would provide better security (token bound to identity service would not be replayable against the Kubernetes API, and vice versa) and would also remove linkerd's dependency on automountServiceAccountToken. Doc here:

https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection

And more details on the differences to legacy tokens here:

https://github.com/kubernetes/kubernetes/issues/70679#issue-377670415
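
For reference, service account token volume projection looks roughly like this in a pod spec. This is a sketch of the Kubernetes feature only, not of what the Linkerd injector does: the pod name, image, audience value, volume name, mount path, and expiry are all hypothetical, and it assumes a cluster where token projection is enabled.

```yaml
# Sketch of a projected, audience-bound service account token.
apiVersion: v1
kind: Pod
metadata:
  name: projected-token-demo                  # hypothetical name
spec:
  automountServiceAccountToken: false         # the legacy token stays unmounted
  containers:
  - name: proxy
    image: example.invalid/proxy:placeholder  # placeholder image
    volumeMounts:
    - name: identity-token
      mountPath: /var/run/secrets/tokens      # hypothetical path
      readOnly: true
  volumes:
  - name: identity-token
    projected:
      sources:
      - serviceAccountToken:
          audience: identity.example          # hypothetical audience
          expirationSeconds: 3600             # kubelet refreshes the token before expiry
          path: token                         # file name under the mount path
```

In this sketch the token would be readable at /var/run/secrets/tokens/token, so the proxy's LINKERD2_PROXY_IDENTITY_TOKEN_FILE would have to point there instead of the default service account mount.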

That would be awesome!

In fact, the docs that @grampelberg linked to specifically call out time- and audience-bound SA tokens as future work once they're available "in future Kubernetes releases". @mikedanese I'm assuming this means they are ready for us to use in modern K8s versions?

Trying to mesh elasticsearch:7.4.1:

time="2019-10-29T06:38:02Z" level=info msg="running version stable-2.6.0"
time="2019-10-29T06:38:02Z" level=info msg="Using with pre-existing key: /var/run/linkerd/identity/end-entity/key.p8"
time="2019-10-29T06:38:02Z" level=info msg="Using with pre-existing CSR: /var/run/linkerd/identity/end-entity/key.p8"
configuration error: InvalidEnvVar
ERR! [     0.002000s] linkerd2_proxy::app::config Could not read LINKERD2_PROXY_IDENTITY_TOKEN_FILE: No such file or directory (os error 2)
ERR! [     0.002021s] linkerd2_proxy::app::config LINKERD2_PROXY_IDENTITY_TOKEN_FILE="/var/run/secrets/kubernetes.io/serviceaccount/token" is not valid: InvalidTokenSource

@masterkain are you using an operator? This is unlikely to be related to the image you're using; it is more likely about how you're deploying it on the cluster.

@masterkain @grampelberg
For the Elastic Cloud on Kubernetes (cloud-on-k8s) installation at least, the issue was also an explicit automountServiceAccountToken: false setting on the pod. Fixed in https://github.com/elastic/cloud-on-k8s/issues/1151

I just posted this in the linkerd slack channel, but here is a config that will allow linkerd 2.6 to inject with elastic-cloud-on-k8s, in case this helps anyone else seeing this issue.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      metadata:
        annotations:
          linkerd.io/inject: "enabled"
      spec:
        automountServiceAccountToken: true