When Linkerd injects its proxy into a pod where service account automount is disabled, the pod fails to start up and goes into a restart loop. The root cause is that linkerd-proxy looks for the mounted service account token but cannot find it.
Try to instrument a pod that has the following attribute set in its spec:
automountServiceAccountToken: false
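For example, a minimal manifest that should reproduce it (the name and image are placeholders, and this assumes auto-injection is enabled for the pod):
apiVersion: v1
kind: Pod
metadata:
  name: automount-repro            # hypothetical name
  annotations:
    linkerd.io/inject: enabled
spec:
  automountServiceAccountToken: false
  containers:
  - name: app
    image: nginx:1.17              # placeholder; any workload image reproduces it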
I encountered this issue with the official argocd manifest: https://argoproj.github.io/argo-cd/getting_started/#1-install-argo-cd
On pod startup the linkerd-proxy container outputs this:
time="2019-05-23T15:47:08Z" level=info msg="running version dev-undefined"
time="2019-05-23T15:47:08Z" level=info msg="Using with pre-existing key: /var/run/linkerd/identity/end-entity/key.p8"
time="2019-05-23T15:47:08Z" level=info msg="Using with pre-existing CSR: /var/run/linkerd/identity/end-entity/key.p8"
ERR! [ 0.000310s] linkerd2_proxy::app::config Could not read LINKERD2_PROXY_IDENTITY_TOKEN_FILE: No such file or directory (os error 2)
ERR! [ 0.000354s] linkerd2_proxy::app::config LINKERD2_PROXY_IDENTITY_TOKEN_FILE="/var/run/secrets/kubernetes.io/serviceaccount/token" is not valid: InvalidTokenSource
configuration error: InvalidEnvVar
linkerd check output:
❯ linkerd check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ control plane namespace exists
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match
Status check results are √
❯ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:14:56Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
❯ linkerd version
Client version: stable-2.3.0
Server version: stable-2.3.0
Allow mTLS to work without a service account.
@csreegn The mTLS feature uses the service account to verify identity. If you feel that it's absolutely necessary for the argocd-repo-server workload to have automountServiceAccountToken: false, then you can use linkerd inject --disable-identity to disable mTLS for that particular workload. That way, you will still get all the other good stuff that comes with Linkerd, minus the mTLS identity feature.
Note that the disable-identity option isn't supported with auto-injection in stable-2.3.0. If you are using an auto-inject set-up, you can try one of the newer edge versions.
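For reference, a sketch of what the per-workload opt-out could look like on a newer edge release. The config.linkerd.io/disable-identity annotation and all names below are assumptions for illustration, not taken from this thread; with manual injection, the equivalent is running linkerd inject --disable-identity on the workload YAML before applying it.
# Sketch: opting a single workload out of mTLS identity while keeping
# automountServiceAccountToken: false. Assumes the config.linkerd.io/disable-identity
# annotation from newer edge releases; names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: repo-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: repo-server
  template:
    metadata:
      labels:
        app: repo-server
      annotations:
        linkerd.io/inject: enabled
        config.linkerd.io/disable-identity: "true"
    spec:
      automountServiceAccountToken: false
      containers:
      - name: app
        image: nginx:1.17   # placeholder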
I honestly don't know whether automountServiceAccountToken: false is needed or not, but I imagine that from a security perspective there will always be more cautious users who don't want to mount the SA token if they don't use it. (I am not one of them, but it makes sense to limit the attack surface of a pod.)
Also, if I deploy a service that has automountServiceAccountToken: false set, I think it would be acceptable for the pod injector to flip it to true if Linkerd definitely needs the service account. Otherwise using Linkerd is not transparent (i.e. abstraction leakage). Right now, if I don't explicitly set this flag to true, my pod won't even start, and I think that violates Linkerd's contract of being transparent. Maybe a graceful degradation scenario would be acceptable? (Don't provide mTLS for the pod if no SA token is present, but don't fail the startup.)
My temporary workaround is to set the flag to true, but my worry is for off-the-shelf (COTS) applications that are installed with Helm, or whose manifests are taken verbatim and applied (with kubectl apply -f): you can never be sure the app will start, because you don't know whether the flag is set (and with Helm it's also not certain whether it's configurable). One way to apply the workaround repeatably to a verbatim manifest is the kustomize overlay sketched below.
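This is a sketch, assuming the upstream manifest is saved locally as install.yaml and that the affected workload is argocd-repo-server, as mentioned above; adjust the target to whatever workload has the flag disabled.
# kustomization.yaml (sketch)
resources:
- install.yaml
patchesStrategicMerge:
- automount-patch.yaml

# automount-patch.yaml (re-enables the token mount for one workload)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-repo-server
spec:
  template:
    spec:
      automountServiceAccountToken: true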
btw, if the Linkerd proxy is outputting errors in its logs, the linkerd check --proxy command should fail. What does that command output on your end?
I think there is a trade-off here between the amount of auto-mutation we want to introduce into the proxy injection process and the user experience. Like you said, there are people who explicitly don't want to mount the SA token for security reasons. Personally, I would have a problem with some intermediary flipping on a security flag (in this case, automountServiceAccountToken) that I explicitly disabled. The same goes for automatically disabling Linkerd's mTLS. My preference: if a new workload I'm about to deploy doesn't comply with my cluster's security settings, I want the deployment to fail loudly.
The linkerd check --proxy command does not finish; it idles waiting for the proxies to be ready:
❯ linkerd check --proxy
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ control plane namespace exists
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
linkerd-data-plane
------------------
√ data plane namespace exists
\ data plane proxies are ready -- waiting for check to complete
That doesn't really communicate what is wrong, or how to resolve it.
Yeah, with the service account automount being a crucial precondition for mTLS to work, we can be a bit smarter with this check. @olix0r proposed that we update the proxy injector to fail the proxy injection if mTLS is enabled, but the pod spec has automountServiceAccountToken: false.
Would you be interested in submitting a PR? :wink: I'll be happy to show you where the webhook code is. It should be an easy fix, but no pressure.
Sure, I'll try to squeeze it in within a few weeks. Never coded in golang before, but there's a first time for everything!
What does the linkerd sidecar do with the service account token?
@mikedanese https://linkerd.io/2/features/automatic-mtls/#how-does-it-work
The Linkerd sidecar injector should inject an audience-bound token into the linkerd-proxy container. This would provide better security (a token bound to the identity service would not be replayable against the Kubernetes API, and vice versa) and would also remove Linkerd's dependency on automountServiceAccountToken. Doc here:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection
And more details on the differences to legacy tokens here:
https://github.com/kubernetes/kubernetes/issues/70679#issue-377670415
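For the record, a sketch of what such a projected, audience-bound token volume looks like per the doc above. The audience value, paths, and names here are illustrative assumptions, not something the injector emits today:
apiVersion: v1
kind: Pod
metadata:
  name: projected-token-demo         # hypothetical
spec:
  automountServiceAccountToken: false
  containers:
  - name: app
    image: nginx:1.17                # stand-in container
    volumeMounts:
    - name: identity-token
      mountPath: /var/run/linkerd/identity/token
      readOnly: true
  volumes:
  - name: identity-token
    projected:
      sources:
      - serviceAccountToken:
          audience: identity.l5d.io      # hypothetical audience for the identity service
          expirationSeconds: 86400       # token is rotated by the kubelet before expiry
          path: token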
That would be awesome!
In fact, the docs that @grampelberg linked to specifically call out time- and audience-bound SA tokens as future work once they're available "in future Kubernetes releases". @mikedanese I'm assuming this means they are ready for us to use in modern K8s versions?
trying to mesh: elasticsearch:7.4.1
โ time="2019-10-29T06:38:02Z" level=info msg="running version stable-2.6.0"
โ time="2019-10-29T06:38:02Z" level=info msg="Using with pre-existing key: /var/run/linkerd/identity/end-entity/key.p8"
โ time="2019-10-29T06:38:02Z" level=info msg="Using with pre-existing CSR: /var/run/linkerd/identity/end-entity/key.p8"
โ configuration error: InvalidEnvVar
โ ERR! [ 0.002000s] linkerd2_proxy::app::config Could not read LINKERD2_PROXY_IDENTITY_TOKEN_FILE: No such file or directory (os error 2)
โ ERR! [ 0.002021s] linkerd2_proxy::app::config LINKERD2_PROXY_IDENTITY_TOKEN_FILE="/var/run/secrets/kubernetes.io/serviceaccount/token" is not valid: InvalidTokenSource
โ
@masterkain are you using an operator? It wouldn't be related to the image you're using, but rather to how you're deploying it on the cluster.
@masterkain @grampelberg
For the Elastic Cloud on Kubernetes (cloud-on-k8s) installation, at least, the issue was also an explicit automountServiceAccountToken: false setting on the pod. Fixed in https://github.com/elastic/cloud-on-k8s/issues/1151
I just posted this in the linkerd slack channel, but here is a config that will allow linkerd 2.6 to inject with elastic-cloud-on-k8s, in case this helps anyone else seeing this issue.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      metadata:
        annotations:
          linkerd.io/inject: "enabled"
      spec:
        automountServiceAccountToken: true