Linkerd2: Can't connect to headless service

Created on 22 Aug 2019 · 11 Comments · Source: linkerd/linkerd2

Bug Report

What is the issue?

The Linkerd proxy returns 503 Service Unavailable, but it should return the response from the external service instead.

How can it be reproduced?

apiVersion: v1
kind: Namespace
metadata:
  name: repro
  annotations:
    linkerd.io/inject: enabled
---
apiVersion: v1
kind: Service
metadata:
  name: some-external-svc
  namespace: repro
spec:
  ports:
  - port: 9753
    protocol: TCP
    targetPort: 80
---
apiVersion: v1
kind: Endpoints
metadata:
  name: some-external-svc
  namespace: repro
subsets:
- addresses:
  - ip: 1.1.1.1
  ports:
  - port: 80

Attach a shell to an Alpine repro pod with the following command: kubectl run repro -n repro --rm -it --image=alpine --restart=Never --generator=run-pod/v1

Install curl in the pod with: apk add curl

Request some-external-svc with curl: curl -IXGET some-external-svc:9753

Logs, error output, etc

Response from the Linkerd proxy: 503 Service Unavailable

Expected: a response with a status code below 500 from the external server.

linkerd check output

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

Environment

  • Kubernetes Version: 1.13.5
  • Cluster Environment: AKS
  • Host OS: Ubuntu
  • Linkerd version: stable-2.5.0

Possible solution

It is possible to get it to work if I pass a Host header containing the target IP (e.g. Host: 1.1.1.1). However, this would negate the need for a headless service, the point of which is to provide a common cluster-internal name for the external service, so that the other cluster deployments don't need to know about the external IP.

Additional context

This scenario works just fine without the linkerd proxy.

Labels: area/controller, area/proxy, bug, help wanted

All 11 comments

You can use type: ExternalName as a workaround for now.

Thanks for catching this @JohannesEH.

"You can use type: ExternalName as a workaround for now."

@grampelberg thanks for the suggestion. You are absolutely right, and in most setups this is probably fairly easy to configure. In the setup I'm working with it is somewhat cumbersome. So in the spirit of "it just works" I would rather wait till linkerd supports our kind of setup, which I'm sure it will. :)

@JohannesEH agreed, I wish that the k8s API's handling of service/endpoint combos was a little more robust. Shouldn't be a hard fix if you're interested in taking a look =)

@grampelberg My schedule is pretty full, but I might be able to find some time to look at it on Sunday. However, I must confess I have only a little experience with Go and zero experience with Rust, so I might be a little slow to get started. And I worry about whether I can find enough time to really dig in and get the job done.

Do you have any pointers on where I should start looking?

@adleong any pointers?

I think this is caused by logic in the destination service:

https://github.com/linkerd/linkerd2/blob/master/controller/api/destination/watcher/endpoints_watcher.go#L426

The destination service will only return endpoints which are pods. In this case, none of the endpoints are pods and so it returns an empty address set.

Any chance of getting this into the next release? It looks like a blocker for us to use linkerd2.

@StupidScience it won't make the next stable (2.6) but it should be in the edge immediately after merging.

In the meantime, could anyone explain how the type: ExternalName workaround mentioned can be configured?

@robertgates55 2.6 went out a while ago. The workaround is to just have an ExternalName service instead of managing endpoints manually.
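
For readers asking how that workaround might look, here is a minimal sketch of an ExternalName Service that could stand in for the selector-less Service and the manual Endpoints object from the repro. The hostname external.example.com is a placeholder assumption, not something taken from this thread. Kubernetes expects externalName to be a DNS name rather than an IP address, so an address like 1.1.1.1 would need a resolvable hostname.

```yaml
# Sketch only: replaces both the Service and the Endpoints object from the repro.
apiVersion: v1
kind: Service
metadata:
  name: some-external-svc
  namespace: repro
spec:
  type: ExternalName
  # Assumption: placeholder hostname for the external server.
  # externalName should be a DNS name, not an IP address.
  externalName: external.example.com
```

An ExternalName Service is resolved through a DNS CNAME record and does no port mapping, so the 9753 to 80 translation from the original Service would not apply. Clients would most likely need to call the port the external server actually listens on, for example some-external-svc:80.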
