Linkerd2: Can't connect to headless service

Created on 22 Aug 2019 · 11 Comments · Source: linkerd/linkerd2

Bug Report

What is the issue?

The Linkerd proxy returns 503 Service Unavailable when it should return the response from the external service.

How can it be reproduced?

apiVersion: v1
kind: Namespace
metadata:
  name: repro
  annotations:
    linkerd.io/inject: enabled
---
apiVersion: v1
kind: Service
metadata:
  name: some-external-svc
  namespace: repro
spec:
  ports:
  - port: 9753
    protocol: TCP
    targetPort: 80
---
apiVersion: v1
kind: Endpoints
metadata:
  name: some-external-svc
  namespace: repro
subsets:
- addresses:
  - ip: 1.1.1.1
  ports:
  - port: 80

Start an interactive shell in an Alpine pod in the repro namespace: kubectl run repro -n repro --rm -it --image=alpine --restart=Never --generator=run-pod/v1

Install curl in the pod: apk add curl

Request some-external-svc with curl: curl -IXGET some-external-svc:9753

Logs, error output, etc

Response from linkerd proxy: 503 Service Unavailable

Expected: a response with a status code below 500 from the external server.

`linkerd check` output

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

Environment

Kubernetes Version: 1.13.5
Cluster Environment: AKS
Host OS: Ubuntu
Linkerd version: stable-2.5.0

Possible solution

It is possible to get it to work if I pass a Host header with the target IP (e.g. Host: 1.1.1.1). However, this would negate the need for a headless service, the point of which is to provide a common cluster-internal name for the external service so that the other deployments in the cluster don't need to know the external IP.

Additional context

This scenario works just fine without the linkerd proxy.

area/controller area/proxy bug help wanted

JohannesEH

All 11 comments

You can use type: ExternalName as a workaround for now.
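
A minimal sketch of what such a Service might look like, assuming the external system is reachable under a DNS name (external.example.com below is a placeholder, not something from this report). ExternalName Services resolve as a DNS CNAME, and Kubernetes treats an IPv4 string in externalName as a DNS name rather than as an address, so a bare IP such as 1.1.1.1 would need a hostname first:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: some-external-svc
  namespace: repro
spec:
  type: ExternalName
  # Placeholder hostname (assumption); replace with the real DNS name
  # of the external service.
  externalName: external.example.com
```

Because this is only a DNS alias, the 9753 to 80 port mapping from the original Service does not carry over, and the manually created Endpoints object is not used. Clients would have to address the port the external service actually listens on.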

grampelberg on 22 Aug 2019

Thanks for catching this @JohannesEH.

wmorgan on 22 Aug 2019

> You can use type: ExternalName as a workaround for now.

@grampelberg thanks for the suggestion. You are absolutely right, and in most setups this is probably fairly easy to configure. In the setup I'm working with it is somewhat cumbersome. So in the spirit of "it just works" I would rather wait till linkerd supports our kind of setup, which I'm sure it will. :)

JohannesEH on 26 Aug 2019

@JohannesEH agreed, I wish that the k8s API's handling of service/endpoint combos was a little more robust. Shouldn't be a hard fix if you're interested in taking a look =)

grampelberg on 27 Aug 2019

@grampelberg My schedule is pretty full, but I might be able to find some time to look at it on Sunday. However, I must confess I have only a little experience with Go and zero experience with Rust, so I might be a little slow to get started. And I worry whether I can find enough time to really dig in and get the job done.

Do you have any pointers on where I should start looking?

JohannesEH on 28 Aug 2019

@adleong any pointers?

grampelberg on 28 Aug 2019

I think this is caused by logic in the destination service:

https://github.com/linkerd/linkerd2/blob/master/controller/api/destination/watcher/endpoints_watcher.go#L426

The destination service will only return endpoints which are pods. In this case, none of the endpoints are pods and so it returns an empty address set.
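
To illustrate the difference, compare how the two kinds of endpoint addresses look. This is a sketch under the assumption that the watcher recognizes pods through the address's targetRef field; the pod name and IP in the first fragment are made up for illustration:

```yaml
# Address as Kubernetes typically populates it for a Service with a selector:
# targetRef points at the backing Pod.
subsets:
- addresses:
  - ip: 10.244.1.17            # hypothetical pod IP
    targetRef:
      kind: Pod
      name: web-5d9c7b-abcde   # hypothetical pod name
      namespace: repro
  ports:
  - port: 80
---
# Address from the reproduction above: created by hand, no targetRef,
# so there is no Pod to associate it with.
subsets:
- addresses:
  - ip: 1.1.1.1
  ports:
  - port: 80
```

If that assumption holds, every address in the reproduction's Endpoints object would be skipped, which matches the empty address set and the resulting 503.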