Cilium: Health endpoint does not bind to tcp6 in an IPv6-only environment

Created on 14 Sep 2020 · 21 Comments · Source: cilium/cilium

Bug report

General Information

Cilium version 1.8.3
Kernel version 5.8.7
Kubernetes 1.19.1
cilium-sysdump-20200914-131007.zip

How to reproduce the issue

Install K8s 1.19.1 using kubeadm with IPv6 only, skipping addon/kube-proxy
Install cilium 1.8.3 using the attached config
Monitor logs and READY state of "cilium" pod
- issue: cilium never becomes READY, health endpoint restarting endlessly
Create a pod like this and try to "ping6" your default gateway by IP (IPv6)
- issue: connections or pings to nodes outside the cluster not possible, even if IP ranges are fully routable

kind/bug kind/community-report needs/triage

lyind

Most helpful comment

Discussed offline with @tklauser . We will go with the proposal from https://github.com/cilium/cilium/issues/13165#issuecomment-693463210

Also, one addition that we need to perform is to change the liveness and readiness probes to perform the requests on 127.0.0.1 or ::1 depending on the enable-ipv4 flag. If that flag is set to false we should change the readiness probe to perform requests to ::1, otherwise it should default to 127.0.0.1 (as it works for both v4 and v6 environments).
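The selection rule described above could be sketched in Go roughly as follows. This is only an illustration, not Cilium's actual implementation: `probeHost` and `probeURL` are hypothetical helpers, and the port and path are placeholder values.

```go
package main

import (
	"fmt"
	"net"
)

// probeHost returns the loopback address a liveness or readiness probe
// should target. Hypothetical helper that mirrors the rule from the
// comment above: ::1 only when IPv4 is disabled, 127.0.0.1 otherwise.
func probeHost(enableIPv4 bool) string {
	if enableIPv4 {
		return "127.0.0.1"
	}
	return "::1"
}

// probeURL builds the URL a probe would request. net.JoinHostPort adds
// the square brackets that an IPv6 literal needs.
func probeURL(enableIPv4 bool, port, path string) string {
	return "http://" + net.JoinHostPort(probeHost(enableIPv4), port) + path
}

func main() {
	// Port and path are placeholders chosen for the example.
	fmt.Println(probeURL(true, "9876", "/healthz"))
	fmt.Println(probeURL(false, "9876", "/healthz"))
}
```

Keeping 127.0.0.1 as the default means existing IPv4 and dual-stack deployments see no change; only clusters that explicitly disable IPv4 get the ::1 probe target.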

aanm on 17 Sep 2020

👍2

All 21 comments

What I already tried:

configure node with single 10GbE interface instead of bonding: same result
set net.ipv4.conf.all.rp_filter=0: same result
configuring cilium to coexist with kube-proxy (default): same result
private (ULA) addresses for pod-network and services: same result

lyind on 14 Sep 2020

Thanks for the report!

Which guide did you follow to deploy Cilium? Should masquerading be enabled?

Also, does this setup work if you're not in IPv6 only?

pchaigno on 14 Sep 2020

I followed the QuickInstall guide initially, found that cilium-operator didn't work due to some side issue, looked at the objects, replaced a few "127.0.0.1" with "[::1]", took all the YAML, put some placeholders and integrated it into my automation. Then reinstalled the cluster. Initially everything looks good, but pods can't reach outside the cluster (i/o timeout). Also I am a bit disappointed that XDP acceleration isn't available when using the Linux bonding driver, but that's not Cilium's issue.

I had masquerading enabled in the "private IPv6 pod/service network" case, but when using global IPv6 addresses it shouldn't be required. Thus I turned it off.

I didn't try IPv4, as I only have IPv6 here and no IPv4 gateway. (There is only a management box that runs an internal IPv4 network for PXE boot.)

I want a cluster of a few machines which exchange routes to each other's non-overlapping podCIDRs, where the pods have IPv6 addresses only (at least for non-loopback interfaces).

Since I uploaded that extensive sysdump, are there any obvious misconfigurations visible (like wrong combination of settings in the config map)?

lyind on 14 Sep 2020

Is it enough if I assign IPv4 podCIDR ranges in the CiliumNodes and configure configmap/cilium-config accordingly, to test if IPv4 works?

lyind on 14 Sep 2020

@lyind it should be enough if you run with enable-ipv6: false set in the configmap/cilium-config

aanm on 14 Sep 2020

@pchaigno I put enable-ipv6: false and enable-ipv4: true.

The cilium pods still do not become READY (waited long enough) and continuously restart. Cilium has stopped handing out IPv6 addresses and forwarding IPv6 traffic. This means pods like coredns don't work anymore (no route to host).

lyind on 14 Sep 2020

I noticed that the agent panics when I set enable-health-check-nodeport: false (from true):

level=info msg="Envoy: Starting xDS gRPC server listening on /var/run/cilium/xds.sock" subsys=envoy-manager
level=info msg="Restoring endpoints..." subsys=daemon
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1d3aa37]

goroutine 159 [running]:
github.com/cilium/cilium/pkg/service.(*Service).UpsertService(0xc000310d90, 0xc000e0cb40, 0x0, 0x0, 0x0)
    /go/src/github.com/cilium/cilium/pkg/service/service.go:289 +0xdf7
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).addK8sSVCs(0xc000e40000, 0xc000b44160, 0xa, 0xc000b44170, 0x7, 0x0, 0xc000eb0080, 0xc000888000, 0xc0004d3798, 0x43e2ac)
    /go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:810 +0x46a
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).k8sServiceHandler.func1(0x0, 0xc000b44160, 0xa, 0xc000b44170, 0x7, 0xc000eb0080, 0x0, 0xc000888000, 0xc000da6450)
    /go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:502 +0xa4e
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).k8sServiceHandler(0xc000e40000)
    /go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:545 +0x95
created by github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).RunK8sServiceHandler
    /go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:550 +0x3f

Not sure why someone would want to disable the health check, though.

lyind on 14 Sep 2020

@pchaigno OK, fixed it. Replaced the ::1 in the DaemonSet/cilium livenessProbe address with 127.0.0.1 again and it works. Seems like the cilium health HTTP server only listens on 0.0.0.0 instead of ::.

Now the only remaining question is why pod traffic can't leave the cluster, even though the pods have global IPs. But there are open issues already, so I am going to wait for now.

Thank you!

lyind on 14 Sep 2020

This still needs to be addressed. Thanks for testing this and finding out the root cause.

aanm on 14 Sep 2020

@aanm Thanks for the great support.

Regarding the fix: when writing or configuring services (like web servers), one should usually bind to :: when IPv6 is available and fall back to 0.0.0.0 only if it is not. Binding exclusively to :: works (it also listens on IPv4) but makes it impossible to run on a kernel without IPv6 support (rare, but it happens).
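The "try ::, otherwise 0.0.0.0" rule could look roughly like this in Go. It is a sketch, not what Cilium does: `listenAny` is a hypothetical helper, and it relies on my understanding that Go's net package asks for a dual-stack socket when the network is "tcp" and the address is a wildcard, where the OS supports that.

```go
package main

import (
	"fmt"
	"net"
)

// listenAny tries the IPv6 wildcard address first and falls back to the
// IPv4 wildcard if that fails, for example on a kernel without IPv6.
func listenAny(port string) (net.Listener, error) {
	// Network "tcp" with a wildcard address: dual-stack where available.
	ln, err := net.Listen("tcp", net.JoinHostPort("::", port))
	if err == nil {
		return ln, nil
	}
	// "tcp4" makes the fallback explicitly IPv4-only.
	return net.Listen("tcp4", net.JoinHostPort("0.0.0.0", port))
}

func main() {
	ln, err := listenAny("0") // port 0 lets the kernel pick a free port
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("listening on", ln.Addr())
}
```

Note that this binds to all interfaces; for a loopback-only endpoint the same fallback idea applies with ::1 and 127.0.0.1, but those need two separate listeners because no single socket covers both loopback addresses.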

lyind on 14 Sep 2020

@lyind just to clarify, you have put ::1 in the liveness probe of the Cilium daemon set and that fixed the issue of the health probe being restarted all the time?

aanm on 15 Sep 2020

@aanm I initially configured the livenessProbe as ::1 thinking something like "Oh yay! IPv6 localhost!". But the service targeted by the livenessProbe doesn't bind to :: or ::1 but to IPv4 only. So I corrected my own mistake. It would be nice of course if all services listened to :: or ::1 on IPv6 enabled hosts.

lyind on 15 Sep 2020

@tklauser can you take a look? It's really odd why localhost does not bind to v6 address. I've tried it on minikube but I realized that does not have v6 localhost address.

aanm on 15 Sep 2020

👀1 👍1

@tklauser can you take a look? It's really odd why localhost does not bind to v6 address. I've tried it on minikube but I realized that does not have v6 localhost address.

It looks like this is because of golang/go#9334, i.e. there is currently no safe way to listen on both tcp4 and tcp6. I'm not really sure what to do here except suggesting users leave the livenessProbe and readinessProbe set to 127.0.0.1 even if they're using IPv6-only stacks.

tklauser on 16 Sep 2020

@tklauser Thank you for following up on this.

Reading through the Go net.Listen() issue and similar ones for other projects (e.g. https://bugs.python.org/issue3213), I found that the operating system situation makes this hard to solve correctly in user space (also see https://serverfault.com/questions/21657/semantics-of-and-0-0-0-0-in-dual-stack-oses ).

Also as a note to myself: the way to go should be (language/framework/OS agnostic):

Make bind addresses configurable by the admin/user/operator
[optional] Bind to each supported address family (AF) and handle errors gracefully, i.e. call bind() with 127.0.0.1 and ::1 (localhost) or 0.0.0.0 and :: (any-address). This is easy to do when using event-based I/O via select()/poll() in C. Something similar can be done in Go by using one shared handler in multiple calls to net.Listen().

Go example code:

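package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("/", handleRequest)

    // Serve the same handler on the IPv4 and IPv6 loopback addresses.
    // If one address family is unavailable, its error is logged and
    // the other listener keeps running.
    go func() {
        log.Print(http.ListenAndServe("127.0.0.1:8001", mux))
    }()

    go func() {
        log.Print(http.ListenAndServe("[::1]:8001", mux))
    }()

    // Block forever; the listeners run in their own goroutines.
    select {}
}

// handleRequest echoes the client's address so the caller can see
// which address family was used.
func handleRequest(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hello %s\n", r.RemoteAddr)
}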

The above server listens on both the IPv4 and IPv6 localhost addresses, or on whichever of them is available, which can be tested as follows:

$ curl 127.0.0.1:8001 ; curl [::1]:8001
Hello 127.0.0.1:37658
Hello [::1]:50176
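A variant of the same idea that makes the "handle errors gracefully" part explicit might look like the following. It is a sketch only: `listenEach` is a hypothetical helper, the address list stands in for whatever the operator configures, and `main` issues one request per bound address as a self-check instead of serving forever.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
)

// listenEach opens a listener on every address that can be bound.
// Addresses that fail (e.g. ::1 on a host without IPv6) are logged and
// skipped; an error is returned only if none could be bound.
func listenEach(addrs []string) ([]net.Listener, error) {
	var lns []net.Listener
	for _, addr := range addrs {
		ln, err := net.Listen("tcp", addr)
		if err != nil {
			log.Printf("skipping %s: %v", addr, err)
			continue
		}
		lns = append(lns, ln)
	}
	if len(lns) == 0 {
		return nil, errors.New("no address could be bound")
	}
	return lns, nil
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello %s\n", r.RemoteAddr)
	})

	// In a real service this list would come from configuration.
	lns, err := listenEach([]string{"127.0.0.1:8001", "[::1]:8001"})
	if err != nil {
		log.Fatal(err)
	}
	for _, ln := range lns {
		go http.Serve(ln, mux) // one shared handler for all listeners
	}

	// Self-check: request each bound address once, then exit.
	for _, ln := range lns {
		resp, err := http.Get("http://" + ln.Addr().String() + "/")
		if err != nil {
			log.Fatal(err)
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Print(string(body))
	}
}
```

Separating the bind step from the serve step means startup can fail fast when no address is usable, while a missing address family only produces a log line.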

lyind on 16 Sep 2020

BTW, thanks to all people working on Cilium, it is the CNI plugin that makes Kubernetes networking useful and easy to handle at the same time.

lyind on 16 Sep 2020

🎉1

@lyind Thanks for investigating further and for providing the example code. I think we should be able to use that approach to listen on both 127.0.0.1 and ::1 for the health check.

tklauser on 16 Sep 2020

@tklauser can you take a look? It's really odd why localhost does not bind to v6 address. I've tried it on minikube but I realized that does not have v6 localhost address.

It looks like this is because of golang/go#9334, i.e. there is currently no safe way to listen on both tcp4 and tcp6. I'm not really sure what to do here except suggesting users leave the livenessProbe and readinessProbe set to 127.0.0.1 even if they're using IPv6-only stacks.

@tklauser why can't we listen on :8080 (instead of localhost:8080)?

aanm on 16 Sep 2020

@tklauser why can't we listen on :8080 (instead of localhost:8080)?

I tried ":8080" and it binds to "[::]:8080" on my system which is convenient. I don't know if the health endpoint should be a public interface, though.

Also, using ":8080" may cause issues if bindv6only=1 (Linux) or if a non-dualstack IPv6 socket is used (Windows).
While I think bindv6only=1 is rare, I imagine someone could actually try to run K8s on some more recent version of Windows.

lyind on 16 Sep 2020

@tklauser can you take a look? It's really odd why localhost does not bind to v6 address. I've tried it on minikube but I realized that does not have v6 localhost address.

It looks like this is because of golang/go#9334, i.e. there is currently no safe way to listen on both tcp4 and tcp6. I'm not really sure what to do here except suggesting users leave the livenessProbe and readinessProbe set to 127.0.0.1 even if they're using IPv6-only stacks.

@tklauser why can't we listen on :8080 (instead of localhost:8080)?

@aanm as @lyind pointed out in https://github.com/cilium/cilium/issues/13165#issuecomment-693571908, using :8080 will bind on all available interfaces, i.e. it will become a public endpoint. The original intention was to restrict the health check endpoint to the loopback interface. Not sure whether we want to keep that? I guess the attack surface is rather minimal and we might restrict access to it based on the http.Request's RemoteAddr field. Thoughts?

tklauser on 17 Sep 2020

Discussed offline with @tklauser . We will go with the proposal from https://github.com/cilium/cilium/issues/13165#issuecomment-693463210

aanm on 17 Sep 2020

👍2
