General Information
How to reproduce the issue
What I already tried:
net.ipv4.conf.all.rp_filter=0: same result
Thanks for the report!
Which guide did you follow to deploy Cilium? Should masquerading be enabled?
Also, does this setup work if you're not in IPv6 only?
I followed the QuickInstall guide initially, found that cilium-operator didn't work due to some side issue, looked at the objects, replaced a few "127.0.0.1" with "[::1]", took all the YAML, put some placeholders and integrated it into my automation. Then reinstalled the cluster. Initially everything looks good, but pods can't reach outside the cluster (i/o timeout). Also I am a bit disappointed that XDP acceleration isn't available when using the Linux bonding driver, but that's not Cilium's issue.
I had masquerading enabled in the "private IPv6 pod/service network" case, but when using global IPv6 addresses it shouldn't be required. Thus I turned it off.
I didn't try IPv4, as I only have IPv6 here and no IPv4 gateway. (There is only a management box that runs an internal IPv4 network for PXE boot.)
I want a cluster of a few machines that exchange routes to each other's non-overlapping podCIDRs, where the pods have IPv6 addresses only (at least on non-loopback interfaces).
Since I uploaded that extensive sysdump, are there any obvious misconfigurations visible (like wrong combination of settings in the config map)?
Is it enough if I assign IPv4 podCIDR ranges in the CiliumNodes and configure configmap/cilium-config accordingly, to test if IPv4 works?
@lyind it should be enough if you run with enable-ipv6: false set in the configmap/cilium-config
@pchaigno I put enable-ipv6: false and enable-ipv4: true.
The cilium pods still do not become READY (I waited long enough) and restart continuously. Cilium has stopped handing out IPv6 addresses and forwarding IPv6 traffic. This means pods like coredns don't work anymore (no route to host).
I noticed that, when I set enable-health-check-nodeport: false (from true):
level=info msg="Envoy: Starting xDS gRPC server listening on /var/run/cilium/xds.sock" subsys=envoy-manager
level=info msg="Restoring endpoints..." subsys=daemon
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1d3aa37]
goroutine 159 [running]:
github.com/cilium/cilium/pkg/service.(*Service).UpsertService(0xc000310d90, 0xc000e0cb40, 0x0, 0x0, 0x0)
/go/src/github.com/cilium/cilium/pkg/service/service.go:289 +0xdf7
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).addK8sSVCs(0xc000e40000, 0xc000b44160, 0xa, 0xc000b44170, 0x7, 0x0, 0xc000eb0080, 0xc000888000, 0xc0004d3798, 0x43e2ac)
/go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:810 +0x46a
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).k8sServiceHandler.func1(0x0, 0xc000b44160, 0xa, 0xc000b44170, 0x7, 0xc000eb0080, 0x0, 0xc000888000, 0xc000da6450)
/go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:502 +0xa4e
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).k8sServiceHandler(0xc000e40000)
/go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:545 +0x95
created by github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).RunK8sServiceHandler
/go/src/github.com/cilium/cilium/pkg/k8s/watchers/watcher.go:550 +0x3f
Not sure why someone would want to disable the health check, though.
@pchaigno OK, fixed it. Replaced the ::1 in the DaemonSet/cilium livenessProbe address with 127.0.0.1 again and it works. Seems like the cilium health HTTP server only listens on 0.0.0.0 instead of ::.
Now the only remaining question is why pod traffic can't leave the cluster even though the pods have global IPs. But there are open issues for that already, so I am going to wait for now.
Thank you!
This still needs to be addressed. Thanks for testing this and finding the root cause.
@aanm Thanks for the great support.
Regarding the fix: when writing or configuring services (like web servers), one should usually bind to :: when IPv6 is available and fall back to 0.0.0.0 only if it is not. Binding exclusively to :: works (it also listens on IPv4) but makes it impossible to run on a kernel without IPv6 support (rare, but it happens).
@lyind just to clarify, you had put ::1 in the liveness probe of the Cilium daemon set, and that fixed the issue of the health probe being restarted all the time?
@aanm I initially configured the livenessProbe as ::1, thinking something like "Oh yay! IPv6 localhost!". But the service targeted by the livenessProbe doesn't bind to :: or ::1, only to IPv4. So I corrected my own mistake. It would be nice, of course, if all services listened on :: or ::1 on IPv6-enabled hosts.
@tklauser can you take a look? It's really odd that localhost does not bind to a v6 address. I tried it on minikube but realized that it does not have a v6 localhost address.
It looks like this is because of golang/go#9334, i.e. there is currently no safe way to listen on both tcp4 and tcp6. I'm not really sure what to do here except suggest that users leave the livenessProbe and readinessProbe set to 127.0.0.1 even if they're using IPv6-only stacks.
@tklauser Thank you for following up on this.
Reading through the Go net.Listen() issue and similar ones for other projects (e.g. https://bugs.python.org/issue3213), I found that the operating system situation makes this hard to solve correctly in user space (also see https://serverfault.com/questions/21657/semantics-of-and-0-0-0-0-in-dual-stack-oses ).
Also as a note to myself: the way to go should be (language/framework/OS agnostic):
bind() with 127.0.0.1 and ::1 (localhost) or 0.0.0.0 and :: (any-address). This is easy to do when using event-based I/O via select()/poll() in C. Something similar can be done in Go by using one shared handler in multiple calls to net.Listen().
Go example code:
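package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Never signalled; blocks main so both servers keep running.
	finish := make(chan bool)

	// One shared handler serves both listeners.
	mux := http.NewServeMux()
	mux.HandleFunc("/", handleRequest)

	// IPv4 loopback listener.
	go func() {
		log.Print(http.ListenAndServe("127.0.0.1:8001", mux))
	}()

	// IPv6 loopback listener; if IPv6 is unavailable this only logs an error.
	go func() {
		log.Print(http.ListenAndServe("[::1]:8001", mux))
	}()

	<-finish
}

// handleRequest echoes the client's address, which shows the address family used.
func handleRequest(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello %s\n", r.RemoteAddr)
}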
The above server listens on both IPv4 and IPv6 localhost, or on whichever of the two is available, which can be tested as follows:
$ curl 127.0.0.1:8001 ; curl [::1]:8001
Hello 127.0.0.1:37658
Hello [::1]:50176
BTW, thanks to all people working on Cilium, it is the CNI plugin that makes Kubernetes networking useful and easy to handle at the same time.
@lyind Thanks for investigating further and for providing the example code. I think we should be able to use that approach to listen on both 127.0.0.1 and ::1 for the health check.
@tklauser why can't we listen on :8080 (instead of localhost:8080)?
I tried ":8080" and it binds to "[::]:8080" on my system which is convenient. I don't know if the health endpoint should be a public interface, though.
Also, using ":8080" may cause issues if bindv6only=1 (Linux) or if a non-dualstack IPv6 socket is used (Windows).
While I think bindv6only=1 is rare, I imagine someone could actually try to run K8s on some more recent version of Windows.
@aanm as @lyind pointed out in https://github.com/cilium/cilium/issues/13165#issuecomment-693571908, using :8080 will bind on all available interfaces, i.e. it will become a public endpoint. The original intention was to restrict the health check endpoint to the loopback interface. Not sure whether we want to keep that? I guess the attack surface is rather minimal, and we might restrict access to it based on the http.Request's RemoteAddr field. Thoughts?
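For illustration, restricting access by RemoteAddr could be sketched as a small middleware like the one below. The names isLoopback and loopbackOnly, the /healthz path, and the sample addresses are assumptions made for this example; nothing here is taken from the Cilium code base.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// isLoopback (hypothetical helper) reports whether a RemoteAddr of the
// form "host:port" refers to a loopback address (127.0.0.0/8 or ::1).
func isLoopback(remoteAddr string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return false
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

// loopbackOnly wraps next and answers 403 to any non-loopback client.
func loopbackOnly(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !isLoopback(r.RemoteAddr) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	h := loopbackOnly(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))

	// Simulate clients from different addresses without opening sockets.
	for _, addr := range []string{"127.0.0.1:40000", "[::1]:40000", "192.0.2.10:40000"} {
		req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
		req.RemoteAddr = addr
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
		fmt.Println(addr, rec.Code)
	}
}
```

The two loopback clients get status 200 and the 192.0.2.10 client gets 403. Compared with binding to loopback only, this check runs after the connection has already been accepted, so the port remains reachable from other hosts even though requests are refused.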
Discussed offline with @tklauser . We will go with the proposal from https://github.com/cilium/cilium/issues/13165#issuecomment-693463210
Also, one addition that we need to perform is to change the liveness and readiness probes to perform the requests on 127.0.0.1 or ::1 depending on the enable-ipv4 flag. If that flag is set to false we should change the readiness probe to perform requests to ::1, otherwise it should default to 127.0.0.1 (as it works for both v4 and v6 environments).
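The probe address selection described above can be sketched as follows. This is only an illustration of the rule: the function names probeHost and probeURL, the port 9876, and the /healthz path are placeholders chosen for the example, not values confirmed in this thread.

```go
package main

import (
	"fmt"
	"net"
)

// probeHost (hypothetical helper) returns the loopback address the
// liveness and readiness probes should target: 127.0.0.1 by default,
// and ::1 only when IPv4 is disabled.
func probeHost(enableIPv4 bool) string {
	if enableIPv4 {
		return "127.0.0.1"
	}
	return "::1"
}

// probeURL builds the probe URL; JoinHostPort adds the brackets that an
// IPv6 literal needs in front of the port.
func probeURL(enableIPv4 bool, port, path string) string {
	return "http://" + net.JoinHostPort(probeHost(enableIPv4), port) + path
}

func main() {
	// Port and path are placeholders for this sketch.
	fmt.Println(probeURL(true, "9876", "/healthz"))  // http://127.0.0.1:9876/healthz
	fmt.Println(probeURL(false, "9876", "/healthz")) // http://[::1]:9876/healthz
}
```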