General Information
Updating systemd from 244.2-2 to 245.2-1 or 245-3 on Arch breaks pod to out-of-node IPv4 traffic. Reverting to 244.2-2 and rebooting fixes the problem. (IPv6 keeps working on all versions.)
I did a sysctl -a diff with 244 vs 245 with cilium running (ready):
< net.ipv4.conf.all.promote_secondaries = 1
> net.ipv4.conf.all.promote_secondaries = 0
< net.ipv4.conf.cilium_host.accept_source_route = 1
> net.ipv4.conf.cilium_host.accept_source_route = 0
< net.ipv4.conf.cilium_host.promote_secondaries = 0
> net.ipv4.conf.cilium_host.promote_secondaries = 1
< net.ipv4.conf.cilium_host.rp_filter = 0
> net.ipv4.conf.cilium_host.rp_filter = 2
< net.ipv4.conf.cilium_net.accept_source_route = 1
> net.ipv4.conf.cilium_net.accept_source_route = 0
< net.ipv4.conf.cilium_net.promote_secondaries = 0
> net.ipv4.conf.cilium_net.promote_secondaries = 1
< net.ipv4.conf.default.accept_source_route = 1
> net.ipv4.conf.default.accept_source_route = 0
< net.ipv4.conf.default.promote_secondaries = 0
> net.ipv4.conf.default.promote_secondaries = 1
< net.ipv4.conf.default.rp_filter = 0
> net.ipv4.conf.default.rp_filter = 2
< net.ipv4.conf.ens192.accept_source_route = 1
> net.ipv4.conf.ens192.accept_source_route = 0
< net.ipv4.conf.ens192.promote_secondaries = 0
> net.ipv4.conf.ens192.promote_secondaries = 1
< net.ipv4.conf.ens192.rp_filter = 0
> net.ipv4.conf.ens192.rp_filter = 2
< net.ipv4.conf.lo.accept_source_route = 1
> net.ipv4.conf.lo.accept_source_route = 0
< net.ipv4.conf.lo.promote_secondaries = 0
> net.ipv4.conf.lo.promote_secondaries = 1
< net.ipv4.conf.lo.rp_filter = 0
> net.ipv4.conf.lo.rp_filter = 2
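Most of the thread below turns on the rp_filter lines of this diff. As a minimal sketch, assuming the two `sysctl -a` dumps were saved to files (the file names and the helper name `rp_filter_changes` are made up for illustration), the diff can be narrowed to just those keys:

```shell
#!/bin/sh
# Hypothetical helper: keep only the rp_filter lines of a `diff` read on stdin.
rp_filter_changes() {
    grep -E '^[<>] net\.ipv4\.conf\..*\.rp_filter = '
}

# Example usage (file names are assumptions, not from the original report):
#   sysctl -a 2>/dev/null | sort > sysctl-244.txt   # while running systemd 244
#   sysctl -a 2>/dev/null | sort > sysctl-245.txt   # after upgrading to 245
#   diff sysctl-244.txt sysctl-245.txt | rp_filter_changes
```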
The breaking change is in /usr/lib/sysctl.d/50-default.conf:
https://github.com/systemd/systemd/commit/5d4fc0e665a3639f92ac880896c56f9533441307#diff-7816eed8ca6324f23a690cc5f58e6bf7
A minimal fix for 245 is:
echo 'net.ipv4.conf.lxc*.rp_filter = 0' | sudo tee -a /etc/sysctl.d/90-override.conf && sudo systemctl start systemd-sysctl
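Since `tee -a` appends a duplicate line every time the command is re-run, an idempotent variant may be preferable. The following is only a sketch: the helper name `add_rp_filter_override` is hypothetical, and it assumes the override file and key pattern from the command above.

```shell
#!/bin/sh
# Hypothetical helper: add "net.ipv4.conf.<pattern>.rp_filter = 0" to a
# sysctl.d file for each pattern, skipping lines that are already present.
add_rp_filter_override() {
    conf_file=$1
    shift
    touch "$conf_file" || return 1
    for pattern in "$@"; do
        line="net.ipv4.conf.${pattern}.rp_filter = 0"
        # -x: match the whole line, -F: treat the glob as a fixed string
        grep -qxF -- "$line" "$conf_file" || printf '%s\n' "$line" >> "$conf_file"
    done
}

# Example usage (needs root; quote the pattern so the shell does not expand it):
#   add_rp_filter_override /etc/sysctl.d/90-override.conf 'lxc*'
#   systemctl restart systemd-sysctl
```

Note that systemd-sysctl is a oneshot unit that stays active after boot, so on an already running machine `systemctl start` may be a no-op, while `systemctl restart systemd-sysctl` re-applies the files.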
There was a systemd bug that we hit in the past. Although it is completely unrelated, it might help to figure out the underlying issue: https://github.com/cilium/cilium/pull/8351
@nberlee Thanks for reporting!
The systemd behavior is very annoying. At the very least, we should warn users when systemd >= 245 is detected.
Add log statement if user is running with systemd version
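A rough sketch of what such a check could look like from a shell, for illustration only: it is not the check Cilium implements, and both helper names are hypothetical. The version number alone is also not conclusive, because a distribution can patch the shipped sysctl defaults.

```shell
#!/bin/sh
# Hypothetical helper: print the major version from `systemctl --version`
# output read on stdin (first line looks like "systemd 245 (245.6-8-arch)").
systemd_major_version() {
    awk 'NR == 1 && $1 == "systemd" { sub(/[^0-9].*/, "", $2); print $2 }'
}

# Hypothetical helper: succeed (exit 0) when the version is 245 or newer.
systemd_may_set_rp_filter() {
    [ "${1:-0}" -ge 245 ]
}

# Example usage:
#   ver=$(systemctl --version | systemd_major_version)
#   if systemd_may_set_rp_filter "$ver"; then
#       echo "warning: systemd $ver may set rp_filter=2 via 50-default.conf" >&2
#   fi
```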
I have an Ubuntu 20.04 development machine that I use, where I run microk8s.
I've installed Cilium v1.8.1:
$ cilium version
Client: 1.6.8 f534e98df 2020-03-25T13:32:35+01:00 go version go1.12.17 linux/amd64
Daemon: 1.8.1 5ce2bc7b3 2020-07-02T20:04:47+02:00 go version go1.14.4 linux/amd64
It's running systemd 245:
$ systemctl --version
systemd 245 (245.4-4ubuntu3.2)
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid
Here's the config (I don't recall whether I modified these, but they seem consistent with the case where connectivity was not working for you):
$ grep rp_filter /etc/sysctl.d/*
/etc/sysctl.d/10-network-security.conf:net.ipv4.conf.default.rp_filter=2
/etc/sysctl.d/10-network-security.conf:net.ipv4.conf.all.rp_filter=2
/etc/sysctl.d/99-sysctl.conf:#net.ipv4.conf.default.rp_filter=1
/etc/sysctl.d/99-sysctl.conf:#net.ipv4.conf.all.rp_filter=1
Interestingly it seems to be set to 0, although I've rebooted this machine recently:
$ sysctl net.ipv4.conf.all.rp_filter
net.ipv4.conf.all.rp_filter = 0
I note that parts of Cilium now disable rp_filter:
$ git grep rp_filter pkg/datapath
pkg/datapath/connector/add.go: return sysctl.Disable(fmt.Sprintf("net.ipv4.conf.%s.rp_filter", ifName))
pkg/datapath/loader/base.go: {"net.ipv4.conf.all.rp_filter", "0", false},
Deploying the single-node-connectivity YAML I see that external connectivity works:
https://github.com/cilium/cilium/blob/master/examples/kubernetes/connectivity-check/connectivity-check-single-node.yaml
$ k get po | grep external
pod-to-external-1111-75df4847d7-tjxpq 1/1 Running 1 23m
pod-to-external-fqdn-allow-google-cnp-77d7586f58-9l5z6 1/1 Running 1 23m
So I think this issue is resolved as of the latest Cilium releases?
I will try 1.8.2 tomorrow evening, but I just tested it on 1.7.6 and it is still broken.
Also, 1.7.6 seems to have the same interface-specific line:
$ grep -ri rp_filter
pkg/endpoint/connector/add.go: return sysctl.Disable(fmt.Sprintf("net.ipv4.conf.%s.rp_filter", ifName))
pkg/datapath/loader/base.go: {"net.ipv4.conf.all.rp_filter", "0", false},
My systemd version right now:
$ systemctl --version
systemd 245 (245.6-8-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid
Status of rp_filter:
$ sysctl -a | grep \\.rp_filter
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.cilium_host.rp_filter = 2
net.ipv4.conf.cilium_net.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.ens192.rp_filter = 2
net.ipv4.conf.lo.rp_filter = 2
net.ipv4.conf.lxc4d36398ebb25.rp_filter = 2
net.ipv4.conf.lxc6aaba34c27a9.rp_filter = 2
net.ipv4.conf.lxc_health.rp_filter = 0
net.ipv4.conf.lxca210cbb192a4.rp_filter = 2
net.ipv4.conf.lxca3dfd01c8fff.rp_filter = 2
net.ipv4.conf.lxcebe3792a5b6a.rp_filter = 2
net.ipv4.conf.lxcece74f373c8f.rp_filter = 2
net.ipv4.conf.lxcfe2f3a81b538.rp_filter = 2
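These values need to be read together: for source validation on an interface, the kernel uses the larger of conf/all/rp_filter and conf/<interface>/rp_filter, so all = 0 does not relax an interface that is itself set to 2. A small sketch that computes the effective value per interface from output like the above (the helper name is hypothetical, and it assumes interface names without dots):

```shell
#!/bin/sh
# Hypothetical helper: read "net.ipv4.conf.<iface>.rp_filter = <n>" lines on
# stdin and print "<iface> <effective value>", where the effective value is
# max(all, iface). The "all" and "default" pseudo-interfaces are skipped.
effective_rp_filter_report() {
    awk -F' = ' '
        /\.rp_filter = / {
            split($1, parts, ".")      # net, ipv4, conf, <iface>, rp_filter
            value[parts[4]] = $2 + 0
        }
        END {
            all = value["all"] + 0
            for (iface in value) {
                if (iface == "all" || iface == "default") continue
                eff = (value[iface] > all) ? value[iface] : all
                print iface, eff
            }
        }' | LC_ALL=C sort
}

# Example usage:
#   sysctl -a 2>/dev/null | grep '\.rp_filter' | effective_rp_filter_report
```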
Pinging an outside destination IP:
$ kubectl run --restart=Never -it --rm --image=alpine test2 ash
If you don't see a command prompt, try pressing enter.
/ # ping 9.9.9.9
PING 9.9.9.9 (9.9.9.9): 56 data bytes
^C
--- 9.9.9.9 ping statistics ---
19 packets transmitted, 0 packets received, 100% packet loss
/ # exit
pod "test2" deleted
(it works fine with my 90-override.conf described in the second post of this issue)
Maybe Ubuntu 20.04 has a different /usr/lib/sysctl.d/50-default.conf?
$ grep rp_filter /usr/lib/sysctl.d/50-default.conf
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.*.rp_filter = 2
-net.ipv4.conf.all.rp_filter
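A plain `grep rp_filter` also matches commented-out lines and the `-key` exclusion line, which makes it harder to compare distributions. The sketch below lists only the active assignments in the files it is given; the helper name is hypothetical and the file paths in the usage example are the ones mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print "file: line" for every active (uncommented)
# rp_filter assignment in the given sysctl.d files. Lines starting with "#"
# or ";" and exclusion lines without "=" are not reported.
active_rp_filter_assignments() {
    awk '/^[ \t]*-?net\.ipv4\.conf\.[^=]*rp_filter[ \t]*=/ { print FILENAME ": " $0 }' "$@"
}

# Example usage:
#   active_rp_filter_assignments /usr/lib/sysctl.d/50-default.conf /etc/sysctl.d/*.conf
```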
Maybe Ubuntu 20.04 has a different /usr/lib/sysctl.d/50-default.conf
Good spotting. Ubuntu seems to carry a patch to systemd to explicitly remove those lines. I downloaded the tar from the ubuntu packages site and:
$ grep -R rp_filter systemd_245.4-4ubuntu3.2/debian/patches/*
systemd_245.4-4ubuntu3.2/debian/patches/debian/UBUNTU-drop-kernel.-settings-from-sysctl-defaults-shipped.patch:-net.ipv4.conf.default.rp_filter = 2
systemd_245.4-4ubuntu3.2/debian/patches/debian/UBUNTU-drop-kernel.-settings-from-sysctl-defaults-shipped.patch:-net.ipv4.conf.*.rp_filter = 2
systemd_245.4-4ubuntu3.2/debian/patches/debian/UBUNTU-drop-kernel.-settings-from-sysctl-defaults-shipped.patch:--net.ipv4.conf.all.rp_filter
Furthermore at least in my environment networkd doesn't seem to be enabled:
$ networkctl | grep lxc2cd6411
WARNING: systemd-networkd is not running, output will be incomplete.
357 lxc2cd6411832fb ether n/a unmanaged
I get the same "WARNING: systemd-networkd is not running, output will be incomplete. [...] unmanaged" on Arch Linux. It seems that rp_filter is being set by the systemd-sysctl service.
@joestringer What does systemctl status systemd-sysctl return on your machine?
@brb it's Active (exited), but per my last post I think the Ubuntu version of systemd-sysctl won't apply rp_filter by default. That seems like it explains the difference in behaviour to me.
per my last post I think the Ubuntu version of systemd-sysctl won't apply rp_filter by default
@joestringer Ah, damn, missed that comment. Yeah, that explains the difference.
Yep, I hit the same problem during my first Cilium installation (and found this bug right after).
I just overrode the rp_filter settings in /etc/sysctl.d/90-override.conf
Gentoo, Systemd 245.5
I can confirm this issue still happens on Cilium 1.8.2 with the latest Flatcar 2605 release channel.
I've hit the same problem on Ubuntu 20.04.
For future googlers on Hetzner systems: check /etc/sysctl.d/99-hetzner.conf, they set net.ipv4.conf.all.rp_filter=1 there.
I am not sure if this is related, but after attempting a 1.8.2 -> 1.8.3 upgrade on Ubuntu 20.04 / 5.4.0-1021-aws, the agent pods end up crashing with the following logs. After rolling back to 1.8.2, the pods are healthy and do not show those errors. 🤷
{"error":"Failed to sysctl -w net.ipv4.conf.eth0.rp_filter=2: could not open the sysctl file /proc/sys/net/ipv4/conf/eth0/rp_filter: open /proc/sys/net/ipv4/conf/eth0/rp_filter: no such file or directory","level":"error","msg":"Error while initializing daemon","subsys":"daemon"}
{"error":"Failed to sysctl -w net.ipv4.conf.eth0.rp_filter=2: could not open the sysctl file /proc/sys/net/ipv4/conf/eth0/rp_filter: open /proc/sys/net/ipv4/conf/eth0/rp_filter: no such file or directory","level":"fatal","msg":"Error while creating daemon","subsys":"daemon"}
{"error":"Operation cannot be fulfilled on ciliumnodes.cilium.io \"ip-10-6-11-13.eu-west-1.compute.internal\": the object has been modified; please apply your changes to the latest version and try again","level":"warning","msg":"Unable to update CiliumNode custom resource","subsys":"ipam"}
{"level":"info","msg":"regenerating all endpoints","reason":"one or more identities created or deleted","subsys":"endpoint-manager"}
{"level":"info","msg":"regenerating all endpoints","reason":"one or more identities created or deleted","subsys":"endpoint-manager"}
@mvisonneau Is there an eth0 interface on those nodes? If not, then I think this is a separate bug (regression) in v1.8.3 on Ubuntu in EKS environments. If there is an eth0 then it may still be this issue.
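One quick way to answer that on a node is to test for the exact path named in the error message. This is a minimal sketch with a hypothetical helper name; the interface names in the usage example are only examples.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the rp_filter knob exists for an interface.
# The optional second argument overrides the sysctl root (useful for testing).
has_rp_filter_knob() {
    iface=$1
    sysctl_root=${2:-/proc/sys}
    [ -e "$sysctl_root/net/ipv4/conf/$iface/rp_filter" ]
}

# Example usage:
#   for iface in eth0 ens5; do
#       if has_rp_filter_knob "$iface"; then
#           echo "$iface: present"
#       else
#           echo "$iface: missing"
#       fi
#   done
```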
👋 @joestringer, indeed my instances are based on the AWS Nitro System, which gives me network interfaces named in the ens[0-9]+ format.
interfaces list
~$ netstat -i
Kernel Interface table
Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
cilium_h 9001 0 0 0 0 0 0 0 0 BMORU
cilium_n 9001 0 0 0 0 0 0 0 0 BMORU
docker0 1500 0 0 0 0 0 0 0 0 BMU
ens5 9001 77873091 0 0 0 114673441 0 0 0 BMRU
ens6 9001 605028990 0 74 0 495146454 0 0 0 BMRU
ens7 9001 64204300 0 0 0 59610683 0 0 0 BMRU
lo 65536 135179116 0 0 0 135179116 0 0 0 LRU
lxc84712 9001 5109046 0 0 0 5763573 0 0 0 BMRU
lxc053a0 9001 1566665 0 0 0 1104132 0 0 0 BMRU
lxc17cb3 9001 280988 0 0 0 251064 0 0 0 BMRU
lxc225b5 9001 6816453 0 0 0 5947230 0 0 0 BMRU
lxc25c93 9001 3330495 0 0 0 3939436 0 0 0 BMRU
lxc2ae26 9001 116270 0 0 0 141742 0 0 0 BMRU
lxc3b24a 9001 205922198 0 0 0 263466611 0 0 0 BMRU
lxc3d661 9001 4308719 0 0 0 6791656 0 0 0 BMRU
lxc45541 9001 2294528 0 0 0 2473290 0 0 0 BMRU
lxc49be6 9001 3975092 0 0 0 2414246 0 0 0 BMRU
lxc5f6ec 9001 148957 0 0 0 148039 0 0 0 BMRU
lxc676a3 9001 1057937 0 0 0 645921 0 0 0 BMRU
lxc67811 9001 236109688 0 0 0 206980719 0 0 0 BMRU
lxc6b001 9001 169636 0 0 0 168709 0 0 0 BMRU
lxc75d86 9001 19918743 0 0 0 16686086 0 0 0 BMRU
lxc7d310 9001 70644814 0 0 0 47580110 0 0 0 BMRU
lxc_heal 9001 1628264 0 0 0 1932101 0 0 0 BMRU
lxcb9ade 9001 2623227 0 0 0 3048972 0 0 0 BMRU
lxcc56bb 9001 15626813 0 0 0 21041815 0 0 0 BMRU
lxccd212 9001 234663 0 0 0 305566 0 0 0 BMRU
lxcd656e 9001 145880 0 0 0 145195 0 0 0 BMRU
lxcd7296 9001 5129261 0 0 0 4645878 0 0 0 BMRU
lxcdecd9 9001 2764864 0 0 0 3648134 0 0 0 BMRU
lxce1a74 9001 32706 0 0 0 32298 0 0 0 BMRU
lxcef306 9001 21973644 0 0 0 21831440 0 0 0 BMRU
lxcf535c 9001 24108700 0 0 0 27435746 0 0 0 BMRU
@mvisonneau OK great, would you mind filing a separate bug for that to help track fixing the regression? The output from your last couple of comments on this thread would be a great start for such a bug.
On Ubuntu hosted by Hetzner, I no longer have any config added by Hetzner itself, but I had the same issue with systemd. So I added the sysctl configuration below and got Cilium 1.8.3 working.
ubuntu version 20.04.1
kernel 5.8.10-050810-generic
docker 19.3.13
kubernetes 1.19.2
systemd 245 (245.4-4ubuntu3.2)
net.ipv4.conf.lxc*.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
A good workaround for this is to enable endpoint routes (--enable-endpoint-routes), which enforces symmetric routing. https://github.com/cilium/cilium/pull/13346 is going to fix endpoint routes in combination with tunneling.
enable-endpoint-routes: "true" with the default systemd 245 (now 246) rp_filter works great using native routing.
Followup items:
@errordeveloper mentioned he can share some systemd configuration he used to mitigate this issue.
If we want to enable endpoint routes mode by default, we will also need to resolve #13121.
I have encountered this on OpenShift, which uses CoreOS.
I can confirm that the following two solutions worked well.
Either write /etc/sysctl.d/99-override_cilium_rp_filter.conf with the following contents:
net.ipv4.conf.lxc*.rp_filter = 0
net.ipv4.conf.cilium_*.rp_filter = 0
Or use enable-endpoint-routes: "true". However, if you are using tunnelling mode, you will need either Cilium 1.8.5 or 1.9.0, neither of which is released yet, although both are due to be released (see https://github.com/cilium/cilium/pull/13346).
@joestringer https://github.com/cilium/cilium/issues/10645#issuecomment-601451909
Is there context from upstream systemd release around this change?
https://github.com/systemd/systemd/commit/5d4fc0e665a3639f92ac880896c56f9533441307#diff-7816eed8ca6324f23a690cc5f58e6bf7 which resolved issue https://github.com/systemd/systemd/issues/6282
Using Flatcar 2605 and above, the minimal fix works: echo 'net.ipv4.conf.lxc*.rp_filter = 0' | sudo tee -a /etc/sysctl.d/90-override.conf && sudo systemctl start systemd-sysctl. But if the node is rebooted and the pod gets a new IP address on the same node, it stops working.
@jaysiyani that sounds like you need to find the correct way to persist configuration on Flatcar, I can have one very concrete example in the docs that you might want to try.