Cilium: Cilium needs ip6tables rules to route IPv6 packets

Created on 29 Nov 2018 · 19 Comments · Source: cilium/cilium

General Information

Cilium version: 1.3.0
Kernel version: 4.19.5
Operating System: CentOS 7.5
Orchestration system: Kubernetes 1.12.2

Summary

On dual-stack (IPv4/IPv6) hosts, Cilium does not properly route packets for pods that initiate connections to IPv6 addresses. Recursive DNS queries to nameservers reached via their AAAA addresses, for example, are lost after Cilium hands the request over to the host OS network stack. Running cilium monitor in this situation shows:

<- endpoint 8330 flow 0x116bfb1d identity 52799->0 state new ifindex 0: [f00d::a2a:4100:0:208a]:37470 -> [2600:9000:5303:e800::1]:53 udp
-> stack flow 0x116bfb1d identity 52799->2 state new ifindex 0: [f00d::a2a:4100:0:208a]:37470 -> [2600:9000:5303:e800::1]:53 udp

No packets are returned and the request times out on the pod, while the host running the pod can reach the same IPv6 address via dig, curl, and ping.

IPv6 forwarding is enabled on the host.

$ sudo sysctl -a | grep net.ipv6.conf | grep forward

net.ipv6.conf.all.forwarding = 1
net.ipv6.conf.cilium_health.forwarding = 1
net.ipv6.conf.cilium_host.forwarding = 1
net.ipv6.conf.cilium_net.forwarding = 1
net.ipv6.conf.cilium_vxlan.forwarding = 1
net.ipv6.conf.default.forwarding = 1
net.ipv6.conf.docker0.forwarding = 1
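
A quick way to double-check the forwarding flags without reading the sysctl output by eye is sketched below. It is only an illustration: the function name list_non_forwarding and its optional directory argument are assumptions made for this example, not part of Cilium or sysctl. It reads the same values that sysctl reports, from /proc/sys/net/ipv6/conf.

```shell
#!/bin/sh
# Sketch: print every interface whose IPv6 forwarding flag is not 1.
# list_non_forwarding is a hypothetical helper name. The optional first
# argument overrides the directory to scan (useful for testing); it
# defaults to the live kernel settings.
list_non_forwarding() {
    conf_root="${1:-/proc/sys/net/ipv6/conf}"
    for f in "$conf_root"/*/forwarding; do
        # Skip unreadable entries (and the unexpanded glob if nothing matches).
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" != "1" ]; then
            # The parent directory name is the interface name.
            basename "$(dirname "$f")"
        fi
    done
}
```

Calling list_non_forwarding with no argument inspects the running host. On the host described in this report it would be expected to print nothing for the interfaces listed above, since they all report forwarding = 1.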

However, Cilium does not yet install ip6tables rules to route traffic to pods through IPv6 interfaces. The IPv6 ruleset contains no Cilium entries, while the IPv4 ruleset does:

$ sudo ip6tables-save |grep cilium

$ sudo iptables-save |grep cilium
-A POSTROUTING -m comment --comment "cilium-feeder: CILIUM_POST" -j CILIUM_POST
-A CILIUM_POST ! -s 10.42.66.1/32 ! -d 10.42.66.0/24 -o cilium_host -m comment --comment "cilium host->cluster masquerade" -j SNAT --to-source 10.42.66.1
-A CILIUM_POST -s 10.42.66.0/24 -o cilium_host -m comment --comment "cilium hostport loopback masquerade" -j SNAT --to-source 10.42.66.1
-A CILIUM_POST -s 10.42.66.0/24 ! -d 10.42.66.0/24 ! -o cilium_+ -m comment --comment "cilium masquerade non-cluster" -j MASQUERADE
-A POSTROUTING -m comment --comment "cilium-feeder: CILIUM_POST_mangle" -j CILIUM_POST_mangle
-A CILIUM_POST_mangle -o cilium_host -m comment --comment "cilium: clear masq bit for pkts to cilium_host" -j MARK --set-xmark 0x0/0x4000
-A FORWARD -m comment --comment "cilium-feeder: CILIUM_FORWARD" -j CILIUM_FORWARD
-A OUTPUT -m comment --comment "cilium-feeder: CILIUM_OUTPUT" -j CILIUM_OUTPUT
-A CILIUM_FORWARD -o cilium_host -m comment --comment "cilium: any->cluster on cilium_host forward accept" -j ACCEPT
-A CILIUM_FORWARD -s 10.42.66.0/24 -m comment --comment "cilium: cluster->any forward accept" -j ACCEPT
-A CILIUM_OUTPUT -m mark ! --mark 0xa00/0xe00 -m comment --comment "cilium: host->any mark as from host" -j MARK --set-xmark 0xc00/0xf00

How to reproduce the issue

  1. Enable IPv4 and IPv6 networking on the host
  2. Deploy Cilium on Kubernetes
  3. Start an interactive shell in a pod and try to get a response from either of these commands:
dig AAAA google.com @2606:4700:4700::1111

or

curl -v http://[2607:f8b0:400a:804::200e]

All 19 comments

Temporary Workaround

After discussing this issue with @tgraf and @aanm on Slack, the following temporary workaround was established.

  1. Install an ip6tables MASQUERADE rule for IPv6 traffic leaving the node.
ip6tables -t nat -A POSTROUTING ! -o cilium_+ -s f00d::/16 -j MASQUERADE
  2. On CentOS/RHEL/Fedora systems, remove the default REJECT rules on the INPUT and FORWARD chains.
ip6tables -D INPUT -j REJECT --reject-with icmp6-adm-prohibited
ip6tables -D FORWARD -j REJECT --reject-with icmp6-adm-prohibited
  3. On CentOS/RHEL/Fedora systems, firewalld enables an IPv6 reverse path filter check by default through the 'IPv6_rpfilter=yes' property in /etc/firewalld/firewalld.conf. The check drops packets whose source address would not be routed back out through the interface they arrived on, so it needs to be disabled with 'IPv6_rpfilter=no' to allow asymmetric routing.

A more comprehensive ruleset similar to IPv4 is still needed.
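
The firewalld step above is the only one without a command, so here is a minimal sketch of how the property could be flipped from a script. The helper name set_ipv6_rpfilter_no and its optional path argument are assumptions made for this example. The file path and property name are the ones given in the workaround.

```shell
#!/bin/sh
# Sketch: set IPv6_rpfilter=no in a firewalld.conf-style file.
# set_ipv6_rpfilter_no is a hypothetical helper name. The optional first
# argument overrides the file to edit and defaults to the path named in
# the workaround.
set_ipv6_rpfilter_no() {
    conf="${1:-/etc/firewalld/firewalld.conf}"
    if grep -q '^IPv6_rpfilter=' "$conf"; then
        # Rewrite the existing line in place, keeping a .bak copy.
        sed -i.bak 's/^IPv6_rpfilter=.*/IPv6_rpfilter=no/' "$conf"
    else
        # Property not present yet: append it.
        printf 'IPv6_rpfilter=no\n' >> "$conf"
    fi
}
```

This would need to run as root on the real file, and firewalld would then have to be restarted (for example with systemctl restart firewalld) before the new setting is picked up.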

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

still alive

[root@nrp-01 ~]# ip6tables -D INPUT -j REJECT --reject-with icmp6-adm-prohibited
ip6tables: No chain/target/match by that name.
[root@nrp-01 ~]# ip6tables -D FORWARD -j REJECT --reject-with icmp6-adm-prohibited
ip6tables: No chain/target/match by that name.

Any updates on this? Trying cilium first time, and was hoping to get ipv6 working..

Is this still broken in the new release?

Any estimate on fixing this?

[Unit]
Description=IPv6 Hack for Cilium

[Service]
Type=oneshot
ExecStart=/usr/sbin/ip6tables -t nat -A POSTROUTING ! -o cilium_+ -s f00d::/16 -j MASQUERADE

[Install]
WantedBy=network-online.target

As a follow-up to @paolodedios's suggested workaround, the systemd unit above can be deployed on CoreOS.

Still not fixed as of 1.6.5. I had to add the rule on every node running Cilium to get access to the cluster.

In the code, the POSTROUTING rules are only installed through iptables, not ip6tables:

https://github.com/cilium/cilium/blob/master/pkg/datapath/iptables/iptables.go#L830-L852

Looks like IPv6 masquerade still broken on 1.8.0 even when using BPF masquerade mode.

Looks like IPv6 masquerade still broken on 1.8.0 even when using BPF masquerade mode.

The upcoming v1.8.0 won't support BPF masquerading for IPv6. The motivation for not adding it is that we assume that pod IPv6 addrs are globally routable.

That's unfortunate. I have never seen a single Kubernetes installation document that advises using globally routable addresses for podCIDR.
And it makes a simple deployment far more complex, as you are now required to use BGP or something else to advertise each node's podCIDR to the upstream routers.

@Jean-Daniel Then what's the motivation to use IPv6 for pods? Underlying network?

Being able to talk to IPV6-only sites outside of k8s

@Jean-Daniel Then what's the motivation to use IPv6 for pods? Underlying network?

Maybe I'm missing something, but I have a couple of pods I want to expose through IPv4 and IPv6 services (this is a dual-stack Kubernetes). My Services have globally routable addresses that are advertised to the router using MetalLB.

Everything is working fine, but if I understand correctly, IPv6 services require that the underlying pods also have an IPv6 address. Of course, as my pods are dual stack, they can use IPv4 to access external networks when needed. I'm just trying to understand why something as convenient as masquerading pod addresses should be reserved for IPv4. If there is a use case for IPv4, why doesn't it apply to IPv6 too (especially as most if not all Kubernetes install instructions recommend using private addresses for both IPv4 and IPv6 pod CIDRs)?

Being able to talk to IPV6-only sites outside of k8s

@dimm0 Have you tried using NAT46?

I have some dual-stack nodes in my cluster, with routable IPV4 and IPV6 IPs. When I deploy an ingress on those nodes, I can serve my apps to IPV4 clients without doing anything additional. But I can not do that to IPV6 clients. In my understanding this kind of IPV6 support is broken.

No, I haven't tried NAT46

Also pods on those nodes can not access external IPV6 addresses

The reason this has not been implemented yet is not because the idea or feature is being rejected. The work has simply not been done by anyone. If you want this functionality, open a PR to add the functionality.

If you don't want to add it natively to Cilium, instructing nodes to perform masquerading is not that hard. Simply add a rule like this to each of your nodes:

ip6tables -t nat -A POSTROUTING ! -o cilium_+ -s f00d::/16 -j MASQUERADE

This can be done in a script, an init container, with a DaemonSet, among many other ways.
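
As a rough illustration of the script option, the sketch below adds the rule only when it is not already present, so it can be re-run safely (for instance from a loop in a DaemonSet container). The function name ensure_masquerade and the IP6TABLES and POD_CIDR variables are assumptions made for this example. The rule itself is the one quoted above, and -C is the standard ip6tables check option.

```shell
#!/bin/sh
# Sketch: add the IPv6 masquerade rule only if it is not already installed.
# IP6TABLES is a hypothetical override for the binary (handy for dry runs);
# POD_CIDR defaults to the f00d::/16 prefix used in this thread.
IP6TABLES="${IP6TABLES:-ip6tables}"
POD_CIDR="${POD_CIDR:-f00d::/16}"

ensure_masquerade() {
    # -C exits 0 when an identical rule already exists, non-zero otherwise.
    if ! $IP6TABLES -t nat -C POSTROUTING ! -o cilium_+ -s "$POD_CIDR" -j MASQUERADE 2>/dev/null; then
        # Rule is missing: append it to the nat POSTROUTING chain.
        $IP6TABLES -t nat -A POSTROUTING ! -o cilium_+ -s "$POD_CIDR" -j MASQUERADE
    fi
}
```

Calling ensure_masquerade requires root (or the NET_ADMIN capability) in the host network namespace. The check-then-append pattern avoids stacking duplicate copies of the rule each time the script runs, which a plain -A would do.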

If you want this functionality, open a PR to add the functionality.

Adding a single iptables rule next to the existing ones looks like five minutes of work for somebody who already has the dev environment and testing set up, and I don't understand why this critical part of IPv6 support has not been implemented for a year and a half. For me, figuring out the whole dev environment and studying the code to find where to add it will take far more time than actually making the change.
