flannel cross node traffic does not work with latest systemd 242 due to a race

Created on 3 Jul 2019 · 3 Comments · Source: coreos/flannel

Expected Behavior

Cross node pod traffic should work, node to pod traffic should work across nodes.

Current Behavior

When running flannel with systemd 242 or later, there appears to be a race between flannel setting the MAC address of the flannel.1 interface and systemd setting the MAC address on the same virtual interface. As a result, all cross-node traffic is dropped at layer 2 on the destination node, because remote nodes send to a VTEP MAC address that no longer matches the interface.

With systemd 242, the default policy is MACAddressPolicy=persistent:

/usr/lib/systemd/network/99-default.link

[Link]
NamePolicy=keep kernel database onboard slot path
MACAddressPolicy=persistent

When flannel brings up the interface it programs the MAC address, and systemd then reprograms it.

On the affected node, flannel.1 ends up with the following MAC address:

clear@clr-02 ~ $ ip addr show flannel.1
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether d6:02:e3:df:ea:7a brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever

But the ARP tables on remote nodes hold a different MAC address for this VTEP (5e:89:db:49:c6:a4 instead of d6:02:e3:df:ea:7a):

clear@clr-01 ~ $ ip neigh
10.244.1.0 dev flannel.1 lladdr 5e:89:db:49:c6:a4 PERMANENT
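The comparison above can be scripted. The following is a minimal sketch, not part of flannel: the function names `link_mac`, `neigh_mac` and `check_vtep_mac` are made up for illustration, and the parsing assumes the `ip addr show` and `ip neigh` output formats shown in this report.

```shell
#!/bin/sh
# Hypothetical helpers (not part of flannel or iproute2).

# Print the MAC address found in `ip addr show <dev>` output read from stdin.
link_mac() {
    awk '$1 == "link/ether" { print $2; exit }'
}

# Print the lladdr recorded for IP address $1 in `ip neigh` output read from stdin.
neigh_mac() {
    awk -v ip="$1" '$1 == ip {
        for (i = 2; i < NF; i++)
            if ($i == "lladdr") { print $(i + 1); exit }
    }'
}

# Compare the MAC on the interface ($1) with the MAC a remote node has
# recorded for it ($2). Returns 0 on a match and 1 on a mismatch.
check_vtep_mac() {
    if [ "$1" = "$2" ]; then
        echo "OK: VTEP MAC $1 matches the remote neighbour entry"
    else
        echo "MISMATCH: interface has $1, remote neighbour entry has $2"
        return 1
    fi
}
```

For example, run `ip addr show flannel.1 | link_mac` on the destination node and `ip neigh | neigh_mac 10.244.1.0` on a remote node, then pass both results to `check_vtep_mac`.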

The netlink trace shows the MAC address being set twice: first by flannel, then to a different address by systemd, based on its default policy.

clear@clr-02 ~ $ sudo ip monitor all
[NETCONF]inet flannel.1 forwarding on rp_filter off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
[NETCONF]inet6 flannel.1 forwarding off proxy_neigh off ignore_routes_with_linkdown off
[LINK]4: flannel.1: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default
    link/ether 5e:89:db:49:c6:a4 brd ff:ff:ff:ff:ff:ff
[LINK]4: flannel.1: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default
    link/ether d6:02:e3:df:ea:7a brd ff:ff:ff:ff:ff:ff
[ADDR]4: flannel.1    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever

Possible Solution

  • The user can set MACAddressPolicy=none for the flannel* interfaces on each system, which works around the issue but requires node-level changes
    or
  • Flannel can watch for MAC address changes on the link and reprogram it
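To illustrate the second option from outside flannel, here is a rough shell sketch that reports MAC address changes seen in `ip monitor` output. It is not how flannel itself would implement this (flannel is written in Go and would listen on netlink directly), and the function name `detect_mac_changes` is made up for this example. It assumes the output format shown in the trace above.

```shell
#!/bin/sh
# Hypothetical helper: read `ip monitor link` (or `ip monitor all`) output on
# stdin and print one line each time the MAC address of device $1 changes.
detect_mac_changes() {
    awk -v dev="$1" '
        # Link header, e.g. "[LINK]4: flannel.1: <FLAGS> ..." (the "[LINK]"
        # label is only printed by "ip monitor all").
        /^(\[[A-Z]+\])?[0-9]+: / {
            name = $2
            sub(/:$/, "", name)     # drop the trailing colon
            sub(/@.*/, "", name)    # drop any "@parent" suffix
            current = name
            next
        }
        # Any other non-indented line starts an unrelated event.
        /^[^ \t]/ { current = "" }
        # Indented "link/ether" line belonging to the last link header.
        $1 == "link/ether" && current == dev {
            if (last != "" && $2 != last)
                print "MAC of " dev " changed: " last " -> " $2
            last = $2
        }
    '
}
```

On a node, this could be run as `sudo ip monitor link | detect_mac_changes flannel.1`. The sketch only reports the change; what to do in response (for example, republishing the new VTEP MAC to the other nodes) is left open here.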

Steps to Reproduce (for bugs)

  1. kubeadm init
  2. kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml
  3. Ping pod on remote node

Context

Flannel and any flannel-based network plugins (such as Canal) stop working with systemd 242.
This will affect other distributions when they upgrade to systemd 242 or later.

Your Environment

All 3 comments

Here's a quick documentation of the workaround (at least this worked in my lab):
~~~
cat<<'EOF'>/etc/systemd/network/10-flannel.link
[Match]
OriginalName=flannel*

[Link]
MACAddressPolicy=none
EOF
~~~
After this, I rebooted my controllers and workers and flannel's overlay worked.
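Since the workaround has to be applied on every node, it may help to wrap it in an idempotent function. This is only a sketch: `install_flannel_link` is a made-up name, the target directory is a parameter so the function can be tried against a scratch directory first, and the file content is exactly the one from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: write the 10-flannel.link workaround into directory $1
# (default /etc/systemd/network) and leave the file alone if it is already
# up to date.
install_flannel_link() {
    dir="${1:-/etc/systemd/network}"
    file="$dir/10-flannel.link"
    tmp=$(mktemp) || return 1
    cat >"$tmp" <<'EOF'
[Match]
OriginalName=flannel*

[Link]
MACAddressPolicy=none
EOF
    if [ -f "$file" ] && cmp -s "$tmp" "$file"; then
        rm -f "$tmp"
        echo "unchanged: $file"
    else
        # mktemp creates the file with mode 0600, so relax it after the move.
        mkdir -p "$dir" && mv "$tmp" "$file" && chmod 0644 "$file" &&
            echo "written: $file"
    fi
}
```

Calling `install_flannel_link` as root with no argument writes to /etc/systemd/network. The new policy is presumably only picked up when the flannel.1 link is set up again, which is consistent with the reboot described above.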

Looking at the cross-references here, I think more people are stepping on this. Perhaps it would be worth carrying the link unit in this repo, so that it's easier for people to notice and install it.

Just got bit by this issue, spent several hours trying to understand why a single node can't communicate with others. At least until flanneld is killed and then it suddenly works.

Tnx for reporting this issue in detail @mcastelino!

Yeah, many will be bitten and pull hair over this...
