Using kubeadm and Kubernetes 1.8.3 to provision a 3-node cluster on HypriotOS (ARM), I can initialize the master fine, but when joining nodes into the cluster, kube-flannel crashes on each node with:
Error registering network: failed to configure interface flannel.1: link has incompatible addresses. Remove additional addresses and try again. &{{%!s(int=5) %!s(int=1450) %!s(int=0) flannel.1 76:dd:ec:c0:e6:86 up|broadcast|multicast %!s(uint32=69699) %!s(int=0) %!s(int=0) <nil> %!s(*netlink.LinkStatistics=&{272198 273028 25973338 108272828 0 0 0 45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0}) %!s(int=0) %!s(*netlink.LinkXdp=<nil>) ether <nil> unknown} %!s(int=1) %!s(int=3) 192.168.0.14 <nil> %!s(int=0) %!s(int=0) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(int=300) %!s(int=0) %!s(int=8472) %!s(int=0) %!s(int=0)}
When I run ip addr, I can see that flannel.1 has been given two different IP addresses, a /24 apart from each other:
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 76:dd:ec:c0:e6:86 brd ff:ff:ff:ff:ff:ff
inet 10.244.2.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet 10.244.1.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::74dd:ecff:fec0:e686/64 scope link
valid_lft forever preferred_lft forever
I haven't manually added these addresses; they appear automatically when flannel starts. kube-flannel enters CrashLoopBackOff and never recovers from the error. Is this a bug, or something I need to update on my host OS? How do I remove the second address, if that is a suitable temporary workaround?
The flannel manifest I have applied is https://raw.githubusercontent.com/coreos/flannel/v0.9.0/Documentation/kube-flannel.yml (with sed s/amd64/arm/g applied).
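For reference, a minimal sketch of removing just the second address as a stopgap, assuming (a guess, not confirmed by the error) that 10.244.1.0/32 is the stale entry and 10.244.2.0/32 is the node's current lease:
$ ip -4 addr show dev flannel.1                             # confirm both addresses are present
$ sudo ip addr del 10.244.1.0/32 dev flannel.1              # drop only the assumed-stale address
$ kubectl -n kube-system delete pod kube-flannel-ds-tgs85   # let the DaemonSet restart flannel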
The entire log leading up to the error is:
$ kubectl -n kube-system logs kube-flannel-ds-tgs85
I1120 04:39:40.815542 1 main.go:470] Determining IP address of default interface
I1120 04:39:40.817983 1 main.go:483] Using interface with name wlan0 and address 192.168.0.14
I1120 04:39:40.818240 1 main.go:500] Defaulting external address to interface address (192.168.0.14)
I1120 04:39:41.137713 1 kube.go:130] Waiting 10m0s for node controller to sync
I1120 04:39:41.137945 1 kube.go:283] Starting kube subnet manager
I1120 04:39:42.138167 1 kube.go:137] Node controller sync successful
I1120 04:39:42.138306 1 main.go:235] Created subnet manager: Kubernetes Subnet Manager - jasmine
I1120 04:39:42.138370 1 main.go:238] Installing signal handlers
I1120 04:39:42.138873 1 main.go:348] Found network config - Backend type: vxlan
I1120 04:39:42.139127 1 vxlan.go:119] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
E1120 04:39:42.141224 1 main.go:280] Error registering network: failed to configure interface flannel.1: link has incompatible addresses. Remove additional addresses and try again. &{{%!s(int=5) %!s(int=1450) %!s(int=0) flannel.1 76:dd:ec:c0:e6:86 up|broadcast|multicast %!s(uint32=69699) %!s(int=0) %!s(int=0) <nil> %!s(*netlink.LinkStatistics=&{272198 273028 25973338 108272828 0 0 0 45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0}) %!s(int=0) %!s(*netlink.LinkXdp=<nil>) ether <nil> unknown} %!s(int=1) %!s(int=3) 192.168.0.14 <nil> %!s(int=0) %!s(int=0) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(bool=false) %!s(int=300) %!s(int=0) %!s(int=8472) %!s(int=0) %!s(int=0)}
I1120 04:39:42.141346 1 main.go:328] Stopping shutdownHandler...
kube-flannel should start properly, not only on the master but on all additional nodes that join the cluster, and flannel.1 should automatically be allocated a single IPv4 address rather than two. Instead, kube-flannel starts correctly on the master, but on additional nodes that join the cluster flannel.1 ends up with two IPv4 addresses, which causes kube-flannel to enter a crash loop.
Steps to reproduce (see the command sketch after this list):
1. Initialize a master node on HypriotOS with kubeadm init.
2. Join a secondary node on HypriotOS with kubeadm join.
3. Observe that kube-flannel runs fine on the master but crashes on the worker node.
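A sketch of those steps as commands, assuming the stock kubeadm flow; the manifest URL and sed substitution are from this report, while the pod CIDR, token, and hash placeholders are illustrative:
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16          # on the master
$ curl -sSL https://raw.githubusercontent.com/coreos/flannel/v0.9.0/Documentation/kube-flannel.yml | sed s/amd64/arm/g | kubectl apply -f -
$ sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>   # on each worker
$ kubectl -n kube-system get pods -o wide -w                  # the worker's kube-flannel pod enters CrashLoopBackOff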
This is preventing me from running Kubernetes right now. I previously had a working k8s 1.7 cluster with Flannel 0.7; the issue has only started occurring since upgrading Kubernetes and Flannel.
Thoughts: is it possible that flannel allocates a single IP successfully, but believes the allocation failed and retries, resulting in the two addresses?
Update: I fixed this by just running ip link delete flannel.1 on the affected nodes. I'm not sure how it started, but flannel was then able to recreate the interface and work.
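Spelled out, that workaround looks roughly like this (the pod name is illustrative; on its next start flannel recreates flannel.1 with a single address from the current lease):
$ sudo ip link delete flannel.1                             # remove the interface and both addresses
$ kubectl -n kube-system delete pod kube-flannel-ds-tgs85   # or wait for the crash-looping pod's next restart
$ ip -4 addr show dev flannel.1                             # after restart: a single /32 address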
@d11wtq Interesting, I've not seen this before but certainly let me know if you see this again.
We see the same issue with Flannel 0.7.1.
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:a2:ab:59 brd ff:ff:ff:ff:ff:ff
inet 10.10.14.187/23 brd 10.10.15.255 scope global eth0
valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
link/ether 02:42:ad:b4:c5:7f brd ff:ff:ff:ff:ff:ff
inet 172.30.79.1/24 scope global docker0
valid_lft forever preferred_lft forever
46: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN
link/ether 62:d6:6c:70:2a:94 brd ff:ff:ff:ff:ff:ff
inet 172.30.73.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet 172.30.79.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
[cloud@psd011 ~]$ etcdctl get /k8s/network/subnets/172.30.73.0-24
Error: 100: Key not found (/k8s/network/subnets/172.30.73.0-24) [1959]
[cloud@psd011 ~]$ etcdctl get /k8s/network/subnets/172.30.79.0-24
{"PublicIP":"10.10.14.187","BackendType":"vxlan","BackendData":{"VtepMAC":"76:ef:58:64:0d:91"}}
Your Environment
Flannel version: 0.7.1
Backend used (e.g. vxlan or udp): vxlan
Etcd version: gcr.io/google_containers/etcd-arm:3.0.17 (?)
Kubernetes version (if used): 1.9.3
Operating System and version: CentOS 7.3
The same issue occurs with Flannel v0.10.0.
14: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether a6:62:08:a6:c2:f0 brd ff:ff:ff:ff:ff:ff
inet 10.244.1.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet 169.254.46.120/16 brd 169.254.255.255 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::a462:8ff:fea6:c2f0/64 scope link
valid_lft forever preferred_lft forever
Flannel goes into CrashLoopBackOff. Deleting the flannel.1 link does solve the issue.
OS: Raspbian Stretch Lite
Kubernetes version: 1.10.2
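Note that the second address above is in 169.254.0.0/16, i.e. an IPv4 link-local (IPv4LL) address, which suggests a DHCP client on the host assigned it to flannel.1. On Raspbian, one possible preventive fix, assuming dhcpcd is the client in use (an assumption, not confirmed here), is to exclude flannel's interfaces in /etc/dhcpcd.conf and restart the service:
# /etc/dhcpcd.conf -- assumed fix: keep dhcpcd away from CNI-managed interfaces
denyinterfaces flannel.1 cni0
$ sudo systemctl restart dhcpcd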
I get the same problem.
Same under K8s 1.12.0, flannel v0.10.0; however, sudo ip link delete flannel.1 did allow it to come up without error. (Hypriot 1.9.0 on ARM - Raspberry Pi 3 B+)
Check your etcd data and the local env file; flanneld will pick up the previous IP on startup.
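For the etcd-backed setup above, a sketch of that check; /run/flannel/subnet.env is flannel's default SubnetFile location, and the /k8s/network prefix matches the earlier etcdctl commands:
$ cat /run/flannel/subnet.env                        # the subnet flannel last wrote locally
$ etcdctl ls /k8s/network/subnets                    # leases currently registered in etcd
$ etcdctl get /k8s/network/subnets/172.30.79.0-24    # compare the lease's PublicIP with this node's address
If the local file names a subnet that no longer has a lease in etcd, that would explain flannel re-adding the old address alongside the new one.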
+1, same setup as @wroney688: k8s 1.12.0.
In my case, all the flannel pods initially come up successfully, but after ~3 days one flannel pod gets stuck in CrashLoopBackOff with the following error (the other 4 workers are fine):
kubectl -n kube-system logs kube-flannel-ds-arm-fcm6r
I1019 03:16:15.097562 1 main.go:475] Determining IP address of default interface
I1019 03:16:15.099403 1 main.go:488] Using interface with name eth0 and address 192.168.1.246
I1019 03:16:15.099582 1 main.go:505] Defaulting external address to interface address (192.168.1.246)
I1019 03:16:15.734392 1 kube.go:131] Waiting 10m0s for node controller to sync
I1019 03:16:15.794863 1 kube.go:294] Starting kube subnet manager
I1019 03:16:16.995022 1 kube.go:138] Node controller sync successful
I1019 03:16:16.995104 1 main.go:235] Created subnet manager: Kubernetes Subnet Manager - k8s-worker4
I1019 03:16:16.995131 1 main.go:238] Installing signal handlers
I1019 03:16:16.995337 1 main.go:353] Found network config - Backend type: vxlan
I1019 03:16:16.995492 1 vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
E1019 03:16:16.996653 1 main.go:280] Error registering network: failed to configure interface flannel.1: failed to ensure address of interface flannel.1: link has incompatible addresses. Remove additional addresses and try again. &netlink.Vxlan{LinkAttrs:netlink.LinkAttrs{Index:16, MTU:1450, TxQLen:0, Name:"flannel.1", HardwareAddr:net.HardwareAddr{0x96, 0x73, 0x59, 0x4a, 0xa2, 0xd6}, Flags:0x13, RawFlags:0x11043, ParentIndex:0, MasterIndex:0, Namespace:interface {}(nil), Alias:"", Statistics:(*netlink.LinkStatistics)(0x13f340e4), Promisc:0, Xdp:(*netlink.LinkXdp)(0x14027100), EncapType:"ether", Protinfo:(*netlink.Protinfo)(nil), OperState:0x0}, VxlanId:1, VtepDevIndex:2, SrcAddr:net.IP{0xc0, 0xa8, 0x1, 0xf6}, Group:net.IP(nil), TTL:0, TOS:0, Learning:false, Proxy:false, RSC:false, L2miss:false, L3miss:false, UDPCSum:true, NoAge:false, GBP:false, Age:300, Limit:0, Port:8472, PortLow:0, PortHigh:0}
I1019 03:16:16.996768 1 main.go:333] Stopping shutdownHandler...
Here's the interfaces on the worker with the failing flannel pod:
$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether b8:27:eb:fa:0d:e4 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.246/24 brd 192.168.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::a775:2ad:bcca:1d44/64 scope link
valid_lft forever preferred_lft forever
3: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
link/ether b8:27:eb:af:58:b1 brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:fc:53:bc:ad brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
16: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 96:73:59:4a:a2:d6 brd ff:ff:ff:ff:ff:ff
inet 10.244.2.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet 169.254.47.220/16 brd 169.254.255.255 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::697a:866d:c96e:5849/64 scope link
valid_lft forever preferred_lft forever
As with others, running sudo ip link delete flannel.1 resolves it, at least temporarily.
Can we re-open this issue? Are there any logs I can grab next time to help debug?
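In case it recurs, a sketch of state worth capturing before deleting the interface (standard iproute2/kubectl/journalctl commands; the dhcpcd check assumes it is the DHCP client on the node):
$ ip -d link show flannel.1                          # VXLAN parameters flannel validates on startup
$ ip -4 addr show dev flannel.1                      # full address list, including any 169.254.x.x entry
$ kubectl -n kube-system logs kube-flannel-ds-arm-fcm6r --previous   # log from the crashed container
$ journalctl -u dhcpcd --since "3 days ago" | grep -i flannel        # did the DHCP client touch the interface?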