Kubeadm: Issue with inter-host communication in a multi-host k8s cluster with Flannel

Created on 31 Jan 2017 · 5 comments · Source: kubernetes/kubeadm

After installing a Kubernetes cluster with kubeadm on machines running vanilla Ubuntu 16.04 (following the steps in this doc: https://kubernetes.io/docs/getting-started-guides/kubeadm/) and setting up the Flannel network add-on, we are not able to communicate with pods deployed on a different host (e.g. from the master node to a pod on a worker node, or between pods on different worker nodes).

Our deployment steps:

MASTER:

kubeadm init --pod-network-cidr=10.244.0.0/16 --api-advertise-addresses=192.168.56.11
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

WORKER(s):

kubeadm join --token=525be4.d44dce5cc2c64fb3 192.168.56.11

Deployment (on master):

kubectl run nginx --image=nginx --replicas=2
# kubectl get nodes
NAME              STATUS         AGE
ubuntu-server-1   Ready,master   56s
ubuntu-server-2   Ready          13s

# kubectl get pods
NAME                    READY     STATUS    RESTARTS   AGE
kube-flannel-ds-tsqng   2/2       Running   0          23s
kube-flannel-ds-vv4dd   2/2       Running   0          48s
nginx-701339712-b9hkx   1/1       Running   0          15s
nginx-701339712-wlqj8   1/1       Running   0          15s

# kubectl describe pod nginx-701339712-b9hkx
Name:           nginx-701339712-b9hkx
Namespace:      default
Node:           ubuntu-server-2/10.0.2.15
Start Time:     Fri, 27 Jan 2017 23:21:25 +0100
Labels:         pod-template-hash=701339712
                run=nginx
Status:         Running
IP:             10.244.1.5

Ping from the master does not work:

root@ubuntu-server-1:~# ping 10.244.1.5
PING 10.244.1.5 (10.244.1.5) 56(84) bytes of data.
^C
--- 10.244.1.5 ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7021ms
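
A quick way to check whether the encapsulated traffic ever leaves the master is to capture on the host interfaces while the ping runs. A minimal diagnostic sketch, assuming flannel's default VXLAN backend (which uses the Linux kernel's default UDP port 8472):

# capture flannel's VXLAN traffic on the host-only interface while pinging
tcpdump -ni enp0s8 udp port 8472
# if nothing appears, also try the NAT interface; flannel may have bound
# to the default-route interface instead of the one the nodes share
tcpdump -ni enp0s3 udp port 8472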

Ping from the worker where the pod is running works:

root@ubuntu-server-2:~# ping 10.244.1.5
PING 10.244.1.5 (10.244.1.5) 56(84) bytes of data.
64 bytes from 10.244.1.5: icmp_seq=1 ttl=64 time=0.156 ms

(master) List of interfaces & stats. Containers are correctly connected to the cni0 bridge (non-zero counters), but flannel.1 shows no successful traffic, only dropped TX packets (TX-DRP = 14).

root@ubuntu-server-1:/home/rasto# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
cni0       1450 0      1201      0      0 0          1231      0      0      0 BMRU
docker0    1500 0         0      0      0 0             0      0      0      0 BMU
enp0s3     1500 0        74      0      0 0            84      0      0      0 BMRU
enp0s8     1500 0      5154      0      0 0          5575      0      0      0 BMRU
flannel.1  1450 0         0      0      0 0             0      0     14      0 BMRU
lo        65536 0     49230      0      0 0         49230      0      0      0 LRU
veth2bec76c0  1450 0      1020      0      0 0          1061      0      0      0 BMRU
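
The VXLAN device itself can also be inspected with standard iproute2 commands, to see which local address flannel bound it to and whether it has a forwarding entry for the other node (a diagnostic sketch):

ip -d link show flannel.1       # the 'local' field shows the source IP flannel picked
bridge fdb show dev flannel.1   # should contain the other node's VTEP MAC and IP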

(master) Routing table looks good:

# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         10.0.2.2        0.0.0.0         UG    0      0        0 enp0s3
10.0.2.0        *               255.255.255.0   U     0      0        0 enp0s3
10.244.0.0      *               255.255.255.0   U     0      0        0 cni0
10.244.0.0      *               255.255.0.0     U     0      0        0 flannel.1
172.17.0.0      *               255.255.0.0     U     0      0        0 docker0
192.168.56.0    *               255.255.255.0   U     0      0        0 enp0s8
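
Given the /24 on cni0 and the /16 on flannel.1, a packet for the remote pod should select the flannel.1 route; this can be confirmed per destination:

# ask the kernel which route it would pick for the remote pod IP
ip route get 10.244.1.5
# expected output along the lines of: 10.244.1.5 dev flannel.1 src <flannel.1 address>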

(worker) There is no flannel.1 interface at all:

# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
cni0       1450 0        10      0      0 0            12      0      0      0 BMRU
docker0    1500 0         0      0      0 0             0      0      0      0 BMU
enp0s3     1500 0       122      0      0 0           128      0      0      0 BMRU
enp0s8     1500 0      7369      0      0 0          6237      0      0      0 BMRU
lo        65536 0       168      0      0 0           168      0      0      0 LRU
veth3f949122  1450 0         3      0      0 0            21      0      0      0 BMRU
vethff0406c4  1450 0         7      0      0 0            24      0      0      0 BMRU

(not sure whether this is an issue with kubeadm, Flannel, or both)

All 5 comments

BTW, in order to make Flannel work on the worker node, I had to manually add the file /run/flannel/subnet.env there; otherwise I was getting this error:

Error syncing pod, skipping: failed to "SetupNetwork" for "nginx-701339712-wkf7z_default" with SetupNetworkError: "Failed to setup network for pod \"nginx-701339712-wkf7z_default(a5fc14ec-e7fa-11e6-8f28-080027fb0a94)\" using network plugins \"cni\": open /run/flannel/subnet.env: no such file or directory; Skipping pod"

# kubectl describe pods

  FirstSeen LastSeen    Count   From                SubObjectPath   Type        Reason      Message
  --------- --------    -----   ----                -------------   --------    ------      -------
  1m        1m      1   {default-scheduler }                Normal      Scheduled   Successfully assigned nginx-701339712-wkf7z to ubuntu-server-2
  1m        1s      42  {kubelet ubuntu-server-2}           Warning     FailedSync  Error syncing pod, skipping: failed to "SetupNetwork" for "nginx-701339712-wkf7z_default" with SetupNetworkError: "Failed to setup network for pod \"nginx-701339712-wkf7z_default(a5fc14ec-e7fa-11e6-8f28-080027fb0a94)\" using network plugins \"cni\": open /run/flannel/subnet.env: no such file or directory; Skipping pod"

The content I added there:

FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.1.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
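
For anyone repeating this, the file can be written in one step; note that FLANNEL_SUBNET is per node and must match the /24 that flannel allocated to that particular node:

mkdir -p /run/flannel
cat > /run/flannel/subnet.env <<'EOF'
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.1.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF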

Looks like you are using Vagrant. For this to work, you have to set a static route on the nodes to kube-dns via the second interface, i.e. the one carrying the address you used in kubeadm init --api-advertise-addresses=192.168.56.11.

ip route add 10.96.0.0/12 dev enp0s8 in your case.

Also mentioned in #113
Also have a look at the limitations, the Vagrant section, and make sure that the /etc/hosts file gets fixed.
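
To make that route survive reboots, it can be attached to the interface definition. A sketch for Ubuntu 16.04's ifupdown (addresses taken from this thread's setup; adjust per node):

# /etc/network/interfaces (excerpt)
auto enp0s8
iface enp0s8 inet static
    address 192.168.56.11      # 192.168.56.12 on the worker
    netmask 255.255.255.0
    # re-add the service-CIDR route whenever enp0s8 comes up
    post-up ip route add 10.96.0.0/12 dev enp0s8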

Hi, thanks for the suggestions!

I'm not using Vagrant, but I am using VirtualBox VMs. /etc/hosts is fixed:

root@ubuntu-server-1:~# hostname -i
192.168.56.11
root@ubuntu-server-2:~# hostname -i
192.168.56.12

The static route (ip route add 10.96.0.0/12 dev enp0s8) resolved the issue with /run/flannel/subnet.env; flannel now seems to be working correctly on the worker, and I can see that the flannel.1 interface has been created there.

However, the ping from the master to the container on the worker still does not work.

On the master, the traffic seems to go through the flannel.1 interface (non-zero TX counters):

root@ubuntu-server-1:~# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
cni0       1450 0      1576      0      0 0          1608      0      0      0 BMRU
docker0    1500 0         0      0      0 0             0      0      0      0 BMU
enp0s3     1500 0    204062      0      0 0         98753      0      0      0 BMRU
enp0s8     1500 0      2689      0      0 0          2681      0      0      0 BMRU
flannel.1  1450 0        17      0      0 0            17      0      8      0 BMRU
lo        65536 0     49083      0      0 0         49083      0      0      0 LRU
veth8f4d5722  1450 0      1576      0      0 0          1618      0      0      0 BMRU

But on the worker, no traffic goes through the flannel.1 interface (zero RX & TX counters):

root@ubuntu-server-2:~# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
cni0       1450 0         6      0      0 0             8      0      0      0 BMRU
docker0    1500 0         0      0      0 0             0      0      0      0 BMU
enp0s3     1500 0    120120      0      0 0         57754      0      0      0 BMRU
enp0s8     1500 0      2460      0      0 0          1979      0      0      0 BMRU
flannel.1  1450 0         0      0      0 0             0      0      8      0 BMRU
lo        65536 0       164      0      0 0           164      0      0      0 LRU
veth041fb4a1  1450 0         3      0      0 0            20      0      0      0 BMRU
veth5133c37c  1450 0         3      0      0 0            20      0      0      0 BMRU

Note that there are 8 dropped packets in the TX-DRP column of the flannel.1 interface.
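
One thing worth checking here is which IP flannel registered for each node. With --kube-subnet-mgr it records this in node annotations, and if it picked the VirtualBox NAT address (10.0.2.15 on every VM) instead of the host-only one, the VXLAN packets between nodes can never arrive. A sketch (the flannel.alpha.coreos.com/public-ip annotation name is taken from flannel's kube backend):

# dump the flannel annotations on each node
kubectl get node ubuntu-server-1 -o yaml | grep flannel
kubectl get node ubuntu-server-2 -o yaml | grep flannel
# flannel.alpha.coreos.com/public-ip should be the 192.168.56.x address,
# not the NAT address 10.0.2.15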

This seems to be the issue: https://github.com/coreos/flannel/issues/535

The solution is to specify the interface on which flanneld should run (e.g. "--iface=enp0s8" in my case) in kube-flannel.yml:

root@ubuntu-server-1:~# diff kube-flannel.yml kube-flannel-mod.yml 
52c52
<         command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr" ]
---
>         command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]

Plus the static route mentioned above (e.g. ip route add 10.96.0.0/12 dev enp0s8 in my case).
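
For completeness, the whole fix can be applied like this (a sketch; the label selector and namespace match the coreos manifest of that era, adjust if yours differ):

# fetch the manifest, then edit the flanneld command on line 52 to read:
#   command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]
curl -sLO https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml
# recreate the daemonset pods so they come back with the new flag
kubectl delete pods -n kube-system -l app=flannel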

I had the same problem too. It was resolved after adding --iface=eth1 (eth1 in my case) to kube-flannel.yml, together with a static route.

Thank you.
