_From @vganapathy1 on October 26, 2016 6:50_
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
Kubernetes version (use kubectl version):
```
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:48:38Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.1", GitCommit:"33cf7b9acbb2cb7c9c72a10d6636321fb180b159", GitTreeState:"clean", BuildDate:"2016-10-10T18:13:36Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
```
```
root@Ubuntu14041-mars08:/# kubeadm version
kubeadm version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.1.409+714f816a349e79", GitCommit:"714f816a349e7978bc93b35c67ce7b9851e53a6f", GitTreeState:"clean", BuildDate:"2016-10-17T13:01:29Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
```
Environment:
VMware vCloud Air VM
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
Kernel (uname -a): 4.4.0-43-generic

What happened:
We used kubeadm and the procedure in "Installing Kubernetes on Linux with kubeadm", and for the most part the installation went well.
The weave-kube installation failed with a peer name collision:
```
INFO: 2016/10/26 05:24:32.585405 ->[10.63.33.46:6783] attempting connection
INFO: 2016/10/26 05:24:32.587778 ->[10.63.33.46:6783|72:47:96:69:16:bb(Ubuntu14041-mars09)]: connection shutting down due to error: local "72:47:96:69:16:bb(Ubuntu14041-mars09)" and remote "72:47:96:69:16:bb(Ubuntu14041-mars08)" peer names collision
```
What you expected to happen:
The weave-kube installation should have succeeded and brought kube-dns up.
How to reproduce it (as minimally and precisely as possible):
On the master node:
```
kubeadm init --api-advertise-addresses=$IP
```
On the node:
```
kubeadm join --token $actualtoken $IP
```
Installed weave-kube as below:
```
# kubectl apply -f https://git.io/weave-kube
daemonset "weave-net" created
```
kube-dns did not start as expected.
Both the master and the node get assigned the same HWaddr, causing the name collision:
On the master:
```
docker logs a65253346635
INFO: 2016/10/26 05:24:20.719919 Command line options: map[ipalloc-range:10.32.0.0/12 nickname:Ubuntu14041-mars08 no-dns:true docker-api: http-addr:127.0.0.1:6784 ipalloc-init:consensus=2 datapath:datapath name:72:47:96:69:16:bb port:6783]
INFO: 2016/10/26 05:24:20.730839 Communication between peers is unencrypted.
INFO: 2016/10/26 05:24:20.971010 Our name is 72:47:96:69:16:bb(Ubuntu14041-mars08)
```
On the node:
```
docker logs a65253346635
INFO: 2016/10/26 05:23:39.312294 Command line options: map[datapath:datapath ipalloc-range:10.32.0.0/12 name:72:47:96:69:16:bb port:6783 docker-api: http-addr:127.0.0.1:6784 ipalloc-init:consensus=2 nickname:Ubuntu14041-mars09 no-dns:true]
INFO: 2016/10/26 05:23:39.314095 Communication between peers is unencrypted.
INFO: 2016/10/26 05:23:39.323302 Our name is 72:47:96:69:16:bb(Ubuntu14041-mars09)
```
Curling kube-apiserver from a node:
```
root@Ubuntu14041-mars09:~# curl -k https://10.0.0.1
Unauthorized
```
curl and nslookup on the master:
```
root@Ubuntu14041-mars08:/# curl -k https://10.0.0.1
Unauthorized
root@Ubuntu14041-mars08:/# nslookup kubernetes.default
Server:  10.30.48.37
Address: 10.30.48.37#53
** server can't find kubernetes.default: NXDOMAIN
```
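A way to confirm that the kube-dns Service has no endpoints (not from the original report) would be something like:
```
# with the weave peer name collision, kube-dns never becomes ready,
# so its Service ends up with an empty endpoints list
kubectl get endpoints kube-dns --namespace=kube-system
kubectl describe svc kube-dns --namespace=kube-system
```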
Anything else we need to know:

`iptables -S` output:
```
root@Ubuntu14041-mars08:/# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION
-N KUBE-FIREWALL
-N KUBE-SERVICES
-N WEAVE-NPC
-N WEAVE-NPC-DEFAULT
-N WEAVE-NPC-INGRESS
-A INPUT -j KUBE-FIREWALL
-A INPUT -d 172.17.0.1/32 -i docker0 -p tcp -m tcp --dport 6783 -j DROP
-A INPUT -d 172.17.0.1/32 -i docker0 -p udp -m udp --dport 6783 -j DROP
-A INPUT -d 172.17.0.1/32 -i docker0 -p udp -m udp --dport 6784 -j DROP
-A INPUT -i docker0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i docker0 -p tcp -m tcp --dport 53 -j ACCEPT
-A FORWARD -i docker0 -o weave -j DROP
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -o weave -j WEAVE-NPC
-A FORWARD -o weave -j LOG --log-prefix "WEAVE-NPC:"
-A FORWARD -o weave -j DROP
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-SERVICES -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC-DEFAULT -m set --match-set weave-k?Z;25^M}|1s7P3|H9i;*;MhG dst -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-iuZcey(5DeXbzgRFs8Szo]<@p dst -j ACCEPT
```
kube-proxy-amd logs had the following entries:
```
I1026 06:39:50.083990 1 iptables.go:339] running iptables-restore [--noflush --counters]
I1026 06:39:50.093036 1 proxier.go:751] syncProxyRules took 58.207586ms
I1026 06:39:50.093063 1 proxier.go:523] OnEndpointsUpdate took 58.262934ms for 4 endpoints
I1026 06:39:50.970922 1 config.go:99] Calling handler.OnEndpointsUpdate()
I1026 06:39:50.974755 1 proxier.go:758] Syncing iptables rules
I1026 06:39:50.974769 1 iptables.go:362] running iptables -N [KUBE-SERVICES -t filter]
I1026 06:39:50.976635 1 healthcheck.go:86] LB service health check mutation request Service: default/kubernetes - 0 Endpoints []
I1026 06:39:50.978146 1 iptables.go:362] running iptables -N [KUBE-SERVICES -t nat]
I1026 06:39:50.980501 1 iptables.go:362] running iptables -C [OUTPUT -t filter -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I1026 06:39:50.982778 1 iptables.go:362] running iptables -C [OUTPUT -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I1026 06:39:50.984762 1 iptables.go:362] running iptables -C [PREROUTING -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I1026 06:39:50.986536 1 iptables.go:362] running iptables -N [KUBE-POSTROUTING -t nat]
I1026 06:39:50.988244 1 iptables.go:362] running iptables -C [POSTROUTING -t nat -m comment --comment kubernetes postrouting rules -j KUBE-POSTROUTING]
I1026 06:39:50.990022 1 iptables.go:298] running iptables-save [-t filter]
I1026 06:39:50.992581 1 iptables.go:298] running iptables-save [-t nat]
I1026 06:39:50.995184 1 proxier.go:1244] Restoring iptables rules: *filter
:KUBE-SERVICES - [0:0]
-A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp -p udp -d 10.0.0.10/32 --dport 53 -j REJECT
-A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp -p tcp -d 10.0.0.10/32 --dport 53 -j REJECT
COMMIT
```
@errordeveloper, please refer to the previous conversation:
https://github.com/kubernetes/kubernetes/issues/34884
_Copied from original issue: kubernetes/kubernetes#35591_
_From @luxas on October 26, 2016 16:49_
cc @kubernetes/sig-cluster-lifecycle
_From @soualid on November 13, 2016 22:34_
Got the following error - which may be related - in weave-kube when trying to set up a bare-metal 3-node cluster running Ubuntu 16.04 with kubeadm; the same HWAddr was assigned on the master and a worker node:
```
INFO: 2016/11/13 22:09:06.134487 ->[163.172.221.165:35727|ca:dd:16:be:df:42(sd-110872)]: connection shutting down due to error: local "ca:dd:16:be:df:42(sd-110872)" and remote "ca:dd:16:be:df:42(sd-100489)" peer names collision
```
Is there a way to force renewal of the HWAddr?
Setup information:
kubeadm version
kubeadm version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.2.421+a6bea3d79b8bba", GitCommit:"a6bea3d79b8bbaa5e8b57482c9fff9265d402708", GitTreeState:"clean", BuildDate:"2016-11-03T06:54:50Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}
kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:48:38Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:42:39Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
/etc/os-release
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
uname -a
Linux sd-110872 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
docker -v
Docker version 1.12.1, build 23cf638
_From @vganapathy1 on November 15, 2016 15:38_
@soualid Surprisingly, the UUID was the same for all the VMs I had, which caused the name collision; changing the UUID resolved the issue.
To get the UUID, run `cat /sys/class/dmi/id/product_uuid`.
I had to change the UUID on all the VMs to get this to work. If that doesn't work for you, check with @errordeveloper; he has a weave-kube patch which also worked for me.
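For anyone else hitting this, a rough way to confirm that the collision comes from duplicated machine UUIDs (a sketch assuming SSH access to each node; the hostnames are placeholders):
```
# print the DMI product UUID of each node; if two lines match,
# weave derives the same peer name on those nodes
for host in node1 node2 node3; do
  echo -n "$host: "
  ssh "$host" sudo cat /sys/class/dmi/id/product_uuid
done
```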
_From @soualid on November 15, 2016 21:7_
@vganapathy1 thanks, I confirm that the machine UUIDs are identical on my machines; the hosting provider (online.net) must use a "clone" install system (Symantec Ghost-like) that does not change the UUID between boxes. I will contact them about this issue, but it would be great to be able to work around it by overriding this value at runtime through a kubeadm parameter.
Thank you!
Is there a solution/workaround for this yet?
I read about changing the UUID, how exactly should I do that?
I ran into the very same issue with a Raspberry Pi cluster running HypriotOS; all my nodes get the same HW address assigned in Weave:
```
k logs weave-net-x1z25 --namespace=kube-system weave
INFO: 2017/04/04 06:16:06.910766 Command line options: map[http-addr:127.0.0.1:6784 ipalloc-init:consensus=4 nickname:n3 status-addr:0.0.0.0:6782 docker-api: conn-limit:30 datapath:datapath ipalloc-range:10.32.0.0/12 no-dns:true port:6783]
INFO: 2017/04/04 06:16:06.911597 Communication between peers is unencrypted.
INFO: 2017/04/04 06:16:07.062159 Our name is 8e:0e:19:5d:4e:5e(n3)
INFO: 2017/04/04 06:16:07.062426 Launch detected - using supplied peer list: [192.168.23.200 192.168.23.201 192.168.23.202 192.168.23.203]
INFO: 2017/04/04 06:16:07.062669 Checking for pre-existing addresses on weave bridge
INFO: 2017/04/04 06:16:07.072861 [allocator 8e:0e:19:5d:4e:5e] No valid persisted data
INFO: 2017/04/04 06:16:07.158094 [allocator 8e:0e:19:5d:4e:5e] Initialising via deferred consensus
INFO: 2017/04/04 06:16:07.159120 Sniffing traffic on datapath (via ODP)
INFO: 2017/04/04 06:16:07.163257 ->[192.168.23.202:6783] attempting connection
INFO: 2017/04/04 06:16:07.163770 ->[192.168.23.201:6783] attempting connection
INFO: 2017/04/04 06:16:07.164761 ->[192.168.23.203:6783] attempting connection
INFO: 2017/04/04 06:16:07.165369 ->[192.168.23.200:6783] attempting connection
INFO: 2017/04/04 06:16:07.165999 ->[192.168.23.203:48229] connection accepted
INFO: 2017/04/04 06:16:07.173375 ->[192.168.23.203:6783|8e:0e:19:5d:4e:5e(n3)]: connection shutting down due to error: cannot connect to ourself
INFO: 2017/04/04 06:16:07.174156 ->[192.168.23.203:48229|8e:0e:19:5d:4e:5e(n3)]: connection shutting down due to error: cannot connect to ourself
INFO: 2017/04/04 06:16:07.185573 ->[192.168.23.202:6783|8e:0e:19:5d:4e:5e(n3)]: connection shutting down due to error: local "8e:0e:19:5d:4e:5e(n3)" and remote "8e:0e:19:5d:4e:5e(n2)" peer names collision
INFO: 2017/04/04 06:16:07.189360 Listening for HTTP control messages on 127.0.0.1:6784
```
I think all we can do from a kubeadm perspective is document unique product UUIDs as a potential requirement. I imagine there are so many different ways to resolve this per OS that we can't really suggest a specific one.
Agreed @jamiehannaford, we should just list this as a requirement for everything running smoothly.
Kubernetes and things running on top might require/assume that
a) The product_uuid is unique
b) The MAC address is unique
for every node.
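A minimal per-node check for both (a sketch; interface names and required privileges vary by system):
```
# product_uuid: must differ between nodes
sudo cat /sys/class/dmi/id/product_uuid

# MAC addresses of the network interfaces: must also differ between nodes
ip link show
```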
@jamiehannaford Can you document that please?
@luxas Sure, I'll get to it next week
Perfect, thank you!
This is now documented, so we can close 🎉
Yayy :tada:
Hi @errordeveloper,
Can you please provide a patch for weave-kube to install Weave with a non-unique product_uuid?