Kubespray: Error with Calico wait for etcd

Created on 12 Dec 2017 · 7 Comments · Source: kubernetes-sigs/kubespray

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Bug report

Environment:

Cloud provider or hardware configuration:
3 bare metal nodes

OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):

Linux 4.4.0-103-generic x86_64
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

Version of Ansible (ansible --version):

ansible-playbook 2.4.2.0
  config file = /Users/focaaby/Project/kubespray/ansible.cfg
  configured module search path = [u'/Users/focaaby/Project/kubespray/library']
  ansible python module location = /usr/local/Cellar/ansible/2.4.2.0_1/libexec/lib/python2.7/site-packages/ansible
  executable location = /usr/local/bin/ansible-playbook
  python version = 2.7.14 (default, Sep 25 2017, 09:54:19) [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]

Kubespray version (commit) (git rev-parse --short HEAD):
79417e07

Network plugin used:
Calico

Copy of your inventory file:

[all]
node1 ansible_host=192.168.2.196 ip=192.168.2.196
node2 ansible_host=192.168.2.191 ip=192.168.2.191
node3 ansible_host=192.168.2.192 ip=192.168.2.192

[kube-master]
node1
node2

[kube-node]
node1
node2
node3

[etcd]
node1
node2
node3

[k8s-cluster:children]
kube-node
kube-master

[calico-rr]

[vault]
node1
node2
node3

Command used to invoke ansible:
ansible-playbook --flush-cache -i inventory/inventory.cfg cluster.yml -b -vvvv -u prlab -K

Output of ansible run:
https://gist.githubusercontent.com/focaaby/fca02a271c6c63176b486e07908851ed/raw/722c3526d83f9c790bc8be1bc73187e97a5c0735/gistfile1.txt

I found a similar issue, https://github.com/kubernetes-incubator/kubespray/issues/1466. I tried the --flush-cache flag and removed all files in the /tmp directory, but I still get the same error.

lifecycle/rotten

focaaby

😕2

All 7 comments

I have the same issue. How did you solve it?

wtwde on 1 Mar 2018

👍1

@wtwde Sorry, I did not solve it. I switched to a different deployment method.

focaaby on 1 Mar 2018

👍1

Hello,
I am also facing this issue. Is there any workaround for it, or is anyone working on a fix?
Please update.

sahil-sharma on 20 Jun 2018

👍2

It seems you have non-zero swap on node1 and node2:

fatal: [node1]: FAILED! => {
    "assertion": "ansible_swaptotal_mb == 0", 
    "changed": false, 
    "evaluated_to": false
}
fatal: [node2]: FAILED! => {
    "assertion": "ansible_swaptotal_mb == 0", 
    "changed": false, 
    "evaluated_to": false
}
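
The failed assertion above compares Ansible's `ansible_swaptotal_mb` fact with 0, so the playbook stops on any node that still has swap configured. A minimal sketch for checking this on a node before re-running the playbook is shown here. It assumes a Linux host and that the fact reflects the `SwapTotal` line of `/proc/meminfo`; the function names `swap_total_kb` and `swap_is_off` are hypothetical helpers for illustration, not part of Kubespray.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not Kubespray code.

# Print the SwapTotal value (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
swap_total_kb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^SwapTotal:/ { print $2 }' "$meminfo"
}

# Succeed (exit status 0) only when the reported swap total is 0 kB.
swap_is_off() {
    [ "$(swap_total_kb "$1")" = "0" ]
}

# Report the state of the current node.
if swap_is_off; then
    echo "swap is off"
else
    echo "swap is on: $(swap_total_kb) kB"
    echo "turn it off for the running system with: sudo swapoff -a"
fi
```

Note that `swapoff -a` only affects the running system. To keep swap disabled after a reboot, the swap entries in `/etc/fstab` would also need to be removed or commented out.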

redixin on 2 Jul 2018

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot on 10 Apr 2019

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

fejta-bot on 10 May 2019

@fejta-bot: Closing this issue.
