Hi!
I have a cluster: one master and two slaves:
[root@node1 ~]# kubectl get nodes
NAME      STATUS    ROLES         AGE       VERSION
node1     Ready     master,node   7h        v1.11.2
node2     Ready     node          7h        v1.11.2
node3     Ready     node          7h        v1.11.2
When I try to add two additional nodes (a master and a slave), I see the following error:
ansible-playbook-2.7 -i inventory/mycluster/hosts.ini scale.yml -b -v --limit node4,node5
................
................
TASK [etcd : include_tasks] ***********************************************
Tuesday 04 September 2018 22:49:08 +0300 (0:00:00.414) 0:00:21.395 *
included: /root/projects/kuberspray-new/roles/etcd/tasks/gen_certs_script.yml for node4, node5
TASK [etcd : Gen_certs | create etcd cert dir] *****************************************
Tuesday 04 September 2018 22:49:09 +0300 (0:00:00.288) 0:00:21.684
fatal: [node4]: FAILED! => {"changed": false, "gid": 0, "group": "root", "mode": "0755", "msg": "chown failed: failed to look up user kube", "owner": "root", "path": "/etc/ssl/etcd", "size": 4096, "state": "directory", "uid": 0}
fatal: [node5]: FAILED! => {"changed": false, "gid": 0, "group": "root", "mode": "0755", "msg": "chown failed: failed to look up user kube", "owner": "root", "path": "/etc/ssl/etcd", "size": 4096, "state": "directory", "uid": 0}
NO MORE HOSTS LEFT **************************************************
to retry, use: --limit @/root/projects/kuberspray-new/scale.retry
PLAY RECAP ******************************************************
node4 : ok=17 changed=3 unreachable=0 failed=1
node5 : ok=15 changed=3 unreachable=0 failed=1
Please help me!
When I tried:
ansible-playbook-2.7 -i inventory/mycluster/hosts.ini cluster.yml -b -v --limit node4,node5
I got this error:
TASK [etcd : Configure | Ensure etcd-events is running] **************************************
Tuesday 04 September 2018 23:13:53 +0300 (0:00:00.744) 0:08:57.022
TASK [etcd : Configure | Check if etcd cluster is healthy] *************************************
Tuesday 04 September 2018 23:13:53 +0300 (0:00:00.186) 0:08:57.209
FAILED - RETRYING: Configure | Check if etcd cluster is healthy (4 retries left).
ok: [node4] => {"attempts": 1, "changed": false, "cmd": "/usr/local/bin/etcdctl --endpoints=https://159.69.156.5:2379,https://159.69.156.4:2379,https://159.69.8.218:2379,https://159.69.157.250:2379,https://159.69.146.137:2379 cluster-health | grep -q 'cluster is healthy'", "delta": "0:00:00.129497", "end": "2018-09-04 22:13:55.157243", "rc": 0, "start": "2018-09-04 22:13:55.027746", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
FAILED - RETRYING: Configure | Check if etcd cluster is healthy (3 retries left).
FAILED - RETRYING: Configure | Check if etcd cluster is healthy (2 retries left).
FAILED - RETRYING: Configure | Check if etcd cluster is healthy (1 retries left).
fatal: [node5]: FAILED! => {"attempts": 4, "changed": false, "cmd": "/usr/local/bin/etcdctl --endpoints=https://159.69.156.5:2379,https://159.69.156.4:2379,https://159.69.8.218:2379,https://159.69.157.250:2379,https://159.69.146.137:2379 cluster-health | grep -q 'cluster is healthy'", "delta": "0:00:00.025962", "end": "2018-09-04 22:14:28.112431", "msg": "non-zero return code", "rc": 1, "start": "2018-09-04 22:14:28.086469", "stderr": "Error: open /etc/ssl/etcd/ssl/admin-node5.pem: no such file or directory", "stderr_lines": ["Error: open /etc/ssl/etcd/ssl/admin-node5.pem: no such file or directory"], "stdout": "", "stdout_lines": []}
NO MORE HOSTS LEFT **************************************************
to retry, use: --limit @/root/projects/kuberspray-new/cluster.retry
PLAY RECAP ******************************************************
node4 : ok=248 changed=57 unreachable=0 failed=0
node5 : ok=240 changed=57 unreachable=0 failed=1
node5 is a master with etcd.
The cert on node5 does not exist:
ls /etc/ssl/etcd/ssl/admin-node5.pem
ls: cannot access /etc/ssl/etcd/ssl/admin-node5.pem: No such file or directory
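For comparison, the same health check can be run by hand from a node whose certs do exist (node1 here); the key and CA file names below are assumptions based on kubespray's usual /etc/ssl/etcd/ssl layout, so adjust as needed:
# Sketch: run the etcd v2 health check from an existing member (key/CA file names are assumptions)
/usr/local/bin/etcdctl \
  --endpoints=https://159.69.156.5:2379 \
  --cert-file=/etc/ssl/etcd/ssl/admin-node1.pem \
  --key-file=/etc/ssl/etcd/ssl/admin-node1-key.pem \
  --ca-file=/etc/ssl/etcd/ssl/ca.pem \
  cluster-health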
Did you try the scale.yml playbook? AFAIK, the cluster.yml playbook should be used for bootstrapping only.
I'm running into the same issue :)
I am also running into the same issue using the scale.yml playbook.
When users are being created from the etcd role, the etcd user gets created, but not the kube user. The nodes originally provisioned with the cluster.yml playbook all have the kube user, so the scale.yml playbook is skipping over this user.
OS: Ubuntu 16.04
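A quick way to confirm this is to check for the user on an existing node versus the freshly added ones; this is just a sketch using an ad-hoc Ansible command against the same inventory:
# Sketch: verify the kube user exists on node1 but is missing on node4/node5
ansible -i inventory/mycluster/hosts.ini node1,node4,node5 -b -m shell -a "id kube"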
I was able to have a successful scale.yml run by modifying the etcd role to add the kube user. I can submit a PR if no one sees a problem with this solution.
kubespray/roles/etcd/meta/main.yml
---
dependencies:
  - role: adduser
    user: "{{ addusers.etcd }}"
    when: not (ansible_os_family in ['CoreOS', 'Container Linux by CoreOS'] or is_atomic)
  - role: adduser
    user: "{{ addusers.kube }}"
    when: not (ansible_os_family in ['CoreOS', 'Container Linux by CoreOS'] or is_atomic)
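Until something like that lands, a possible manual workaround is to pre-create the user on the new nodes before rerunning scale.yml; the exact flags kubespray's adduser role uses may differ, so treat this as an assumption-laden sketch:
# Sketch only: pre-create the kube group/user on the new nodes (useradd flags are assumptions)
ansible -i inventory/mycluster/hosts.ini node4,node5 -b -m shell \
  -a "getent group kube || groupadd -r kube; id kube || useradd -r -g kube -s /sbin/nologin kube"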
I was able to scale the cluster just by rerunning cluster.yml and changing the inventory.
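For reference, the inventory change is just adding the new hosts to the groups they should join in hosts.ini; the layout below is a sketch following the kubespray 2.x hosts.ini conventions (IP and variable lines omitted):
# inventory/mycluster/hosts.ini (sketch; node5 joins master/etcd, node4 joins the workers)
[kube-master]
node1
node5

[etcd]
node1
node5

[kube-node]
node2
node3
node4
node5

[k8s-cluster:children]
kube-master
kube-node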
Same issue when using scale.yml.
The kube user is not created on the new nodes added to the inventory.
I'm running into the same problem. Anything new on it? Any solution?
Just ran into the same issue on release-2.9 branch. Isn't scale.yml meant to be used for exactly the case when a freshly installed machine should be provisioned as a new node added to an existing cluster?
I have created a PR (https://github.com/kubernetes-sigs/kubespray/pull/4479) based on https://github.com/kubernetes-sigs/kubespray/issues/3240#issuecomment-439530783
Same thing here. I'm forced to run cluster.yml to workaround this.