Kubespray: bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory

Created on 15 May 2019  ·  28 Comments  ·  Source: kubernetes-sigs/kubespray

Environment:

  • Cloud provider or hardware configuration:

Bare metal 2 nodes via SSH bastion

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
printf "$(uname -srm)\n$(cat /etc/os-release)\n"
Linux 4.9.0-9-amd64 x86_64
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
  • Version of Ansible (ansible --version):
root@localhost:~/kubespray# ansible --version
ansible 2.7.10
  config file = /root/kubespray/ansible.cfg
  configured module search path = ['/root/kubespray/library']
  ansible python module location = /usr/local/lib/python3.5/dist-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.5.3 (default, Sep 27 2018, 17:25:39) [GCC 6.3.0 20170516]

Kubespray version (commit) (git rev-parse --short HEAD):

root@localhost:~/kubespray# git rev-parse --short HEAD
3f62492a

Network plugin used:

Copy of your inventory file:

Command used to invoke ansible:

Output of ansible run:
https://gist.github.com/chrissound/147bfdc6e54e3efb1dc664261ea87410

Anything else we need to know:


TASK [etcd : Gen_certs | run cert generation script] *********************************************************************************************************************************
Wednesday 15 May 2019  17:27:36 +0000 (0:00:00.147)       0:17:19.661 ********* 
fatal: [node2 -> 10.40.45.102]: FAILED! => {"changed": true, "cmd": ["bash", "-x", "/usr/local/bin/etcd-scripts/make-ssl-etcd.sh", "-f", "/etc/ssl/etcd/openssl.conf", "-d", "/etc/ssl/etcd/ssl"], "delta": "0:00:00.004018", "end": "2019-05-15 17:27:37.398033", "msg": "non-zero return code", "rc": 127, "start": "2019-05-15 17:27:37.394015", "stderr": "bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory", "stderr_lines": ["bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory"], "stdout": "", "stdout_lines": []}
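The rc 127 here simply means bash could not find the script on the target host. A quick sanity check is to confirm on each etcd node whether Kubespray actually laid the script down before the task ran (a sketch; the path is taken from the log above — run it on the target nodes):

```shell
#!/usr/bin/env bash
# Report whether a file exists at the given path.
check_script() {
  local path=$1
  if [ -f "$path" ]; then
    echo "present: $path"
  else
    echo "missing: $path"
  fi
}

# On each etcd node, check the path from the failing task:
check_script /usr/local/bin/etcd-scripts/make-ssl-etcd.sh
```

If the script is missing on the host that the cert-generation task was delegated to, the earlier tasks that template it out never ran there — which is what the first reply below points at.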

kind/bug lifecycle/rotten

Most helpful comment

Hello, I had the same error, but it was because a worker had to handle the certs because the master couldn't because of a stupid cache problem. So I ran the deploy command with: --flush-cache
ansible-playbook --flush-cache -i inventory/mycluster/inventory.yml cluster.yml

Hope that helps.

All 28 comments

You have a failure in the task before
TASK [kubernetes/preinstall : Stop if access_ip is not pingable] **************************************
Wednesday 15 May 2019 17:11:29 +0000 (0:00:00.289) 0:01:13.066 **
changed: [node2]
fatal: [node1]: FAILED! => {"changed": true, "cmd": ["ping", "-c1", "10.40.45.102"], "delta": "0:00:10.018781", "end": "2019-05-15 17:11:40.711070", "msg": "non-zero return code", "rc": 1, "start": "2019-05-15 17:11:30.692289", "stderr": "", "stderr_lines": [], "stdout": "PING 10.40.45.102 (10.40.45.102) 56(84) bytes of data.\n\n--- 10.40.45.102 ping statistics ---\n1 packets transmitted, 0 received, 100% packet loss, time 0ms", "stdout_lines": ["PING 10.40.45.102 (10.40.45.102) 56(84) bytes of data.", "", "--- 10.40.45.102 ping statistics ---", "1 packets transmitted, 0 received, 100% packet loss, time 0ms"]}
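That preinstall check can be reproduced standalone before a run: ping every access_ip from the control machine (or from each node) and only start the playbook once they all answer. A pre-flight sketch (the IP list is a placeholder — substitute the access_ip values from your inventory):

```shell
#!/usr/bin/env bash
# Ping every access_ip once; report each result and fail if any is unreachable.
check_ips() {
  local rc=0
  for ip in "$@"; do
    if ping -c1 -W2 "$ip" >/dev/null 2>&1; then
      echo "OK $ip"
    else
      echo "FAIL $ip"
      rc=1
    fi
  done
  return $rc
}

# Placeholder IPs -- use the access_ip values from your inventory:
check_ips 10.40.45.101 10.40.45.102
```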

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

I have the same issue on CentOS 7:
"stderr": "bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory"
It really does not exist.

/remove-lifecycle stale

seems to happen if kube_api_anonymous_auth is set to false

Same problem for me with CoreOS:
"changed": true, "cmd": [ "bash", "-x", "/opt/bin/etcd-scripts/make-ssl-etcd.sh", "-f", "/etc/ssl/etcd/openssl.conf", "-d", "/etc/ssl/etcd/ssl" ], "delta": "0:00:00.012738", "end": "2019-09-12 14:09:58.305549", "invocation": { "module_args": { "_raw_params": "bash -x /opt/bin/etcd-scripts/make-ssl-etcd.sh -f /etc/ssl/etcd/openssl.conf -d /etc/ssl/etcd/ssl", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true } }, "msg": "non-zero return code", "rc": 127, "start": "2019-09-12 14:09:58.292811", "stderr": "bash: /opt/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory", "stderr_lines": [ "bash: /opt/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory" ], "stdout": "", "stdout_lines": [] }

Same things happens here too

["bash", "-x", "/usr/local/bin/etcd-scripts/make-ssl-etcd.sh", "-f", "/etc/ssl/etcd/openssl.conf", "-d", "/etc/ssl/etcd/ssl"], "delta": "0:00:00.004172", "end": "2019-09-19 13:43:56.250647", "msg": "non-zero return code", "rc": 127, "start": "2019-09-19 13:43:56.246475", "stderr": "bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory", "stderr_lines": ["bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: No such file or directory"], "stdout": "", "stdout_lines": []}

Same here on ubuntu 18.04

It might be silly, but to make it work I checked out tag 2.10.4. On that tag it works for me.

It might be silly, but to make it work I checked out tag 2.10.4. On that tag it works for me.

Nope, that doesn't do it on my cluster; still the same error. Thank you anyway.

Same thing happened to me.
fatal: [w1-k8s -> 192.168.1.11]: FAILED! => {"changed": true, "cmd": ["bash", "-x", "/usr/local/bin/etcd-scripts/make-ssl-etcd.sh", "-f", "/etc/ssl/etcd/openssl.conf", "-d", "/etc/ssl/etcd/ssl"], "delta": "0:00:00.004744", "end": "2019-09-26 10:53:35.264927", "msg": "non-zero return code", "rc": 127, "start": "2019-09-26 10:53:35.260183", "stderr": "bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: 그런 파일이나 디렉터리가 없습니다", "stderr_lines": ["bash: /usr/local/bin/etcd-scripts/make-ssl-etcd.sh: 그런 파일이나 디렉터리가 없습니다"], "stdout": "", "stdout_lines": []}
(The Korean stderr, "그런 파일이나 디렉터리가 없습니다", translates to "No such file or directory".)

In my case it manages to fail somewhere else if I add the master to the etcd group, but it's still not clear what the issue could be.

I've managed to make it work by running the playbook one host at a time.

Hi, I opened a PR to fix this issue. If interested, apply the patch before the community merges it:

https://github.com/kubernetes-sigs/kubespray/pull/5368

By the way, as a workaround I increased the memory of the target node from 1GB to 1.5GB. Hopefully this helps anyone who still has the problem.

For now, until this is resolved, I got past it by making sure the node it selects to work on first is actually an etcd node in the inventory (you can also just make them all etcd nodes if the cluster is small).
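A minimal inventory sketch of that workaround (hostnames and IPs are placeholders; group names follow the Kubespray layout of that era), with every node also in the etcd group:

```yaml
all:
  hosts:
    node1:
      ansible_host: 10.40.45.101
      access_ip: 10.40.45.101
    node2:
      ansible_host: 10.40.45.102
      access_ip: 10.40.45.102
  children:
    kube-master:
      hosts:
        node1:
    kube-node:
      hosts:
        node1:
        node2:
    etcd:
      hosts:
        node1:
        node2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
```

Note that etcd prefers an odd number of members, so "all nodes in etcd" only really makes sense for small test clusters like this one.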

Hello, I had the same error, but it was because a worker had to handle the certs because the master couldn't because of a stupid cache problem. So I ran the deploy command with: --flush-cache
ansible-playbook --flush-cache -i inventory/mycluster/inventory.yml cluster.yml

Hope that helps.

seems to happen if kube_api_anonymous_auth is set to false
If kube_api_anonymous_auth is set to false, it will not download the binary or tgz files and throws this exception. Why?

This is happening to me with the latest 2.12.0 when I set kube_api_anonymous_auth to false. We really need to fix kube_api_anonymous_auth; enterprises are freaking out over it being enabled by default, which breaks playbooks when it is not set to true.

Just happened to me with the v2.12.1 release tag today...

So I can get around this issue with these settings (Deploying to Ubuntu 18.04 Servers):

kube_api_anonymous_auth: false
kube_basic_auth: true
kube_apiserver_insecure_port: 8080

If kube_api_anonymous_auth is set to false, you need to specify a port other than 0 for kube_apiserver_insecure_port. I'm not sure whether it made any difference, but I also figured setting kube_basic_auth: true would be a good idea (or one of the other auth options).

By the way, as a workaround I increased the memory of the target node from 1GB to 1.5GB. Hopefully this helps anyone who still has the problem.

Works for me! In my case I gave the masters 2GB of memory instead of 1.5GB and the problem was gone. Before that I saw 'assertion failed' logs about insufficient node resources, but the setup just continued without interruption until this error occurred.

We have an Ansible error in the assert tasks; see PR #5676

Observing the same issue when cloud_provider is set to "azure". If cloud_provider is not set, the issue does not occur.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
