Environmental Info:
K3s Version:
k3s version v1.19.0+k3s-9ac113de (9ac113de)
Node(s) CPU architecture, OS, and Version:
Linux ip-172-31-33-134 5.4.0-1021-aws #21-Ubuntu SMP Fri Jul 24 09:42:29 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
single server node
Describe the bug:
Attempting to build from a master commit fails, apparently because of something in embedded etcd and/or the snapshot/backup/restore code. This worked before the snapshot/backup/restore code was pushed.
Working commit id tested: 719ffbfb2742eb057fa1f2eefca08d9053bc9a39
Non-working commit id tested: 9ac113de4c79b5b30f90b97363fe608a77d97ac4
Steps To Reproduce:
Both of the following two install methods fail with the same error:
curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=9ac113de4c79b5b30f90b97363fe608a77d97ac4 INSTALL_K3S_EXEC="--cluster-init" sh -
curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=9ac113de4c79b5b30f90b97363fe608a77d97ac4 INSTALL_K3S_EXEC="--datastore-endpoint etcd --etcd-snapshot-retention 7 --etcd-snapshot-schedule-cron '*/5 * * * *'" sh -
Expected behavior:
Actual behavior:
Additional context / logs:
journalctl -eu k3s -f:
Sep 03 17:12:45 ip-172-31-33-134 k3s[2154]: WARNING: 2020/09/03 17:12:45 grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
Sep 03 17:12:49 ip-172-31-33-134 k3s[2154]: WARNING: 2020/09/03 17:12:49 grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
Sep 03 17:12:50 ip-172-31-33-134 k3s[2154]: {"level":"warn","ts":"2020-09-03T17:12:50.621Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"passthrough:///https://127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused\""}
Sep 03 17:12:50 ip-172-31-33-134 k3s[2154]: time="2020-09-03T17:12:50.621461714Z" level=info msg="Failed to test data store connection: context deadline exceeded"
Sep 03 17:12:55 ip-172-31-33-134 k3s[2154]: WARNING: 2020/09/03 17:12:55 grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
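The repeated "connection refused" on 127.0.0.1:2379 in the log above suggests that nothing was accepting connections on the local etcd client port while k3s retried. A minimal sketch for counting how often that happens in a log stream follows; the function name `count_etcd_refusals` is made up for illustration and is not part of k3s.

```shell
#!/usr/bin/env bash
# count_etcd_refusals (hypothetical helper): reads log lines on stdin and
# prints how many of them report a refused connection to the local etcd
# client port. -F treats the pattern as a fixed string, -c counts lines.
count_etcd_refusals() {
  # grep exits non-zero when the count is 0, so normalize the exit status.
  grep -cF '127.0.0.1:2379: connect: connection refused' || true
}

# Example (not run here): journalctl -u k3s --no-pager | count_etcd_refusals
```

A count that keeps growing after startup would point at etcd never coming up, rather than a brief delay while it starts.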
--etcd-snapshot-schedule-cron value flag does not fix this.
Can confirm that this happens with the current HEAD (i.e. f72d39ad9cced43f61506f2a66e63031a0ee2072).
It used to work fine with 30f672b72a4b7e3a51a52aaf724dfc325d82728d
Does nobody else have this problem? https://github.com/rancher/k3s/issues/2131
v1.19.1-rc1+k3s1
--cluster-init flag: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.19.1-rc1+k3s1 INSTALL_K3S_EXEC="--cluster-init" sh - and embedded etcd will be used by default
etcd-disable-snapshots, etcd-snapshot-dir, etcd-snapshot-schedule-cron and etcd-snapshot-retention have all been validated
--cluster-reset and --cluster-reset-restore-path also have been validated
Was running into a similar issue when starting up the second node "Error while dialing dial tcp 127.0.0.1:2379", turned out that I had to open 2379 and 2380 ports in the firewall.
Opening up 2379 and 2380 in my Security Groups worked for me, which makes sense because those are the official etcd ports.
The official etcd ports are 2379 for client requests and 2380 for peer communication.
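A quick way to see whether a firewall or security group is blocking these ports is to attempt a TCP connection from the joining node to the first server. The sketch below assumes bash and the coreutils `timeout` command are available; the function names `port_open` and `check_etcd_ports` are illustrative only.

```shell
#!/usr/bin/env bash
# port_open HOST PORT (hypothetical helper): succeeds if a TCP connection
# can be opened within 2 seconds. Relies on bash's /dev/tcp redirection.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# check_etcd_ports HOST (hypothetical helper): reports reachability of the
# etcd client port (2379) and peer port (2380); returns 1 if either fails.
check_etcd_ports() {
  local host="$1" port status=0
  for port in 2379 2380; do
    if port_open "$host" "$port"; then
      echo "${host}:${port} reachable"
    else
      echo "${host}:${port} NOT reachable"
      status=1
    fi
  done
  return "$status"
}

# Example (not run here): check_etcd_ports <node_1_ip>
```

A "NOT reachable" result only says the connection could not be opened; it does not distinguish a firewall drop from etcd not listening.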
Steps:
```
# Grab token
cat /var/lib/rancher/k3s/server/node-token

# Install K3s Server
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.19.3+k3s1" sh -s - --server https://<node_1_ip>:6443
```
ubuntu@usw1-k3s-control2:~$ sudo kubectl get nodes
NAME                STATUS   ROLES         AGE    VERSION
usw1-k3s-control1   Ready    etcd,master   10m    v1.19.3+k3s1
usw1-k3s-control2   Ready    etcd,master   114s   v1.19.3+k3s1
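The steps above grab the token but do not show where it is passed to the second server. One common way is the `K3S_TOKEN` environment variable; the sketch below only prints the resulting install command so it can be reviewed before running. The function name `print_join_command` and all argument values are placeholders, not something taken from this thread.

```shell
#!/usr/bin/env bash
# print_join_command SERVER_IP TOKEN VERSION (hypothetical helper):
# prints, but does not execute, an install command for joining an
# additional server. K3S_TOKEN carries the cluster token read from the
# first server's node-token file.
print_join_command() {
  local server_ip="$1" token="$2" version="$3"
  printf 'curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="%s" K3S_TOKEN="%s" sh -s - --server https://%s:6443\n' \
    "$version" "$token" "$server_ip"
}

# Example (not run here):
#   print_join_command <node_1_ip> "$(cat /var/lib/rancher/k3s/server/node-token)" v1.19.3+k3s1
```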
Yes, we need to add the etcd ports to the docs @davidnuzik
Hi all,
Note that I also had to add the IP/hostname to the TLS SAN, like this:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.19.3+k3s2" sh -s - --cluster-init --tls-san 10.0.0.2 --node-ip 10.0.0.2 --node-external-ip 12.34.56.78
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.19.3+k3s2" sh -s - --server https://10.0.0.2:6443 --tls-san 10.0.0.3 --node-ip 10.0.0.3 --node-external-ip 12.34.56.79
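To confirm that an address passed via --tls-san actually ended up in the serving certificate, the certificate text can be inspected with openssl. This is a minimal sketch: `san_has_ip` is a made-up helper that parses the output of `openssl x509 -noout -text` on stdin, and it assumes the usual layout where the SAN entries appear on the line after the "Subject Alternative Name" header.

```shell
#!/usr/bin/env bash
# san_has_ip IP (hypothetical helper): reads `openssl x509 -noout -text`
# output on stdin and succeeds if the Subject Alternative Name list
# contains exactly the given IP address.
san_has_ip() {
  local ip="$1"
  grep -A1 'Subject Alternative Name' \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -qxF "IP Address:${ip}"
}

# Example (not run here), using the first server's address from above:
#   openssl s_client -connect 10.0.0.2:6443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -text | san_has_ip 10.0.0.2 && echo "SAN present"
```

Splitting on commas and matching whole entries avoids a false positive where, for example, 10.0.0.2 would otherwise match 10.0.0.23.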