Version:
k3s version v1.17.2+k3s1 (cdab19b0)
Describe the bug
Three-node HA cluster:
kubectl get node
NAME STATUS ROLES AGE VERSION
cbo1 Ready master 38m v1.17.2+k3s1
cbo3 Ready master 35m v1.17.2+k3s1
cbo2 Ready master 37m v1.17.2+k3s1
cbo1 is the first server; cbo2 and cbo3 joined it.
When either cbo2 or cbo3 is down, the cluster stays up:
kubectl get node
NAME STATUS ROLES AGE VERSION
cbo1 Ready master 42m v1.17.2+k3s1
cbo2 Ready master 41m v1.17.2+k3s1
cbo3 NotReady master 38m v1.17.2+k3s1
However, when cbo1 is down:
kubectl get node
Error from server: rpc error: code = Unknown desc = failed to create dqlite connection: no available dqlite leader server found
To Reproduce
cbo1
/usr/local/bin/k3s server --cluster-init --tls-san cbo1.ip --flannel-backend ipsec
cbo2 and cbo3
/usr/local/bin/k3s server --server https://cbo1.ip:6443 --flannel-backend ipsec
Expected behavior
kubectl get node
NAME STATUS ROLES AGE VERSION
cbo1 NotReady master 38m v1.17.2+k3s1
cbo3 Ready master 35m v1.17.2+k3s1
cbo2 Ready master 37m v1.17.2+k3s1
Actual behavior
kubectl get node
Error from server: rpc error: code = Unknown desc = failed to create dqlite connection: no available dqlite leader server found
Additional context
Is your k3s.yaml pointed at just the one node, or does it point at an address or hostname that will remain available when the first node is down? You might have multiple masters but it doesn't do you much good if you can't talk to any of them but the first one.
Thanks, @brandond. I didn't think it was worth mentioning:
k3s.yaml points to all three servers.
Technically, there are three A records in DNS with a 1s TTL,
but I also addressed each of the active servers directly - the result is the same.
However, if I address the first server while it is completely down I get
Unable to connect to the server: dial tcp cbo1.ip:6443: connect: operation timed out
and if I just stop the k3s service (with the node still up)
The connection to the server cbo1.ip:6443 was refused - did you specify the right host or port?
that's as expected
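For what it's worth, the three behaviours above can be told apart from curl's exit code alone. A hedged diagnostic sketch - the hostname argument is a placeholder for cbo1.ip etc., and /ping is assumed here as the unauthenticated k3s health endpoint:

```shell
# Classify an API endpoint's failure mode (sketch, hostnames hypothetical).
# curl exit 7 = connection refused (host up, k3s service stopped),
# exit 28 = connect timeout (host down or filtered), 0 = k3s answered.
probe() {
  curl -sk --connect-timeout 3 -o /dev/null "https://$1:6443/ping"
  case $? in
    0)  echo "$1: k3s answered" ;;
    7)  echo "$1: connection refused - k3s stopped but host is up" ;;
    28) echo "$1: timed out - host down or unreachable" ;;
    *)  echo "$1: other failure" ;;
  esac
}

probe 127.0.0.1
```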
What do the logs say?
I built a 3 node dqlite cluster on Ubuntu in VirtualBox just a day or two ago, so I know it should work, mostly. CPU usage seemed unusually high so I went back to a single master.
I never said the cluster is not working.
I only said that it cannot find a dqlite leader when the first server is down.
the most interesting thing in the logs seems to be
ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
I see it every second, even when the cluster is up:
Feb 07 07:52:43 cbo2 k3s[24261]: E0207 07:52:43.717128 24261 available_controller.go:419] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.30.222:443/apis/metrics.k8s.io/v1beta1: Get https://10.43.30.222:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Feb 07 07:52:43 cbo2 k3s[24261]: E0207 07:52:43.778638 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:43 cbo2 k3s[24261]: I0207 07:52:43.778679 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:44 cbo2 k3s[24261]: E0207 07:52:44.647233 24261 controller.go:114] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: Error trying to reach service: 'dial tcp 10.43.30.222:443: i/o timeout', Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
Feb 07 07:52:44 cbo2 k3s[24261]: I0207 07:52:44.647272 24261 controller.go:127] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
Feb 07 07:52:44 cbo2 k3s[24261]: E0207 07:52:44.779024 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:44 cbo2 k3s[24261]: I0207 07:52:44.779067 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:45 cbo2 k3s[24261]: E0207 07:52:45.779385 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:45 cbo2 k3s[24261]: I0207 07:52:45.779430 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:46 cbo2 k3s[24261]: E0207 07:52:46.779827 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:46 cbo2 k3s[24261]: I0207 07:52:46.779877 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:47 cbo2 k3s[24261]: E0207 07:52:47.780244 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:47 cbo2 k3s[24261]: I0207 07:52:47.780289 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:48 cbo2 k3s[24261]: E0207 07:52:48.722943 24261 available_controller.go:419] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.30.222:443/apis/metrics.k8s.io/v1beta1: Get https://10.43.30.222:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Feb 07 07:52:48 cbo2 k3s[24261]: E0207 07:52:48.780575 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:48 cbo2 k3s[24261]: I0207 07:52:48.780622 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:49 cbo2 k3s[24261]: E0207 07:52:49.780820 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:49 cbo2 k3s[24261]: I0207 07:52:49.780852 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:50 cbo2 k3s[24261]: E0207 07:52:50.781207 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:52:50 cbo2 k3s[24261]: I0207 07:52:50.781250 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:52:51 cbo2 k3s[24261]: I0207 07:52:51.353469 24261 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
When the first server is down, it adds errors about dqlite failing:
Feb 07 07:58:10 cbo2 k3s[24261]: time="2020-02-07T07:58:10.550290916Z" level=error msg="error while range on /registry/storageclasses /registry/storageclasses: failed to create dqlite connection: no available dqlite leader server found"
Feb 07 07:58:10 cbo2 k3s[24261]: time="2020-02-07T07:58:10.812393865Z" level=info msg="Connecting to proxy" url="wss://167.86.89.20:6443/v1-k3s/connect"
Feb 07 07:58:10 cbo2 k3s[24261]: time="2020-02-07T07:58:10.813253780Z" level=error msg="Failed to connect to proxy" error="dial tcp 167.86.89.20:6443: connect: connection refused"
Feb 07 07:58:10 cbo2 k3s[24261]: time="2020-02-07T07:58:10.813287562Z" level=error msg="Remotedialer proxy error" error="dial tcp 167.86.89.20:6443: connect: connection refused"
Feb 07 07:58:11 cbo2 k3s[24261]: E0207 07:58:11.108462 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:58:11 cbo2 k3s[24261]: I0207 07:58:11.108506 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:58:12 cbo2 k3s[24261]: E0207 07:58:12.108723 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:58:12 cbo2 k3s[24261]: I0207 07:58:12.108766 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:58:12 cbo2 k3s[24261]: time="2020-02-07T07:58:12.333153888Z" level=error msg="error while range on /registry/leases /registry/leases: failed to create dqlite connection: no available dqlite leader server found"
Feb 07 07:58:13 cbo2 k3s[24261]: time="2020-02-07T07:58:13.085579967Z" level=error msg="error while range on /registry/clusterrolebindings /registry/clusterrolebindings: failed to create dqlite connection: no available dqlite leader server found"
Feb 07 07:58:13 cbo2 k3s[24261]: E0207 07:58:13.108975 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:58:13 cbo2 k3s[24261]: I0207 07:58:13.109016 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:58:14 cbo2 k3s[24261]: E0207 07:58:14.109329 24261 handle_charon.go:84] ClientConnection failed: dial unix /var/run/charon.vici: connect: connection refused
Feb 07 07:58:14 cbo2 k3s[24261]: I0207 07:58:14.109392 24261 handle_charon.go:87] Retrying in a second ...
Feb 07 07:58:14 cbo2 k3s[24261]: time="2020-02-07T07:58:14.399545671Z" level=error msg="error while range on /registry/csinodes /registry/csinodes: failed to create dqlite connection: no available dqlite leader server found"
Feb 07 07:58:14 cbo2 k3s[24261]: time="2020-02-07T07:58:14.482484472Z" level=error msg="error while range on /registry/minions /registry/minions: failed to create dqlite connection: no available dqlite leader server found"
Oh, and now kubectl returns a different message:
kubectl get node
Error from server (ServiceUnavailable): the server is currently unable to handle the request
How long did you wait after adding the other servers before shutting down the first master? Is it possible that it hadn't finished replicating all the data and reaching quorum? Unfortunately the various k3s controllers don't export any metrics or write many non-error logs so it's quite hard to figure out what's going on, but I wonder if dqlite wasn't ready to run without the master when you shut it down.
Hey! I'm also having issues when stress testing the nodes in embedded HA with 3 masters and 2 workers.
There is one open PR that seems to be related: https://github.com/rancher/k3s/pull/1377
I am facing the exact same issue. It kinda defies the purpose of HA entirely, as the first node becomes a single point of failure.
I cannot add much more info than @eleaner already did. My logs look pretty much the same.
I see the same issues with my HA cluster, raft gets really upset when you start knocking out master nodes, especially if one happened to be the leader at the time.
I am also experiencing the same problem on a 2 master ARM cluster. I even have a floating IP using heartbeat so the advertised IP of the cluster has moved over to Master2. I attempted to stop/start k3s on Master2 but it hangs and never successfully resumes. After starting the service back up on Master1 I can start the service on Master2.
@castironclay 2 master nodes does not have a quorum, please see:
https://github.com/rancher/k3s/issues/1407
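To spell out the quorum arithmetic (a hedged sketch, not k3s code): raft-style stores like dqlite need a strict majority of voting members, so a 2-master cluster has a quorum of 2 and tolerates zero failures - no better than a single master:

```shell
# Majority quorum for a raft-style cluster of n voting members.
# tolerated failures = n - quorum; note n=2 tolerates 0, same as n=1.
quorum() { echo $(( $1 / 2 + 1 )); }

for n in 1 2 3 5; do
  q=$(quorum "$n")
  echo "members=$n quorum=$q tolerated_failures=$(( n - q ))"
done
```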
I have hit this bug in my k3s setup in DO. I have 2 nodes in EU and 1 in US. I started the cluster with the node in US, and then added the other two nodes.
After stopping k3s service in US node, cluster became unresponsive, and although after restarting the service a few times workloads have shifted to the other 2 nodes, it seems that dqlite is still looking for the US node as a leader, and is incapable of handing over the leader role.
Is there any information you are currently lacking?
I have something similar with the latest version, though not exactly the same. I start a 3-node master setup with Vagrant: first start node1, wait for it to boot, then node2 and node3.
When I shut down node1, nodes 2 and 3 keep working, but as soon as I shut down node2, on node3 I get this error every time I call kubectl: "failed to create dqlite connection: no available dqlite leader server found"
If I only have 2 nodes, for example nodes 1 and 2, and I shut down node 1, the same happens.
Ran into this as well: take down the initial node, and the remaining masters are unable to determine a leader. I feel like this might be a documentation problem - the quickstart for dqlite differs from the rest of the multi-master configs in that there is no fixed registration address, just instructions to point the other nodes at the initial node by IP address. The remaining server processes are unable to mesh because they all point to one master and can't "figure it out".
@brontide I also noticed the different documentation so i setup a virtual IP with keepalived and pointed the individual nodes to that IP upon cluster creation, however the behaviour stayed the same, even though the IP switched to another node when i took the first node offline. the nodes fail to find a leader..
@philkry yeah, nodes would not even come up. Pretty clear that the bootstrapping of the dqlite HA cluster is lacking. I've gone back to a single master for now just to get off the ground.
@brontide @philkry I can confirm: I set up three master nodes with a load balancer in front of them, used as a virtual IP to join them. The problem persists then, too.
I tried working around this by chaining the master nodes (master3 -> master2 -> master1) and it failed completely, dqlite was continuously restarting on each node shortly after I connected the third node.
Actually I was hoping to make a ring of 5 master nodes this way, but it was already broken with three nodes so I gave up.
I'm hoping that they eventually fix the underlying bootstrapping issue; in the meantime I've switched back to a single master and am looking to back up the db via script in case I ever need to rebuild or replace the master. Seems like they have a good idea but the execution is still lacking.
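The backup script mentioned above could look something like this (a hedged sketch: in production the directory would be the default k3s state path /var/lib/rancher/k3s/server/db and the k3s service should be stopped first; here it runs against a scratch directory standing in for that path so the script is runnable anywhere):

```shell
# Archive a k3s DB directory into a tarball (sketch; paths hypothetical).
set -eu

backup_db() {  # $1 = db directory, $2 = output tarball
  tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
  echo "backed up $1 to $2"
}

# Demo against a throwaway directory standing in for
# /var/lib/rancher/k3s/server/db (contents invented for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/db/state.dqlite"
echo "data" > "$demo/db/state.dqlite/segment-1"
backup_db "$demo/db" "$demo/db-backup.tgz"
tar -tzf "$demo/db-backup.tgz"
```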
Update 11 May 2020: still checking out if HA is indeed true by analyzing log files and testing.
Update: unsure if fully solved with v1.18.2+k3s1
sudo curl -sfL https://get.k3s.io | INSTALL_K3S_CHANNEL=latest INSTALL_K3S_EXEC="--flannel-backend=wireguard" sh -
I also have the same issue (w/ v1.17.4+k3s1).
Question:
What is or should be the proper k3s.yaml config in a three-master HA setup with dqlite? I can't seem to find where this is documented.
My k3s.yaml on all three master nodes (they are identical):
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <removed>
    server: https://127.0.0.1:6443
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    password: <removed>
    username: admin
Isn't the _--server part_ the problem? All my other, subsequent master nodes start with:
ExecStart=/usr/local/bin/k3s server --flannel-backend=wireguard --server=https://k3s-mn-01.biz:6443 --token=<removed>
This is according to https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
My first master node starts with:
ExecStart=/usr/local/bin/k3s server --flannel-backend=wireguard
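Putting the above together, the docs' intent seems to be that the registration address given to joining servers should be a name that outlives any single node. A hedged sketch - the hostname and token are placeholders, and commenters above report the dqlite leader election still fails even behind a VIP, so this only removes the client/join-side dependence on the first node's address:

```shell
# First server bootstraps the cluster; the shared name (VIP or DNS
# record - "k3s-api.example.internal" is a placeholder) goes into the cert.
k3s server --cluster-init --flannel-backend=wireguard --tls-san k3s-api.example.internal

# Every additional server joins via the shared name, never a peer's own IP.
k3s server --flannel-backend=wireguard --server https://k3s-api.example.internal:6443 --token <token>

# k3s.yaml on clients should likewise point at the shared name:
#   server: https://k3s-api.example.internal:6443
```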
I can confirm a new cluster running v1.18.2+k3s1 on 3 Pi 4s exhibits this issue. I spun this up using k3sup but all the behaviours seem the same
Like @remkolems I'm not sure how later nodes should be joined in terms of configuration. The init script starts k3s with --server https://192.168.0.103:6443 (the first/initial node with --cluster-init). Does that mean it only ever looks for this node?
I would guess not, as /var/lib/rancher/k3s/server/db/state.dqlite/peers.db contains all the other peers, and logs talk about connecting WebSockets to other peers but nothing seems immediately dqlite-specific otherwise (in terms of reporting status etc).
(Also at the time of writing my node 2 appears to be whining about not finding node 3, all after rebooting node 1 - level=warning msg="no known leader address=192.168.0.105:6443 attempt=0". A simple find in the codebase shows this is from dqlite itself. kubectl get nodes reports everything as good though.)
Just tested by restarting my first node; during and after the successful restart of the node (and services) I ran the following command multiple times on all other master nodes:
sudo kubectl get nodes -o wide
I still get the expected results (Ready/NotReady/Ready), although I didn't dive into the log files yet.
Precisely which log files _should_ we look at to validate k3s?

Even after restarting my second node, I get the above result.
System specification
UP Squared ATOM Quad Core (memory 4GB / eMMC 32GB)

Behaviour changed for me with v1.18.2+k3s1 as well, but still no HA.
When I restart the first node, the cluster reassembles successfully after the node is back up (which was not the case with the previous version), but during the restart the control plane is not available.
During the restart the logs are flooded with
May 11 10:06:25 node3 k3s[21207]: time="2020-05-11T10:06:25.160331591+02:00" level=warning msg="server unavailable err=failed to establish network connection: tls dial: dial tcp 192.168.64.3:6443: connect: connection refused address=192.168.64.3:6443 attempt=20"
followed by
May 11 10:06:26 node3 k3s[21207]: I0511 10:06:26.557062 21207 trace.go:116] Trace[725908169]: "Create" url:/api/v1/namespaces,user-agent:k3s/v1.18.2+k3s1 (linux/amd64) kubernetes/698e444,client:127.0.0.1 (started: 2020-05-11 10:05:52.547149949 +0200 CEST m=+135.705775041) (total time: 34.009764336s):
May 11 10:06:26 node3 k3s[21207]: Trace[725908169]: [34.009764336s] [34.009468004s] END
May 11 10:06:26 node3 k3s[21207]: E0511 10:06:26.559413 21207 controller.go:207] unable to create required kubernetes system namespace kube-system: Timeout: request did not complete within requested timeout 34s
May 11 10:06:27 node3 k3s[21207]: time="2020-05-11T10:06:27.869245125+02:00" level=error msg="Failed to connect to proxy" error="websocket: bad handshake"
May 11 10:06:27 node3 k3s[21207]: time="2020-05-11T10:06:27.871508631+02:00" level=error msg="Remotedialer proxy error" error="websocket: bad handshake"
May 11 10:06:27 node3 k3s[21207]: time="2020-05-11T10:06:27.881083327+02:00" level=warning msg="no known leader address=192.168.64.3:6443 attempt=49"
May 11 10:06:27 node3 k3s[21207]: time="2020-05-11T10:06:27.881965120+02:00" level=warning msg="no known leader address=192.168.64.3:6443 attempt=68"
...
May 11 10:06:29 node3 k3s[21207]: I0511 10:06:29.169976 21207 trace.go:116] Trace[513043459]: "List etcd3" key:/minions,resourceVersion:,limit:500,continue: (started: 2020-05-11 10:05:29.165133706 +0200 CEST m=+112.323763994) (total time: 1m0.004767816s):
May 11 10:06:29 node3 k3s[21207]: Trace[513043459]: [1m0.004767816s] [1m0.004767816s] END
May 11 10:06:29 node3 k3s[21207]: E0511 10:06:29.175960 21207 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
May 11 10:06:29 node3 k3s[21207]: E0511 10:06:29.177627 21207 writers.go:105] apiserver was unable to write a JSON response: http: Handler timeout
May 11 10:06:29 node3 k3s[21207]: E0511 10:06:29.178796 21207 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}
May 11 10:06:29 node3 k3s[21207]: E0511 10:06:29.179879 21207 writers.go:118] apiserver was unable to write a fallback JSON response: http: Handler timeout
May 11 10:06:29 node3 k3s[21207]: I0511 10:06:29.182012 21207 trace.go:116] Trace[16562761]: "List" url:/api/v1/nodes,user-agent:kubectl/v1.18.2+k3s1 (linux/amd64) kubernetes/698e444,client:127.0.0.1 (started: 2020-05-11 10:05:29.164887063 +0200 CEST m=+112.323517550) (total time: 1m0.016441022s):
May 11 10:06:29 node3 k3s[21207]: Trace[16562761]: [1m0.016441022s] [1m0.016235552s] END
May 11 10:06:29 node3 k3s[21207]: time="2020-05-11T10:06:29.188705508+02:00" level=warning msg="no known leader address=192.168.64.3:6443 attempt=41"
May 11 10:06:29 node3 k3s[21207]: time="2020-05-11T10:06:29.293854107+02:00" level=warning msg="no known leader address=192.168.64.3:6443 attempt=39"
Until the first node is back up, I cannot connect on any of the remaining master nodes:
ubuntu@node2:~$ sudo kubectl get nodes
Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get nodes)
I assume #1770 meant to replace dqlite by the embedded version of etcd - so I don't think any further fixes will be made to dqlite.