k3s startup fails with "starting kubernetes: preparing server: start cluster and https: raft_start(): io: load closed segment 0000000024946269-0000000024946590: found 321 entries (expected 322)"

Created on 9 Feb 2020  ·  2 Comments  ·  Source: k3s-io/k3s

Version: k3s version v1.17.2+k3s1 (cdab19b0)

Description:

The k3s master fails to start. The log shows: "starting kubernetes: preparing server: start cluster and https: raft_start(): io: load closed segment 0000000024946269-0000000024946590: found 321 entries (expected 322)"

This has happened after the machines were forcefully shut down (power loss). There's no info on the web on how to resolve this error or what to do next.
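The numbers in the error message can be checked by hand. The sketch below assumes that the closed segment's file name encodes the first and last Raft log index it holds (this is my reading of the message, not something stated in this thread). Under that assumption the segment should contain 24946590 - 24946269 + 1 = 322 entries, which matches the "expected 322" in the log, so one entry appears to be missing from the file. The function name `check_segment` is made up for this illustration.

```python
import re

# Matches the segment part of the fatal log line quoted in this issue.
SEGMENT_RE = re.compile(
    r"load closed segment (\d+)-(\d+): found (\d+) entries \(expected (\d+)\)"
)


def check_segment(message):
    """Extract the index range and entry counts from the error message.

    Assumption: the two numbers in the segment name are the first and
    last log index stored in that segment.
    """
    match = SEGMENT_RE.search(message)
    if match is None:
        return None
    first, last, found, expected = (int(g) for g in match.groups())
    return {
        "first_index": first,
        "last_index": last,
        "span": last - first + 1,  # entries implied by the file name
        "found": found,
        "expected": expected,
        "missing": expected - found,
    }


if __name__ == "__main__":
    line = (
        "raft_start(): io: load closed segment "
        "0000000024946269-0000000024946590: found 321 entries (expected 322)"
    )
    print(check_segment(line))
```

This only interprets the message; it does not repair the segment.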

To Reproduce:

  • install cluster using Ansible scripts on at least two nodes
  • unplug power (I guess?)

Expected behavior:

  • cluster survives power outages, or gives a clear path to restore it manually

Actual behavior:

  • cluster doesn't start up anymore

Additional context

  • k3s is (was) running on a cluster of TWO machines
  • k3s non-master node seems to start up successfully
  • k3s is installed on almost clean Armbian, on Pine64
  • cluster was working fine before the power loss

uname -a
Linux ariana 5.4.7-sunxi64 #19.11.6 SMP Sat Jan 4 19:40:10 CET 2020 aarch64 GNU/Linux


lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 10 (buster)
Release:    10
Codename:   buster


cat /etc/systemd/system/k3s.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
After=network-online.target
[Service]
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --cluster-init --write-kubeconfig-mode 664
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target

/var/log/syslog:

...
Feb  9 00:00:12 ariana systemd[1]: Starting Lightweight Kubernetes...
Feb  9 00:00:12 ariana systemd[1]: Started Lightweight Kubernetes.
Feb  9 00:00:13 ariana k3s[3961]: time="2020-02-09T00:00:13.429349422Z" level=info msg="Starting k3s v1.17.2+k3s1 (cdab19b0)"
Feb  9 00:00:16 ariana k3s[3961]: time="2020-02-09T00:00:16.592512841Z" level=fatal msg="starting kubernetes: preparing server: start cluster and https: raft_start(): io: load closed segment 0000000024946269-0000000024946590: found 321 entries (expected 322)"
Feb  9 00:00:16 ariana systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Feb  9 00:00:16 ariana systemd[1]: k3s.service: Failed with result 'exit-code'.
Feb  9 00:00:21 ariana systemd[1]: k3s.service: Service RestartSec=5s expired, scheduling restart.
Feb  9 00:00:21 ariana systemd[1]: k3s.service: Scheduled restart job, restart counter is at 5380.
Feb  9 00:00:21 ariana systemd[1]: Stopped Lightweight Kubernetes.
...

All 2 comments

Seeing the same issue. I was purposefully deleting master nodes at various intervals and hit this on reboot after a couple of attempts.

This appears to be the upstream dqlite issue: https://github.com/canonical/dqlite/issues/190

dqlite is still experimental; there does not appear to be a way to recover from this at the moment. If you need more production-ready HA you should probably be using an external DB.

Also, a two-node dqlite cluster won't meet Raft consensus requirements (no quorum if one goes down) so this setup probably won't ever work as expected.
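The quorum point can be made concrete with a little arithmetic. This is a minimal sketch of the standard Raft majority rule (quorum is more than half of the voting members); it is not taken from the k3s or dqlite code, and the function names are made up for illustration.

```python
def quorum(voters):
    """Smallest majority of a Raft cluster with `voters` voting members."""
    return voters // 2 + 1


def tolerated_failures(voters):
    """How many voters can be lost while a majority is still reachable."""
    return voters - quorum(voters)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(
            f"{n} voters: quorum={quorum(n)}, "
            f"tolerated failures={tolerated_failures(n)}"
        )
```

With two voters the quorum is two, so losing either node leaves the cluster without a majority. Three voters is the smallest size that tolerates one failure.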
