K3s: "systemd stop" leaves processes running

Created on 27 Mar 2020 · 2 comments · Source: k3s-io/k3s

Version:

```
k3s version v1.17.4+k3s1 (3eee8ac3)
```

K3s arguments:

```
export INSTALL_K3S_EXEC="--node-external-ip 10.127.0.1"
export INSTALL_K3S_NAME="master-LOCCH"
k3s-install.sh
```

Describe the bug

On both server and agents we've found that stopping the service results in a failed (exit-code) status, and one or more processes belonging to the service remain running, of the form:

```
ps -efly | grep ranch
... containerd-shim-runc-v2 -namespace k8s.io ...
```

We've noticed this whilst trying to resolve the node's external IP address remaining in the PENDING state, and related issues.

**To Reproduce**
1. reboot server
2. stop service

**Additional context / logs**

It is not clear where to find the appropriate logs; containerd.log contains nothing related to the time period when the 'stop' is issued.

```
systemctl status k3s-master-LOCCH
● k3s-master-LOCCH.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-master-LOCCH.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Fri 2020-03-27 11:19:07 GMT; 31min ago
       Docs: https://k3s.io
    Process: 810 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 828 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
    Process: 829 ExecStart=/usr/local/bin/k3s server --node-external-ip 10.127.0.1 (code=exited, status=1/FAILURE)
   Main PID: 829 (code=exited, status=1/FAILURE)
      Tasks: 102
     Memory: 175.1M
     CGroup: /system.slice/k3s-master-LOCCH.service
             ├─1562 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─1617 /pause
             ├─1816 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─1819 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─1889 /pause
             ├─1901 /pause
             ├─1974 /coredns -conf /etc/coredns/Corefile
             ├─1981 /metrics-server
             ├─2472 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─2512 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─2542 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─2546 /pause
             ├─2558 /pause
             ├─2619 /pause
             ├─2672 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -names>
             ├─2699 /traefik --configfile=/config/traefik.toml
             ├─2701 /opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
             ├─2733 /pause
             ├─2783 /bin/sh /usr/bin/entry
             ├─2789 /opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
             ├─2790 /opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
             ├─2791 /opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
             ├─2792 /opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
             ├─2793 /opt/bitnami/apache/bin/httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
             ├─2826 /bin/sh /usr/bin/entry
             ├─2872 /bin/sh -c node srv.js
             └─2888 node srv.js

Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763091751Z" level=info msg="Shutting down /v1, Kind=Endpoints workers"
Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763152451Z" level=info msg="Shutting down /v1, Kind=Pod workers"
Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763201351Z" level=info msg="Shutting down /v1, Kind=Service workers"
Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763247552Z" level=info msg="Shutting down /v1, Kind=Node workers"
Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763294452Z" level=info msg="Shutting down batch/v1, Kind=Job workers"
Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763338752Z" level=info msg="Shutting down helm.cattle.io/v1, Kind=HelmChart workers"
Mar 27 11:19:07 elloe01 k3s[829]: time="2020-03-27T11:19:07.763388352Z" level=fatal msg="controllers exited"
Mar 27 11:19:07 elloe01 systemd[1]: k3s-master-LOCCH.service: Main process exited, code=exited, status=1/FAILURE
```

On the agents:

```
root@innovation00:~# systemctl status k3s-agent-LOCCH
● k3s-agent-LOCCH.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-agent-LOCCH.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Fri 2020-03-27 10:54:39 UTC; 58min ago
       Docs: https://k3s.io
    Process: 27715 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 27717 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
    Process: 27726 ExecStart=/usr/local/bin/k3s agent (code=exited, status=1/FAILURE)
   Main PID: 27726 (code=exited, status=1/FAILURE)
      Tasks: 0
     Memory: 1.7G
     CGroup: /system.slice/k3s-agent-LOCCH.service

Mar 27 10:50:00 innovation00 k3s[27726]: E0327 10:50:00.940429   27726 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to >
Mar 27 10:50:12 innovation00 k3s[27726]: time="2020-03-27T10:50:12.844926945Z" level=error msg="Failed to connect to proxy" error="dial tcp 10.12>
Mar 27 10:50:12 innovation00 k3s[27726]: time="2020-03-27T10:50:12.845011954Z" level=error msg="Remotedialer proxy error" error="dial tcp 10.127.>
Mar 27 10:50:17 innovation00 k3s[27726]: time="2020-03-27T10:50:17.845284699Z" level=info msg="Connecting to proxy" url="wss://10.127.0.1:6443/v1>
Mar 27 10:54:39 innovation00 k3s[27726]: time="2020-03-27T10:54:39.887071777Z" level=fatal msg="context canceled"
Mar 27 10:54:39 innovation00 k3s[27726]: I0327 10:54:39.887132   27726 network_policy_controller.go:172] Shutting down network policies controller
Mar 27 10:54:39 innovation00 systemd[1]: Stopping Lightweight Kubernetes...
Mar 27 10:54:39 innovation00 systemd[1]: k3s-agent-LOCCH.service: Main process exited, code=exited, status=1/FAILURE
Mar 27 10:54:39 innovation00 systemd[1]: k3s-agent-LOCCH.service: Failed with result 'exit-code'.
Mar 27 10:54:39 innovation00 systemd[1]: Stopped Lightweight Kubernetes.
```


All 2 comments

This is the expected behavior: it is needed for (near) zero-downtime upgrades, since the containers keep running while k3s itself restarts. At the moment there is a k3s-killall.sh script which can be used to take down the service and its containers, if that is desired.
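The process-matching step that such a teardown performs can be sketched roughly as follows. This is a simplified illustration run against a canned `ps`-style snapshot (sample data modelled on the status output above), so it is safe to execute anywhere; the real k3s-killall.sh does considerably more (killing by cgroup, unmounting volumes, cleaning up network interfaces):

```shell
# Simplified sketch: find the PIDs of leftover containerd-shim processes.
# The snapshot below is hypothetical sample data, not live ps output.
ps_snapshot='1562 containerd-shim-runc-v2
1617 pause
1974 coredns
2888 node'

# Select lines mentioning containerd-shim and print the PID column.
echo "$ps_snapshot" | awk '/containerd-shim/ {print $1}'
# prints: 1562
```

In practice you would feed it real `ps` output and pass the resulting PIDs to `kill`, which is essentially what the killall script automates.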

For reference, the k3s-killall.sh script is embedded inside install.sh.

The ArchLinux k3s-bin AUR package does not use the installer script, so this will get the script onto your system:

```
curl https://raw.githubusercontent.com/rancher/k3s/3c98290f0be546cdd12668d8f59cee66ca44c0a1/install.sh | awk '449<=NR && NR<=524' > k3s-killall.sh
chmod +x k3s-killall.sh
```
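The `awk '449<=NR && NR<=524'` filter simply selects that range of line numbers (`NR` is awk's current-record counter). A small self-contained demo of the same technique, using a generated file:

```shell
# Extract lines 4 through 6 of a generated 10-line file,
# using the same NR-range filter as the command above.
seq 1 10 > /tmp/nr-demo.txt
awk '4<=NR && NR<=6' /tmp/nr-demo.txt
# prints 4, 5 and 6, one per line
```

Note that the line range 449–524 is tied to that exact install.sh commit; the embedded script moves between revisions, so pinning the commit hash in the URL (as above) is what makes the extraction reliable.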
