Zerotierone: zerotier-one.service 1.2.12 does not exit on shutdown when network is unreachable.

Created on 18 Dec 2018 · 14 Comments · Source: zerotier/ZeroTierOne

Summary

This looks like the same issue as #738: zerotier-one.service does not exit in a timely manner, which blocks shutdown when the network is not connected (e.g. when Wi-Fi is not connected on my laptop).

$ journalctl -b -1 -u zerotier-one.service outputs:

Dec 18 22:40:13 KBUMSIK-LG-Arch zerotier-one[617]: sendto: Network is unreachable
Dec 18 22:45:18 KBUMSIK-LG-Arch zerotier-one[617]: sendto: Network is unreachable
Dec 18 22:50:23 KBUMSIK-LG-Arch zerotier-one[617]: sendto: Network is unreachable
Dec 18 22:55:45 KBUMSIK-LG-Arch systemd[1]: Stopping ZeroTier One...
Dec 18 22:57:15 KBUMSIK-LG-Arch systemd[1]: zerotier-one.service: State 'stop-sigterm' timed out. Killing.
Dec 18 22:57:15 KBUMSIK-LG-Arch systemd[1]: zerotier-one.service: Killing process 617 (zerotier-one) with signal SIGKILL.
Dec 18 22:57:15 KBUMSIK-LG-Arch systemd[1]: zerotier-one.service: Main process exited, code=killed, status=9/KILL
Dec 18 22:57:15 KBUMSIK-LG-Arch systemd[1]: zerotier-one.service: Failed with result 'timeout'.
Dec 18 22:57:15 KBUMSIK-LG-Arch systemd[1]: Stopped ZeroTier One.
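
The timing in the journal output above can be checked with a little arithmetic. The following is a minimal Python sketch, with the timestamps copied from the log; the year 2018 is an assumption taken from the issue date, and the helper name `ts` is made up for illustration. It computes the gap between the repeated "Network is unreachable" messages and the delay between "Stopping ZeroTier One..." and the SIGKILL.

```python
from datetime import datetime

FMT = "%Y %b %d %H:%M:%S"

def ts(stamp, year=2018):
    """Parse a journal timestamp; the year is assumed, the log omits it."""
    return datetime.strptime(f"{year} {stamp}", FMT)

# Timestamps copied from the journalctl output above.
unreachable = [ts("Dec 18 22:40:13"), ts("Dec 18 22:45:18"), ts("Dec 18 22:50:23")]
stopping = ts("Dec 18 22:55:45")
killed = ts("Dec 18 22:57:15")

# Seconds between consecutive "sendto: Network is unreachable" messages.
retry_gaps = [(b - a).total_seconds() for a, b in zip(unreachable, unreachable[1:])]
# Seconds systemd waited after SIGTERM before sending SIGKILL.
stop_delay = (killed - stopping).total_seconds()

print(retry_gaps)  # [305.0, 305.0]
print(stop_delay)  # 90.0
```

The 90-second delay matches systemd's usual default stop timeout, so the service appears to have used the whole timeout without exiting. The 305-second gaps suggest a periodic task that runs roughly every five minutes.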

System information

OS: Arch Linux 64-bit
uname -a: Linux KBUMSIK-LG-Arch 4.19.8-arch1-1-ARCH #1 SMP PREEMPT Sat Dec 8 13:49:11 UTC 2018 x86_64 GNU/Linux
ZeroTier version: 1.2.12 (from the Arch Linux package)
systemd version: systemd-239.303-1

All 14 comments

FWIW also seeing this on FreeBSD.

@obadz Yes. I checked it and that line is already applied in my version.

This is applied in my system as well and I am seeing similar behavior. The following is a strace of the main process with -f flag when I issued a systemctl stop zerotier-one.service command:

The process received the SIGTERM but continued to run.
Systemd issued the SIGKILL after the timeout (I believe).

[pid 14637] <... restart_syscall resumed> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
[pid 14639] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted if no handler)
[pid 14628] <... select resumed> )      = ? ERESTARTNOHAND (To be restarted if no handler)
[pid 14637] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1, si_uid=0} ---
[pid 14637] --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=1, si_uid=0} ---
[pid 14639] select(12, [9 11], [], [], NULL <unfinished ...>
[pid 14637] write(5, "\20", 1 <unfinished ...>
[pid 14628] select(36, [7 8 10 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35], [], [], {tv_sec=0, tv_usec=474915} <unfinished ...>
[pid 14637] <... write resumed> )       = 1
[pid 14637] rt_sigreturn({mask=[]})     = -1 EINTR (Interrupted system call)
[pid 14637] socket(AF_INET, SOCK_DGRAM, IPPROTO_IP) = 36
[pid 14637] fcntl(36, F_GETFL)          = 0x2 (flags O_RDWR)
[pid 14637] fcntl(36, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 14637] openat(AT_FDCWD, "/proc/net/route", O_RDONLY) = 37
[pid 14637] fstat(37, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
[pid 14637] read(37, "Iface\tDestination\tGateway \tFlags"..., 1024) = 640
[pid 14637] close(37)                   = 0
[pid 14637] connect(36, {sa_family=AF_INET, sin_port=htons(5351), sin_addr=inet_addr("192.168.5.1")}, 16) = 0
[pid 14637] sendto(36, "\0\0", 2, 0, NULL, 0) = 2
[pid 14637] gettimeofday({tv_sec=1547931566, tv_usec=750067}, NULL) = 0
[pid 14637] select(1024, [36], NULL, NULL, {tv_sec=0, tv_usec=249866}) = 1 (in [36], left {tv_sec=0, tv_usec=248928})
[pid 14637] recvfrom(36, 0x7f658e742e60, 16, 0, 0x7f658e742e70, [16]) = -1 ECONNREFUSED (Connection refused)
[pid 14637] gettimeofday({tv_sec=1547931566, tv_usec=751390}, NULL) = 0
[pid 14637] close(36)                   = 0
[pid 14637] socket(AF_UNIX, SOCK_STREAM, 0) = 36
[pid 14637] setsockopt(36, SOL_SOCKET, SO_RCVTIMEO, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
[pid 14637] setsockopt(36, SOL_SOCKET, SO_SNDTIMEO, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
[pid 14637] connect(36, {sa_family=AF_UNIX, sun_path="/var/run/minissdpd.sock"}, 110) = -1 ENOENT (No such file or directory)
[pid 14637] close(36)                   = 0
[pid 14637] socket(AF_INET, SOCK_DGRAM, IPPROTO_IP) = 36
[pid 14637] setsockopt(36, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 14637] setsockopt(36, SOL_IP, IP_MULTICAST_TTL, "\2", 1) = 0
[pid 14637] bind(36, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 14637] sendto(36, "M-SEARCH * HTTP/1.1\r\nHOST: 239.2"..., 94, 0, {sa_family=AF_INET, sin_port=htons(1900), sin_addr=inet_addr("239.255.255.250")}, 16) = 94
[pid 14637] poll([{fd=36, events=POLLIN}], 1, 5000 <unfinished ...>
[pid 14628] <... select resumed> )      = 0 (Timeout)
[pid 14637] <... poll resumed> )        = 0 (Timeout)
[pid 14637] close(36)                   = 0
[pid 14637] nanosleep({tv_sec=300, tv_nsec=0},  <unfinished ...>
[pid 14639] <... select resumed> )      = 1 (in [9])
[pid 14639] read(9, "33\0\0\203\204\6\251J8\220\217\206\335`\ftr\1\25\21\1\376\200\0\0\0\0\0\0\4\251"..., 10064) = 331
[pid 14639] gettimeofday({tv_sec=1547931581, tv_usec=333972}, NULL) = 0
[pid 14639] select(12, [9 11], [], [], NULL) = 1 (in [9])
[pid 14639] read(9, "33\0\0\203\204\6\251J8\220\217\206\335`\ftr\1\25\21\1\376\200\0\0\0\0\0\0\4\251"..., 10064) = 331
[pid 14639] gettimeofday({tv_sec=1547931611, tv_usec=334451}, NULL) = 0
[pid 14639] select(12, [9 11], [], [], NULL^F) = 1 (in [9])
[pid 14639] read(9, "33\0\0\203\204\6\251J8\220\217\206\335`\ftr\1\25\21\1\376\200\0\0\0\0\0\0\4\251"..., 10064) = 331
[pid 14639] gettimeofday({tv_sec=1547931641, tv_usec=333917}, NULL) = 0
[pid 14639] select(12, [9 11], [], [], NULL <unfinished ...>) = ?
[pid 14637] <... nanosleep resumed> <unfinished ...>) = ?
[pid 14639] +++ killed by SIGKILL +++
[pid 14637] +++ killed by SIGKILL +++
+++ killed by SIGKILL +++
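
In the trace above, thread 14637 receives SIGTERM and writes one byte to fd 5, but then still runs a discovery round (a probe to port 5351 and an M-SEARCH to port 1900) and enters a 300-second nanosleep, where it stays until the SIGKILL. One common way to avoid this kind of hang is to replace a fixed sleep with a wait that a shutdown flag can interrupt. The following is a minimal Python sketch of that general pattern, not ZeroTier's actual code; `stop_requested`, `worker`, and `run_and_stop` are hypothetical names.

```python
import threading
import time

# Shutdown flag; a real service would set this from its SIGTERM handler.
stop_requested = threading.Event()

def worker(interval_s):
    """Hypothetical periodic task, e.g. refreshing a port mapping."""
    while not stop_requested.is_set():
        # ... one round of periodic work would go here ...
        # Event.wait returns as soon as the flag is set, whereas
        # time.sleep(interval_s) would block for the full interval.
        stop_requested.wait(timeout=interval_s)

def run_and_stop(interval_s=300.0):
    """Start the worker, request shutdown, and report how long it took."""
    thread = threading.Thread(target=worker, args=(interval_s,))
    start = time.monotonic()
    thread.start()
    stop_requested.set()  # stands in for the signal handler
    thread.join(timeout=5.0)
    return time.monotonic() - start, thread.is_alive()

elapsed, still_running = run_and_stop()
print(f"worker stopped after {elapsed:.3f}s, still running: {still_running}")
```

With a plain `time.sleep(300)` in the loop, the join would time out and the thread would still be alive, which is roughly what systemd observes before it falls back to SIGKILL.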

Attempting the fix used in NixOS (https://github.com/NixOS/nixpkgs/pull/49423): removing After and using BindsTo.

BindsTo did not resolve the issue.

On a fresh installation of Ubuntu 18.04.2 LTS, applying the workaround from https://github.com/zerotier/ZeroTierOne/issues/738#issuecomment-387697755 does not solve the problem. It still hangs while shutting down.

Pretty annoying minor bug in my opinion, as it prevents shutdowns. Previous workarounds with the service unit have failed me. Any updates on this?

Personally I have:

/etc/systemd/system/multi-user.target.wants/zerotier-one.service :
[Service]
TimeoutSec=10

ZT1 has 10 seconds to do the deed or it gets the axe. Works well.
It's obviously not an optimal solution by any means, but it beats either waiting 120 seconds or Unlimited seconds, depending on which environment it's running under.
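
For anyone copying the timeout workaround: as far as I know, systemd treats section names as case-sensitive, so the setting only takes effect under `[Service]`, and `TimeoutSec=` sets both the start and the stop timeout (`TimeoutStopSec=` is the stop-only variant). The following is a small Python sketch that checks a unit fragment for this. It uses the standard `configparser` module as a rough stand-in for systemd's own parser, and `stop_timeout` is a hypothetical helper name.

```python
import configparser

def stop_timeout(unit_text):
    """Return the stop timeout found under [Service], or None if absent."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case; the default would lowercase keys
    parser.read_string(unit_text)
    # configparser section names are case-sensitive, like systemd's.
    if not parser.has_section("Service"):
        return None
    section = parser["Service"]
    return section.get("TimeoutStopSec") or section.get("TimeoutSec")

good = stop_timeout("[Service]\nTimeoutSec=10\n")
bad = stop_timeout("[service]\nTimeoutSec=10\n")
print(good)  # 10
print(bad)   # None
```

This only approximates how systemd reads unit files, so it is a sanity check for simple fragments rather than a validator.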

I'll try the time out option. Thanks for the tip @Arffeh.

Why was this bug closed? It seems to be still an issue and the timeout is a dirty trick.

#1128 may be related.

This bug really should be reopened - it's not fixed, only addressed with a cheap hack to make it less noticeable.

I'm still experiencing this issue on Ubuntu 18.04 and ZT-One 1.4.6.

Ubuntu 20.04 - still an issue :/
