After reloading icinga2 via /etc/init.d/icinga2 reload, there is no icinga2 process anymore.
OS: Debian 9.3
Icinga Version:
icinga2 2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
icinga2-bin 2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
icinga2-common 2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 all
icinga2-doc 2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 all
icinga2-ido-mysql 2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
libicinga2 2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
Debug Logfile output:
[2018-02-07 09:11:06 +0100] information/Application: Got reload command: Starting new instance.
[2018-02-07 09:11:06 +0100] notice/Process: Running command '/usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2' '--no-stack-rlimit' 'daemon' '-e' '/var/log/icinga2/error.log' '--reload-internal' '27264': PID 27446
[2018-02-07 09:11:07 +0100] debug/IdoMysqlConnection: Query: COMMIT
[2018-02-07 09:11:07 +0100] debug/IdoMysqlConnection: Query: BEGIN
[2018-02-07 09:11:07 +0100] information/Application: Reload requested, letting new process take over.
After that there is no further output, and no icinga2 process is running.
@Crunsher since you've changed parts of the reload, any ideas?
Reload requested, letting new process take over.
This definitely came with my change. Do you run SELinux? It could block the SIGUSR2 signal, possibly.
@Crunsher SELinux is currently disabled. The issue is still present in the current version, 2.8.1+414.gbb96b7742.2018.02.10+1.stretch-0.
I was unable to reproduce this on CentOS 6 (I don't have a Debian with custom sysvinit lying about):
[root@localhost ~]# /etc/init.d/icinga2 start
Checking configuration: Done
Starting Icinga 2: Done
[root@localhost ~]# /etc/init.d/icinga2 status
Icinga 2 status: Running
[root@localhost ~]# /etc/init.d/icinga2 reload
Validating config files: Done
Reloading Icinga 2: Done
[root@localhost ~]# /etc/init.d/icinga2 status
Icinga 2 status: Running
[root@localhost ~]# ps ax | grep icinga2
2631 ? Ssl 0:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -c /etc/icinga2/icinga2.conf -d -e /var/log/icinga2/error.log --reload-internal 2349
2655 ? S 0:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -c /etc/icinga2/icinga2.conf -d -e /var/log/icinga2/error.log --reload-internal 2349
2768 pts/0 S+ 0:00 grep icinga2
Could you please run the reload with strace? strace /etc/init.d/icinga2 reload
strace /etc/init.d/icinga2 reload
execve("/etc/init.d/icinga2", ["/etc/init.d/icinga2", "reload"], [/* 24 vars */]) = 0
brk(NULL) = 0x563d33984000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fab5c65a000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=65606, ...}) = 0
mmap(NULL, 65606, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fab5c649000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\3\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1689360, ...}) = 0
mmap(NULL, 3795360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fab5c09b000
mprotect(0x7fab5c230000, 2097152, PROT_NONE) = 0
mmap(0x7fab5c430000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x195000) = 0x7fab5c430000
mmap(0x7fab5c436000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fab5c436000
close(3) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fab5c647000
arch_prctl(ARCH_SET_FS, 0x7fab5c647700) = 0
mprotect(0x7fab5c430000, 16384, PROT_READ) = 0
mprotect(0x563d327f9000, 8192, PROT_READ) = 0
mprotect(0x7fab5c65d000, 4096, PROT_READ) = 0
munmap(0x7fab5c649000, 65606) = 0
getpid() = 27306
rt_sigaction(SIGCHLD, {sa_handler=0x563d325efef0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
geteuid() = 0
brk(NULL) = 0x563d33984000
brk(0x563d339a5000) = 0x563d339a5000
getppid() = 27304
stat("/root", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
stat(".", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
open("/etc/init.d/icinga2", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 10
close(3) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x563d325efef0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
read(10, "#! /bin/sh\n### BEGIN INIT INFO\n#"..., 8192) = 6513
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
geteuid() = 0
stat("/usr/sbin/icinga2", {st_mode=S_IFREG|0755, st_size=840, ...}) = 0
faccessat(AT_FDCWD, "/usr/sbin/icinga2", X_OK) = 0
faccessat(AT_FDCWD, "/etc/default/icinga2", R_OK) = 0
open("/etc/default/icinga2", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 11
close(3) = 0
fcntl(11, F_SETFD, FD_CLOEXEC) = 0
read(11, "# default settings for icinga2's"..., 8192) = 92
read(11, "", 8192) = 0
close(11) = 0
open("/lib/init/vars.sh", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 11
close(3) = 0
fcntl(11, F_SETFD, FD_CLOEXEC) = 0
read(11, "#\n# Set rcS vars\n#\n\n# Because /e"..., 8192) = 1212
stat("/etc/default/rcS", {st_mode=S_IFREG|0644, st_size=821, ...}) = 0
open("/etc/default/rcS", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 12
close(3) = 0
fcntl(12, F_SETFD, FD_CLOEXEC) = 0
read(12, "################################"..., 8192) = 821
read(12, "", 8192) = 0
close(12) = 0
faccessat(AT_FDCWD, "/proc/cmdline", R_OK) = 0
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27307
close(4) = 0
read(3, "BOOT_IMAGE=/boot/vmlinuz-4.9.0-5"..., 128) = 95
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27307, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27307
read(11, "", 8192) = 0
close(11) = 0
open("/lib/lsb/init-functions", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 11
close(3) = 0
fcntl(11, F_SETFD, FD_CLOEXEC) = 0
read(11, "# /lib/lsb/init-functions for De"..., 8192) = 8192
read(11, "# On Debian, would output \"Start"..., 8192) = 3318
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27308
close(4) = 0
read(3, "/lib/lsb/init-functions.d/20-lef"..., 128) = 83
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27308, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 83
read(3, "", 128) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27308
faccessat(AT_FDCWD, "/lib/lsb/init-functions.d/20-left-info-blocks", R_OK) = 0
open("/lib/lsb/init-functions.d/20-left-info-blocks", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 12
close(3) = 0
fcntl(12, F_SETFD, FD_CLOEXEC) = 0
read(12, "# Default info blocks put to the"..., 8192) = 1088
read(12, "", 8192) = 0
close(12) = 0
faccessat(AT_FDCWD, "/lib/lsb/init-functions.d/40-systemd", R_OK) = 0
open("/lib/lsb/init-functions.d/40-systemd", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10) = 12
close(3) = 0
fcntl(12, F_SETFD, FD_CLOEXEC) = 0
read(12, "# -*-Shell-script-*-\n# /lib/lsb/"..., 8192) = 2942
stat("/run/systemd/system", {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27309
close(4) = 0
read(3, "loaded\n", 128) = 7
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27309, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27309
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27310
close(4) = 0
read(3, "/etc/init.d/icinga2\n", 128) = 20
read(3, "", 128) = 0
close(3) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27310, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27310
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27311
close(4) = 0
read(3, "yes\n", 128) = 4
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27311, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27311
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27312
close(4) = 0
read(3, "degraded\n", 128) = 9
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27312, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 9
read(3, "", 128) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 27312
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
geteuid() = 0
stat("/usr/bin/tput", {st_mode=S_IFREG|0755, st_size=18456, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/tput", X_OK) = 0
geteuid() = 0
stat("/usr/bin/expr", {st_mode=S_IFREG|0755, st_size=43848, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/expr", X_OK) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10) = 13
close(1) = 0
fcntl(13, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
fcntl(2, F_DUPFD, 10) = 14
close(2) = 0
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
dup2(1, 2) = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27313
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27313
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27313, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27313
dup2(13, 1) = 1
close(13) = 0
dup2(14, 2) = 2
close(14) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10) = 13
close(1) = 0
fcntl(13, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
fcntl(2, F_DUPFD, 10) = 14
close(2) = 0
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
dup2(1, 2) = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27314
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27314
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27314, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27314
dup2(13, 1) = 1
close(13) = 0
dup2(14, 2) = 2
close(14) = 0
write(1, "[....] ", 7[....] ) = 7
write(1, "Reloading icinga2 configuration "..., 64Reloading icinga2 configuration (via systemctl): icinga2.service) = 64
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27315
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27315
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27315, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27315
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
geteuid() = 0
stat("/usr/bin/tput", {st_mode=S_IFREG|0755, st_size=18456, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/tput", X_OK) = 0
geteuid() = 0
stat("/usr/bin/expr", {st_mode=S_IFREG|0755, st_size=43848, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/expr", X_OK) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10) = 13
close(1) = 0
fcntl(13, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
fcntl(2, F_DUPFD, 10) = 14
close(2) = 0
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
dup2(1, 2) = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27381
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27381
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27381, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27381
dup2(13, 1) = 1
close(13) = 0
dup2(14, 2) = 2
close(14) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10) = 13
close(1) = 0
fcntl(13, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
fcntl(2, F_DUPFD, 10) = 14
close(2) = 0
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
dup2(1, 2) = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27382
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27382
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27382, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27382
dup2(13, 1) = 1
close(13) = 0
dup2(14, 2) = 2
close(14) = 0
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27383
close(4) = 0
read(3, "\33[31m", 128) = 5
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27383, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27383
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27384
close(4) = 0
read(3, "\33[32m", 128) = 5
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27384, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27384
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27385
close(4) = 0
read(3, "\33[33m", 128) = 5
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27385, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27385
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27386
close(4) = 0
read(3, "\33[39;49m", 128) = 8
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27386, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 94820858675199
read(3, "", 128) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27386
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27387
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27387
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27387, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27387
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27388
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27388
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27388, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27388
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27389
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27389
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27389, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27389
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27390
wait4(-1, [ ok [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27390
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27390, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27390
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27391
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27391
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27391, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27391
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27392
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27392
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27392, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27392
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
geteuid() = 0
stat("/usr/bin/tput", {st_mode=S_IFREG|0755, st_size=18456, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/tput", X_OK) = 0
geteuid() = 0
stat("/usr/bin/expr", {st_mode=S_IFREG|0755, st_size=43848, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/expr", X_OK) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10) = 13
close(1) = 0
fcntl(13, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
fcntl(2, F_DUPFD, 10) = 14
close(2) = 0
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
dup2(1, 2) = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27393
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27393
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27393, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27393
dup2(13, 1) = 1
close(13) = 0
dup2(14, 2) = 2
close(14) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10) = 13
close(1) = 0
fcntl(13, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 1) = 1
close(3) = 0
fcntl(2, F_DUPFD, 10) = 14
close(2) = 0
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
dup2(1, 2) = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27395
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27395
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27395, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 27395
dup2(13, 1) = 1
close(13) = 0
dup2(14, 2) = 2
close(14) = 0
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27396
close(4) = 0
read(3, "\33[31m", 128) = 5
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27396, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27396
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27397
close(4) = 0
read(3, "\33[33m", 128) = 5
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27397, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27397
pipe([3, 4]) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27398
close(4) = 0
read(3, "\33[39;49m", 128) = 8
read(3, "", 128) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27398, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]}) = 0
close(3) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27398
write(1, ".\n", 2.
) = 2
close(12) = 0
close(11) = 0
exit_group(0) = ?
+++ exited with 0 +++
We figured it out: You are using an old init script!
Why? Because we package our init scripts on Debian separately and we forgot to update them 💃
This is not an old script, it's the init script for Debian based systems. We need to figure out the issue and fix it in https://github.com/Icinga/deb-icinga2/issues/2
Edit: Please discuss there
Sadly @lazyfrosch is right, this is not related to our changes. But since this is directly related to the Debian init scripts, I'm closing this in favor of https://github.com/Icinga/deb-icinga2/issues/4
Let's come back to the original problem, since I can't reproduce it with Debian stretch in sysV init mode.
I've installed a fresh Debian stretch, rebooted it with sysV init, and configured Icinga 2 + IDO MySQL:
root@debian:~# icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.1-430-gc7ae986d9)
Copyright (c) 2012-2018 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Application information:
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
System information:
Platform: Debian GNU/Linux
Platform version: 9 (stretch)
Kernel: Linux
Kernel version: 4.9.0-4-amd64
Architecture: x86_64
Build information:
Compiler: GNU 6.3.0
Build host: 4451229ca030
root@debian:~# cat /etc/debian_version
9.3
root@debian:~# ps -ef | head -n5
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 16:05 ? 00:00:00 init [2]
root 2 0 0 16:05 ? 00:00:00 [kthreadd]
root 3 2 0 16:05 ? 00:00:00 [ksoftirqd/0]
root 5 2 0 16:05 ? 00:00:00 [kworker/0:0H]
root@debian:~# dpkg -S /sbin/init
sysvinit-core: /sbin/init
Now let's see how starting and reloading behave:
root@debian:~# /etc/init.d/icinga2 start
[ ok ] checking Icinga2 configuration.
[....] Starting icinga2 monitoring daemon: icinga2[2018-02-19 16:52:33 +0100] information/cli: Icinga application loader (version: v2.8.1-430-gc7ae986d9)
[2018-02-19 16:52:33 +0100] information/cli: Loading configuration file(s).
[2018-02-19 16:52:33 +0100] information/ConfigItem: Committing config item(s).
[2018-02-19 16:52:33 +0100] information/ApiListener: My API identity: debian
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 12 Services.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 3 ServiceGroups.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 ScheduledDowntime.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 2 HostGroups.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 Downtime.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 2 NotificationCommands.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 13 Notifications.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 ApiUser.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 Host.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 3 Zones.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 Endpoint.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 UserGroup.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 212 CheckCommands.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 User.
[2018-02-19 16:52:33 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2018-02-19 16:52:33 +0100] information/ConfigObject: Restoring program state from file '/var/lib/icinga2/icinga2.state'
[2018-02-19 16:52:33 +0100] information/ConfigObject: Restored 264 objects. Loaded 0 new objects without state.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Triggering Start signal for config items
[2018-02-19 16:52:33 +0100] information/NotificationComponent: 'notification' started.
[2018-02-19 16:52:33 +0100] information/ApiListener: 'api' started.
[2018-02-19 16:52:33 +0100] information/ApiListener: Adding new listener on port '5665'
[2018-02-19 16:52:33 +0100] information/CheckerComponent: 'checker' started.
[2018-02-19 16:52:33 +0100] information/DbConnection: 'ido-mysql' started.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Activated all objects.
[.ok
root@debian:~# ps -ef | grep icinga2
nagios 14901 1 0 16:52 pts/0 00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err
nagios 14904 1 0 16:52 ? 00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err
root 14932 7167 0 16:52 pts/0 00:00:00 grep icinga2
root@debian:~# /etc/init.d/icinga2 reload
[ ok ] checking Icinga2 configuration.
[ ok ] icinga2 is running.
[ ok ] Reloading icinga2 monitoring daemon: icinga2.
root@debian:~# ps -ef | grep icinga2
nagios 15034 1 5 16:52 ? 00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err --reload-internal 14904
nagios 15050 15034 0 16:52 ? 00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err --reload-internal 14904
root 15059 7167 0 16:52 pts/0 00:00:00 grep icinga2
root@debian:~# /etc/init.d/icinga2 status
[ ok ] icinga2 is running.
@Crunsher Were you able to reproduce the error?
@tmatthaeus what might be different in my setup compared to yours?
@tmatthaeus have you had this problem on any other Stretch/sysV system? I can't reproduce on a fresh install.
Okay, so it doesn't look like the OP is using sysV init. On closer inspection of the strace, it is systemd.
So the initscript doesn't matter at all...
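For reference, the strace shows the LSB init functions calling stat on /run/systemd/system and then printing "(via systemctl)", which is what gives systemd away. A minimal Python sketch of the same check, assuming the usual convention that this directory only exists when systemd is the running init. The function name and the `root` parameter are hypothetical and exist only so the check can be tried against a scratch directory:

```python
import os


def booted_with_systemd(root="/"):
    """Return True if <root>/run/systemd/system is a directory.

    Assumption: the directory is present only when systemd is the running
    init system, so its absence suggests sysV init or something else.
    """
    return os.path.isdir(os.path.join(root, "run", "systemd", "system"))


if __name__ == "__main__":
    print("systemd" if booted_with_systemd() else "not systemd (e.g. sysV init)")
```

If this reports systemd, `/etc/init.d/icinga2 reload` is most likely redirected to `systemctl`, as seen in the trace.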
To my surprise, icinga2 really does die during a reload with systemd:
root@debian:~# systemctl status icinga2.service
● icinga2.service - Icinga host/service/network monitoring system
Loaded: loaded (/lib/systemd/system/icinga2.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/icinga2.service.d
└─limits.conf
Active: inactive (dead) since Mon 2018-02-26 14:14:43 CET; 794ms ago
Process: 1335 ExecReload=/usr/lib/icinga2/safe-reload /usr/lib/icinga2/icinga2 (code=exited, status=0/SUCCESS)
Process: 1288 ExecStart=/usr/sbin/icinga2 daemon -e ${ICINGA2_ERROR_LOG} (code=exited, status=0/SUCCESS)
Process: 1201 ExecStartPre=/usr/lib/icinga2/prepare-dirs /usr/lib/icinga2/icinga2 (code=exited, status=0/SUCCESS)
Main PID: 1288 (code=exited, status=0/SUCCESS)
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:11 +0100] information/IdoMysqlConnection: 'ido-mysql' resumed.
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:11 +0100] information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.14.3')
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:11 +0100] information/IdoMysqlConnection: Finished reconnecting to MySQL IDO database in 0.0316799 second(s).
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:21 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate: 0.25/s (15/min 15/5min 15/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:21 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:21 +0100] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 0.35/s (21/min 21/5min 21/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:31 +0100] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 0.8/s (48/min 48/5min 48/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:41 +0100] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 1.25/s (75/min 75/5min 75/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:43 +0100] information/Application: Got reload command: Starting new instance.
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:43 +0100] information/Application: Reload requested, letting new process take over.
This currently seems to be a problem only with snapshots.
Yes, the systemd unit file for Debian/Ubuntu differs, but I'm not sure why systemd is losing the daemon here...
I will switch the systemd unit file for Debian and Ubuntu to the one included with Icinga 2.
Meanwhile we should make sure that the daemon updates the PID file correctly:
root@debian:~# systemctl restart icinga2.service
root@debian:~# ls -al /run/icinga2/icinga2.pid
-rw-rw---- 1 nagios nagios 5 Feb 26 14:54 /run/icinga2/icinga2.pid
root@debian:~# systemctl reload icinga2.service
root@debian:~# ls -al /run/icinga2/icinga2.pid
ls: cannot access '/run/icinga2/icinga2.pid': No such file or directory
@Crunsher Could you have a look at this?
Note: This affects current master starting with c418a9611e, and therefore probably 2.8.2 as well.
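The manual check of the PID file above could be automated along these lines. This is only a sketch, assuming a PID file that holds a single decimal PID; `pid_file_status` is a hypothetical helper, and the default path is the "PID path" reported by `icinga2 --version` earlier in this thread:

```python
import os


def pid_file_status(path):
    """Classify a PID file as 'missing', 'invalid', 'stale' or 'running'."""
    try:
        with open(path) as handle:
            text = handle.read().strip()
    except FileNotFoundError:
        return "missing"
    try:
        pid = int(text)
    except ValueError:
        return "invalid"
    if pid <= 0:
        return "invalid"
    try:
        # Signal 0 performs the existence and permission checks only;
        # nothing is actually delivered to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return "stale"
    except PermissionError:
        # The process exists but belongs to another user.
        return "running"
    return "running"


if __name__ == "__main__":
    print(pid_file_status("/run/icinga2/icinga2.pid"))
```

Running it once after `systemctl restart` and once after `systemctl reload` should make the difference visible: in the failing case the second run would report the file as missing.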
It doesn't happen in my custom debug build, so there is likely a permission problem or something similar. I'll make sure this gets looked at before 2.8.2 is released.
jflach@jfws ~/git/icinga2/build$ sudo systemctl start icinga2 ✭ master
jflach@jfws ~/git/icinga2/build$ sudo systemctl reload icinga2 ✭ master
jflach@jfws ~/git/icinga2/build$ sudo systemctl status icinga2 ✭ master
● icinga2.service - Icinga host/service/network monitoring system
Loaded: loaded (/usr/lib/systemd/system/icinga2.service; disabled; vendor preset: enabled)
Active: active (running) since Tue 2018-02-27 10:38:09 CET; 12s ago
Process: 21546 ExecReload=/home/jflach/i2/lib/icinga2/safe-reload /home/jflach/i2/etc/sysconfig/icinga2 (code=exited, status=0/SUCCESS)
Process: 21396 ExecStartPre=/home/jflach/i2/lib/icinga2/prepare-dirs /home/jflach/i2/etc/sysconfig/icinga2 (code=exited, status=0/SUCCESS)
Main PID: 21615 (icinga2)
Tasks: 16 (limit: 4915)
CGroup: /system.slice/icinga2.service
├─21615 /home/jflach/i2/lib/icinga2/sbin/icinga2 --no-stack-rlimit daemon -e $ICINGA2_LOG_DIR/error.log --reload-internal 21400
└─21644 /home/jflach/i2/lib/icinga2/sbin/icinga2 --no-stack-rlimit daemon -e $ICINGA2_LOG_DIR/error.log --reload-internal 21400
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:09 +0100] information/ConfigItem: Activated all objects.
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:12 +0100] critical/TcpSocket: Invalid socket: No route to host
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:12 +0100] critical/ApiListener: Cannot connect to host '192.168.225.200' on port '5665'
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:12 +0100] information/ApiListener: Finished reconnecting to endpoint 'WingdingsII' via host '192.168.225.200' and port '5665'
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:16 +0100] information/Application: Got reload command: Starting new instance.
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:16 +0100] information/Application: Reload requested, letting new process take over.
Feb 27 10:38:16 jfws systemd[1]: icinga2.service: Supervising process 21615 which is not our child. We'll most likely not notice when it exits.
Feb 27 10:38:17 jfws systemd[1]: Reloaded Icinga host/service/network monitoring system.
Feb 27 10:38:20 jfws icinga2[21615]: Invalid socket: No route to host
Feb 27 10:38:20 jfws icinga2[21615]: Cannot connect to host '192.168.225.200' on port '5665'
jflach@jfws ~/git/icinga2/build$ stat ~/i2/var/run/icinga2/icinga2.pid ✭ master
File: /home/jflach/i2/var/run/icinga2/icinga2.pid
Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: fe01h/65025d Inode: 6823849 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ jflach) Gid: ( 1000/ jflach)
Access: 2018-02-27 10:38:16.756144593 +0100
Modify: 2018-02-27 10:38:17.008146410 +0100
Change: 2018-02-27 10:38:17.008146410 +0100
Birth: -
jflach@jfws ~/git/icinga2/build$ icinga2 --version ✭ master
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.1-488-g98bcca5e1; debug)
Update: I thought this was a 2.8.2 issue but it's not :woman_shrugging:
I think the core problem with this issue is that we no longer update the PID file on SIGUSR2 takeover.
Can we discuss this tomorrow in person? I'd call Stop() inside SigUsr2 so the PID file is updated before exiting the prior daemon. (Current takeover for systemd and sysV)
I am not able to reproduce this:
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 start
Checking configuration: Done
Starting Icinga 2: Done
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 status
Icinga 2 status: Running
jflach@jfws ~/git/icinga2/build$ cat ~/i2/var/run/icinga2/icinga2.pid
25409
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 reload
Validating config files: Done
Reloading Icinga 2: Done
jflach@jfws ~/git/icinga2/build$ cat ~/i2/var/run/icinga2/icinga2.pid
25771
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 status
Icinga 2 status: Running
jflach@jfws ~/git/icinga2/build$ icinga2 version
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.1-526-g9b0fccfd8; debug)
We can easily replace our call to Exit with a call to Stop, though I'd like a way to test this first.
The user is using systemd! systemd expects the PID file to be updated before the old daemon exits.
We seem to no longer do that. So the "old" systemd unit fails.
But we should make sure this still works, despite updating the unit file...
With which, as I mentioned earlier, I'm unable to reproduce this either.
After talking with @gunnarbeutner this seems to be timing related. I'll investigate further and see if Stop() is enough or if we need to take additional care to make sure the new PID file is written before we kill the old process.
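The ordering constraint discussed here (the new PID has to be on disk before the old daemon exits) can be sketched as follows. This is an illustration in Python only, not Icinga 2's C++ implementation. `write_pid_file`, `read_pid_file` and `wait_for_new_pid` are hypothetical helper names, and the timeout and polling interval are arbitrary:

```python
import os
import time


def write_pid_file(path, pid):
    """Write the PID to a temporary file, then rename it into place.

    Readers see either the old content or the new content, never a
    partially written or missing file.
    """
    tmp_path = f"{path}.tmp.{os.getpid()}"
    with open(tmp_path, "w") as handle:
        handle.write(f"{pid}\n")
        handle.flush()
        os.fsync(handle.fileno())
    # Atomic on POSIX when both paths are on the same filesystem.
    os.replace(tmp_path, path)


def read_pid_file(path):
    """Return the PID stored in the file, or None if missing or unparsable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def wait_for_new_pid(path, old_pid, timeout=5.0, interval=0.05):
    """Poll until the PID file names a process other than old_pid.

    Returns the new PID, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        pid = read_pid_file(path)
        if pid is not None and pid != old_pid:
            return pid
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

In a reload along these lines, the new process would call `write_pid_file(path, os.getpid())` once it is ready, and the old process would exit only after `wait_for_new_pid(path, os.getpid())` returns a PID. Whether calling Stop() alone gives the same guarantee is exactly the open question above.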