Icinga2: icinga 2.11 exits without errors on FreeBSD 11.3

Created on 25 Sep 2019  ·  110 Comments  ·  Source: Icinga/icinga2

Describe the bug

After upgrading from icinga 2.10.5 to 2.11 on FreeBSD 11.3-p3, icinga2 daemon -C reports that the configuration is valid, but the daemon starts and immediately exits when the api feature is enabled. It works with the api feature disabled.

After re-running the api setup I got it working, but it crashed when I tried to send a notification.

output from running truss icinga2 daemon -x debug before 'api setup'

[2019-09-25 08:50:29 +0200] information/DbConnection: 'ido-mysql' started.
[2019-09-25 08:50:29 +0200] information/ExternalCommandListener: 'command' started.
[2019-09-25 08:50:29 +0200] warning/ExternalCommandListener: This feature is DEPRECATED and will be removed in future releases. Check the roadmap at https://github.com/Icinga/icinga2/milestones
Context:
        (0) Activating object 'command' of type 'ExternalCommandListener'

[2019-09-25 08:50:29 +0200] information/NotificationComponent: 'notification' started.
[2019-09-25 08:50:29 +0200] information/CheckerComponent: 'checker' started.
[2019-09-25 08:50:29 +0200] information/ConfigItem: Activated all objects.
[2019-09-25 08:50:29 +0200] notice/WorkQueue: Stopped WorkQueue threads for 'DaemonCommand::Run'
nanosleep({ 0.200000000 })                       = 0 (0x0)
wait4(60659,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })                       ERR#4 'Interrupted system call'
SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=60659 uid=183 status=11
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)         = 0 (0x0)
sigreturn(0x7fffffffca80)                        ERR#4 'Interrupted system call'
wait4(60659,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 60659 (0xecf3)
[2019-09-25 08:50:29 +0200] notice/cli: Seemless worker (PID 60659) stopped, stopping as well
write(1,"[2019-09-25 08:50:29 +0200] \^[["...,103) = 103 (0x67)
unlink("/var/run/icinga2/icinga2.pid")           = 0 (0x0)
close(11)                                        = 0 (0x0)
_umtx_op(0x8010b2038,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2098,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20e0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20f8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2110,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2140,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2068,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2128,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8054182b8,UMTX_OP_NWAKE_PRIVATE,0x18,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2230,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2170,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2200,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2188,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20c8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21a0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21e8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2008,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21d0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2080,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2218,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2050,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20b0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2020,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21b8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
…
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c0000,4096)                         = 0 (0x0)                                                             
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102033 exited>                                                                                                 
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010be000,4096)                         = 0 (0x0)                                                             
<thread 102025 exited>                                                                                                 
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010bf000,4096)                         = 0 (0x0)                                                             
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102032 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c3000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102096 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c5000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102213 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c9000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102241 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c8000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102240 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010ca000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x805483e00,UMTX_OP_WAIT,0x18f63,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102243 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b9000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 101153 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b4000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 100701 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010bd000,4096)                         = 0 (0x0)
munmap(0x8010bc000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102024 exited>
munmap(0x8010bb000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 101278 exited>
<thread 101274 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b8000,4096)                         = 0 (0x0)
<thread 101144 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c7000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c6000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102239 exited>
_umtx_op(0x805484d00,UMTX_OP_WAIT,0x18f5f,0x0,0x0) = 0 (0x0)
munmap(0x8010ba000,4096)                         = 0 (0x0)
<thread 102236 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 101163 exited>
munmap(0x8010b7000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b3000,4096)                         = 0 (0x0)
<thread 100918 exited>
<thread 100489 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c1000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b6000,4096)                         = 0 (0x0)
<thread 102035 exited>
<thread 100805 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c2000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b5000,4096)                         = 0 (0x0)
<thread 102036 exited>
<thread 100799 exited>
munmap(0x8010c4000,4096)                         = 0 (0x0)
<thread 102158 exited>
_umtx_op(0x805485c00,UMTX_OP_WAIT,0x18f0e,0x0,0x0) = 0 (0x0)
exit(0x8b)
process exit, rval = 139
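
A note on the final status: exit(0x8b) is 139, which matches the common "128 + signal number" convention for reporting a fatal signal, here signal 11 (SIGSEGV) as reported by wait4 for the worker process. A minimal Python sketch of that arithmetic follows; the helper name describe_exit_status is hypothetical and not part of Icinga 2.

```python
import signal


def describe_exit_status(rval: int) -> str:
    """Interpret an exit status using the 128 + signal-number convention."""
    if rval > 128:
        signum = rval - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name} ({signum})"
    return f"exited normally with status {rval}"


# 0x8b is the value passed to exit() at the end of the truss output.
print(describe_exit_status(0x8B))
```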

crash

  Application version: r2.11.0-1

System information:
  Platform: Unknown
  Platform version: Unknown
  Kernel: FreeBSD
  Kernel version: 11.3-RELEASE-p3
  Architecture: amd64

Build information:
  Compiler: Clang 8.0.0
  Build host: ic-11_3-RELEASE-HEAD-job-03

Application information:

General paths:
  Config directory: /usr/local/etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /var/run/icinga2

Old paths (deprecated):
  Installation root: /usr/local
  Sysconf directory: /usr/local/etc
  Run directory (base): /var/run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/local/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /var/run/icinga2/icinga2.pid

Error: Function call 'send' failed with error code 32, 'Broken pipe'


***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***
quit: No such file or directory.
ptrace: Operation not permitted.
//65707: No such file or directory.
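
For reference, error code 32 in the crash report is EPIPE on FreeBSD and Linux: a send on a connection whose other end has already gone away. The report does not say which connection was affected. A small sketch for looking up such codes, where the function name explain_errno is only illustrative:

```python
import errno
import os


def explain_errno(code: int) -> tuple[str, str]:
    """Return (symbolic name, message) for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 32 is the error code from the "Function call 'send' failed" line.
print(explain_errno(32))
```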

Your Environment

Include as many relevant details about the environment you experienced the problem in

  • Version used (icinga2 --version): r2.11.0-1
  • Operating System and version: FreeBSD 11.3-p3
  • Enabled features (icinga2 feature list): api checker command ido-mysql mainlog notification
  • Icinga Web 2 version and modules (System - About):

| Item | Value |
| -- | -- |
| Version | 2.7.1 |
| Git Commit | f98f988aff19fd797531e4a0555e872ae3155142 |
| PHP Version | 7.2.22 |
| cube | 1.0.1 |
| doc | 2.7.1 |
| iframe | 0.0.0 |
| ipl | v0.1.1 |
| monitoring | 2.7.1 |
| reactbundle | v0.4.1 |
| setup | 2.7.1 |
| unicorn | 1.0.2 |
| x509 | 1.0.0 |

  • Config validation (icinga2 daemon -C):
[2019-09-25 12:26:53 +0200] information/cli: Icinga application loader (version: r2.11.0-1)
[2019-09-25 12:26:53 +0200] information/cli: Loading configuration file(s).
[2019-09-25 12:26:54 +0200] information/ConfigItem: Committing config item(s).
[2019-09-25 12:26:54 +0200] information/ApiListener: My API identity: ic.ops.eusc.inter.net
[2019-09-25 12:26:59 +0200] warning/ApplyRule: Apply rule '' (in /usr/local/etc/icinga2/zones.d/icinga2-global-templates/generic_services_icagent/raid_win.conf: 1:0-1:59) for type 'Service' does not match anywhere!
[2019-09-25 12:26:59 +0200] warning/ApplyRule: Apply rule '' (in /usr/local/etc/icinga2/zones.d/icinga2-global-templates/generic_services_icagent/replica_win.conf: 1:0-1:65) for type 'Service' does not match anywhere!
[2019-09-25 12:26:59 +0200] warning/ApplyRule: Apply rule 'cibix_tcp' (in /usr/local/etc/icinga2/zones.d/icinga2-master-cluster/platform/commands/cibix_tcp.conf: 1:0-1:24) for type 'Service' does not match anywhere!
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 84 Comments.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 3138 Dependencies.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 362 Downtimes.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 19687 Notifications.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 207 ScheduledDowntimes.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 5098 Services.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 3 ServiceGroups.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 5 TimePeriods.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 13 Users.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 54 UserGroups.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 232 CheckCommands.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 33 HostGroups.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 659 Hosts.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 3 ApiUsers.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 10 Endpoints.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 10 Zones.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 EventCommand.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2019-09-25 12:26:59 +0200] information/ConfigItem: Instantiated 6 NotificationCommands.
[2019-09-25 12:26:59 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2019-09-25 12:26:59 +0200] information/cli: Finished validating the configuration file(s).

additional context

I opened a thread on the community discourse, where I may have written more:
https://community.icinga.com/t/problems-with-upgrading-icinga-2-10-5-to-2-11-on-freebsd/2325

needs-feedback

Most helpful comment

This should be fixed now when all nodes run 2.11.3. I just updated the FreeBSD port to 2.11.3, so please test it. :)

All 110 comments

@bsdlme can you confirm that behaviour please? I'm not sure how FreeBSD handles the umbrella process and reloads here. Or maybe it is a problem with boost asio & context on BSD specifically.

@nielsk: does dmesg(1) show a SIGBUS error for the icinga2 process?

I have the same problem; I described it a bit more in FreeBSD #240812. After running dmesg I discovered that icinga2 was dying with a SIGBUS.

Is there a difference if you omit -d during that run?

@nielsk is running FreeBSD 11.3 / amd64 and @mat813 11.2 / i386, both with the API feature enabled.

I was successfully running 2.11.0 on FreeBSD 12.0 / amd64 with API feature.

So the problematic case seems to be API on 11.x

Nope, SIGSEGV in my case. At least I see a lot of signal 11 entries in my dmesg, which should be from my attempts to get it working.

On 25. Sep 2019, at 14:20, Lars E wrote:

@nielsk: does dmesg(1) show a SIGBUS error for the icinga2 process?

@bsdlme which Boost versions are provided with 11.2 and 11.3?

1.71 is in ports

I'm not a FreeBSD user, what else differs between 11.3 and 12 in terms of compiler versions, cmake, build flags, openssl versions, etc. in specific regard to Icinga dependencies?

FreeBSD 11.2 (@mat813): clang 6.0.0, OpenSSL 1.0.2o
FreeBSD 11.3 (@nielsk): clang 8.0.0, OpenSSL 1.0.2s
FreeBSD 12.0 (@bsdlme): clang 6.0.1, OpenSSL 1.1.1a

11.3 was released after 12.0, which is why it has a newer clang version.

CMake is not in base but installed from ports. Ports have the same version for all FreeBSD versions. The latest CMake in ports is cmake-3.15.3, probably used by all of us.

CFLAGS are:

-DBOOST_COROUTINES_NO_DEPRECATION_WARNING -DBOOST_FILESYSTEM_NO_DEPRECATED -Ithird-party/nlohmann_json -Ithird-party/utf8cpp/source -I. -Ilib -O2 -pipe  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing -Qunused-arguments -fcolor-diagnostics -pthread -Winvalid-pch -O2 -pipe  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing -MD -MT 

Thanks. As far as I can see, 11.x is still supported. https://www.freebsd.org/security/#sup

@bsdlme How difficult is it for you to spin up 11.3 and test this?

I can create a VM at Azure with 11.3. If you like I can give you the login credentials, so you can play around yourself.

I would, but unfortunately I have no time atm. I'm mainly interested in whether you can reproduce this yourself and do the backtrace dance. I don't remember whether FreeBSD has gdb or lldb though.

I have now set up an 11.3 amd64 VM and installed 2.11 using packages.
I needed to change permissions on /usr/local/etc/icinga2 so that the icinga group has write permissions to it (this has changed in 2.11). After that I set a ticket salt, ran "icinga2 api setup" and "icinga2 feature enable api" and could start Icinga using the rc script. It does not crash for me and I am able to use curl to connect to the API port.

So, what can I do to debug this further?
I had given the directory write permissions as well, because otherwise it wouldn't start in the first place.
I just tried the upgrade again, chowned everything in /usr/local/etc/icinga2 to icinga, and got a signal 11.
According to truss, this happens right after the icinga satellites and an agent start connecting to icinga 2.11:

[2019-09-30 21:30:01 +0200] information/ApiListener: Started new listener on '[::]:5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat1.fqdn' via host 'sat1.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat2.fqdn' via host 'sat2.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat3.fqdn' via host 'sat3.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat4.fqdn' via host 'sat4.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat5.fqdn' via host 'sat5.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat6.fqdn' via host 'sat6.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'icinga-agent1' via host 'icinga-agent1' and port '5665'
nanosleep({ 0.200000000 })                       ERR#4 'Interrupted system call'
SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=38116 uid=183 status=11
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)         = 0 (0x0)
sigreturn(0x7fffffffcac0)                        ERR#4 'Interrupted system call'
wait4(38116,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 38116 (0x94e4)
unlink("/var/run/icinga2/icinga2.pid")           = 0 (0x0)
close(11)                                        = 0 (0x0)
_umtx_op(0x8010b2020,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2098,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2170,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2140,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20f8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21d0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2128,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2110,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21e8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8054182b8,UMTX_OP_NWAKE_PRIVATE,0x18,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2080,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)

I have now tried switching icinga2 back to the FreeBSD pkg repo instead of our own, and it still crashes with the same output as above.
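For anyone trying to read long truss logs like the one above, the interesting part is the `wait4(...)` line that returns the child's PID, because it names the signal that killed the worker. Below is a minimal sketch for pulling those lines out of a saved log. The regular expression is modelled only on the truss lines quoted in this thread and is an assumption, not a documented truss format; `find_child_deaths` is a hypothetical helper name.

```python
import re

# Pattern modelled on the truss lines quoted in this thread (an assumption,
# not a documented truss output format).
WAIT4_RE = re.compile(
    r"wait4\((?P<pid>\d+),\{ SIGNALED,sig=(?P<sig>\w+)(?P<cored>,cored)? \},"
    r"WNOHANG,0x0\)\s*=\s*(?P<ret>\d+)"
)


def find_child_deaths(lines):
    """Return one dict per child that wait4() actually reaped after a signal."""
    deaths = []
    for line in lines:
        match = WAIT4_RE.search(line)
        # wait4(..., WNOHANG) returning 0 means the child is still running;
        # the status shown on such lines is not meaningful, so skip them.
        if match and int(match.group("ret")) != 0:
            deaths.append({
                "pid": int(match.group("pid")),
                "signal": match.group("sig"),
                "core_dumped": match.group("cored") is not None,
            })
    return deaths
```

Feeding it the lines of a truss log (for example via `open(path).read().splitlines()`) should give a short list of crashed PIDs and signals instead of pages of `_umtx_op` calls.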

You could start from the sample config, add your own config piece by piece, and see when it starts to crash.

It would be easier to set up a new server with an officially supported linux-distribution and migrate the config than doing this. I have to speak with my team about it.

Or upgrade to 12.0-RELEASE.

How? I am using 11.3. You cannot upgrade from 11.3 to 12.0 because of the new zfs-features in 11.3.

I don't have much time atm. One thought is how the Boost libraries are compiled on your system. There could be specific hardening compiler flags that cause trouble here, or specific stack guard patches that conflict with the way Boost Coroutine and Context work. See my analysis for the Nessus scan crashes in #7431.

Since it always crashes on TLS connection start, this would be the place where I'd start debugging. Maybe also the OpenSSL version/linkage on FreeBSD causes trouble here.

How? I am using 11.3. You cannot upgrade from 11.3 to 12.0 because of the new zfs-features in 11.3.

Oh, I see. Then you could upgrade to 12.1-BETA2 or wait for 12.1-RC1, which will be released on Oct 11.
Or install 12.0-RELEASE and migrate the data.
Unfortunately I don't know any C++, so I can't debug this any further...

Chiming in with the same problem on FreeBSD 11.3 here. I just upgraded to 2.11.0 from 2.10.5, and now I'm also getting this SIGSEGV (11). I get the same thing from truss:

[2019-10-13 16:47:09 -0700] notice/JsonRpcConnection: Received 'log::SetLogPosition' message from identity 'teraraid.dream-tech.com'.
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           ERR#4 'Interrupted system call'
SIGNAL 20 (SIGCHLD) code=CLD_DUMPED pid=62587 uid=183 status=11
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)     = 0 (0x0)
sigreturn(0x7fffffffc6c0)            ERR#4 'Interrupted system call'
wait4(62587,{ SIGNALED,sig=SIGSEGV,cored },WNOHANG,0x0) = 62587 (0xf47b)
[2019-10-13 16:47:12 -0700] notice/cli: Seemless worker (PID 62587) stopped, stopping as well
write(1,"[2019-10-13 16:47:12 -0700] \^[["...,103) = 103 (0x67)
unlink("/var/run/icinga2/icinga2.pid")       = 0 (0x0)
close(11)                    = 0 (0x0)

I'm using clang 8.0.0 but LibreSSL 2.9.2. This might at least point away from SSL being the culprit. I can confirm that this crash is only related to being a master, since one of my satellites is running the exact same build but hasn't crashed yet.

@bsdlme - did you have any satellites connected to your test? I suspect that might be necessary so you can see the crash

LibreSSL is something we don't support as the syscalls/APIs may behave differently. We only test OpenSSL. Is this a thing on FreeBSD to set via the ports package?

I'm not sure how to interpret truss, but given that CLD_DUMPED leads to the real error here, is there a possibility to follow child forks? https://vegdave.wordpress.com/2006/10/23/an-example-on-running-truss/ says so.

It may also help to attach gdb/lldb and follow the fork.
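To make the truss excerpts easier to interpret: the repeated `wait4(..., WNOHANG)` / `nanosleep(0.2)` pairs look like a parent process polling its worker, and `CLD_DUMPED` / `sig=SIGSEGV,cored` is what that parent sees once the worker has died from signal 11. The sketch below reproduces that pattern in Python. It is only an illustration of the polling and status decoding, not Icinga's actual code; `watch_child` and `CRASHING_CHILD` are hypothetical names, and the child simply sends itself SIGSEGV.

```python
import os
import signal
import sys
import time


def watch_child(argv, poll_interval=0.2):
    """Spawn argv and poll it without blocking until it exits or is killed."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    while True:
        # WNOHANG makes waitpid return (0, 0) while the child is still running.
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped == 0:
            time.sleep(poll_interval)
            continue
        if os.WIFSIGNALED(status):
            return {
                "pid": reaped,
                "signal": signal.Signals(os.WTERMSIG(status)).name,
                "core_dumped": os.WCOREDUMP(status),
            }
        return {"pid": reaped, "exit_code": os.WEXITSTATUS(status)}


# Hypothetical stand-in for the crashing worker: a child that sends itself SIGSEGV.
CRASHING_CHILD = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
]

if __name__ == "__main__":
    print(watch_child(CRASHING_CHILD))
```

Whether `core_dumped` comes back true depends on the system's core dump limits, which would also be one thing to check when no core file appears.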

You can trace child processes with "truss -f".

LibreSSL is usually a drop in replacement for OpenSSL. We can set a knob when building packages to use that instead of openssl; I can provide more gory details on request. Note that I do not use the normal ports methodology of make; make install as I build far too many packages for too many people. Instead I use poudriere. It is pretty much the same idea with respect to the knob mentioned above.

LibreSSL works 98% of the time; I've built 100s of packages with LibreSSL that work just fine with it including perl, php, nginx, and icinga2. Specific to icinga2, I have it running just fine with LibreSSL at two different sites for the past two years. That being said, there are a few edge case packages that do not build correctly with LibreSSL and these issues are (to my knowledge) handled by the ports system.

I don't think the issue is the LibreSSL api because other users are using the stock OpenSSL api and having the same crash.

So I just backed out to 2.10.5 because I needed it working. I can make some time to try 2.11 again if you are patient with me. :)

Yes please. I'm not able to fix it, @dnsmichi is ENOTIME and we should really try to find the cause.

Thanks in advance!

So, I updated a 11.2 / i386 box to 12.0, and icinga crashes in the same way :(

Is there a way to generate a core dump, or to see a full crash stack trace? The exception with send would indicate it happens between the communication of the main process & process spawn helper. Maybe the last process is gone for some reason.
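As a small illustration of the "Broken pipe" reasoning: error code 32 is EPIPE on FreeBSD and Linux, which is what `send` reports when the other end of a connected socket no longer exists. The sketch below assumes the two processes talk over a Unix socket pair, which is not confirmed anywhere in this thread; `send_to_vanished_peer` is a hypothetical name.

```python
import errno
import socket


def send_to_vanished_peer():
    """Send on a socket pair after the peer closed; return the errno raised."""
    ours, theirs = socket.socketpair()
    theirs.close()  # simulate the helper process going away
    try:
        ours.send(b"spawn request")
    except OSError as exc:
        return exc.errno
    finally:
        ours.close()
    return None


if __name__ == "__main__":
    code = send_to_vanished_peer()
    print(code, errno.errorcode.get(code))
```

If this reading is right, the `send` failure would be a symptom of the helper having already died rather than the root cause.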

Ok, so I've set up my live (but personal) monitoring system so I can switch back and forth between the crashing binary and the non-crashing one. I now have a core file, a binary file (I do not know if symbols are in it) and a truss -aefH trace, which on FreeBSD shows argument strings, shows environment strings (in hindsight, I should not have done this one lol), follows forked child processes, and includes the thread ID.

Here's the shared library rundown:

# ldd icinga2.bin 
icinga2.bin:
    libexecinfo.so.1 => /usr/local/lib/libexecinfo.so.1 (0x801256000)
    libboost_context.so.1.71.0 => /usr/local/lib/libboost_context.so.1.71.0 (0x801465000)
    libboost_coroutine.so.1.71.0 => /usr/local/lib/libboost_coroutine.so.1.71.0 (0x801667000)
    libboost_date_time.so.1.71.0 => /usr/local/lib/libboost_date_time.so.1.71.0 (0x80186e000)
    libboost_filesystem.so.1.71.0 => /usr/local/lib/libboost_filesystem.so.1.71.0 (0x801a78000)
    libboost_thread.so.1.71.0 => /usr/local/lib/libboost_thread.so.1.71.0 (0x801c91000)
    libboost_system.so.1.71.0 => /usr/local/lib/libboost_system.so.1.71.0 (0x801ea9000)
    libboost_program_options.so.1.71.0 => /usr/local/lib/libboost_program_options.so.1.71.0 (0x8020aa000)
    libboost_regex.so.1.71.0 => /usr/local/lib/libboost_regex.so.1.71.0 (0x802308000)
    libboost_chrono.so.1.71.0 => /usr/local/lib/libboost_chrono.so.1.71.0 (0x8025b9000)
    libboost_atomic.so.1.71.0 => /usr/local/lib/libboost_atomic.so.1.71.0 (0x8027c1000)
    libssl.so.47 => /usr/local/lib/libssl.so.47 (0x8029c3000)
    libcrypto.so.45 => /usr/local/lib/libcrypto.so.45 (0x802c1f000)
    libedit.so.0 => /usr/local/lib/libedit.so.0 (0x80300f000)
    libncurses.so.8 => /lib/libncurses.so.8 (0x803246000)
    libc++.so.1 => /usr/lib/libc++.so.1 (0x80349b000)
    libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x80376a000)
    libm.so.5 => /lib/libm.so.5 (0x803989000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x803bb9000)
    libthr.so.3 => /lib/libthr.so.3 (0x803dcc000)
    libc.so.7 => /lib/libc.so.7 (0x803ff4000)
    libicudata.so.64 => /usr/local/lib/libicudata.so.64 (0x8043af000)
    libicui18n.so.64 => /usr/local/lib/libicui18n.so.64 (0x804600000)
    libicuuc.so.64 => /usr/local/lib/libicuuc.so.64 (0x804b20000)
    librt.so.1 => /usr/lib/librt.so.1 (0x804f0f000)

Interestingly enough, THIS time I ran it, an error got produced from icinga2:

[2019-10-28 13:24:27 -0700] information/cli: Icinga application loader (version: r2.11.0-1)
[2019-10-28 13:24:27 -0700] information/cli: Closing console log.
critical/Application: Error: Function call 'send' failed with error code 32, 'Broken pipe'



Additional information is available in '/var/log/icinga2/crash/report.1572294280.564512'

That crash report appears to be produced by a Linux-oriented script. I've included it and the binary on my webserver. Truss output on request, since I forgot to sanitize it.

https://www.jetcafe.org/icinga2/icinga2.bin
https://www.jetcafe.org/icinga2/icinga2.crashreport

I was able to do an lldb, but I've no idea if this is correct usage. I'm going off of old gdb knowledge and google here:

# lldb icinga2.bin --core icinga2.core
(lldb) target create "icinga2.bin" --core "icinga2.core"
Core file '/tmp/icinga2.core' (x86_64) was loaded.
(lldb) thread backtrace all
* thread #1, name = 'icinga2', stop reason = signal SIGABRT
  * frame #0: 0x00000008040bb9ba libc.so.7`thr_kill + 10
    frame #1: 0x00000008040bb984 libc.so.7`__raise(s=6) at raise.c:52:10
    frame #2: 0x00000008040bb8f9 libc.so.7`abort at abort.c:65:8
    frame #3: 0x00000000004332f7 icinga2.bin`___lldb_unnamed_symbol444$$icinga2.bin + 1127
    frame #4: 0x000000080377e459 libcxxrt.so.1`report_failure(err=<unavailable>, thrown_exception=0x00000008054299c8) at exception.cc:719:5
    frame #5: 0x0000000000467132 icinga2.bin`__cxa_throw + 450
    frame #6: 0x0000000000512104 icinga2.bin`___lldb_unnamed_symbol3994$$icinga2.bin + 52
    frame #7: 0x00000000004cea53 icinga2.bin`___lldb_unnamed_symbol2486$$icinga2.bin + 115
    frame #8: 0x00000000004bc2ca icinga2.bin`___lldb_unnamed_symbol1896$$icinga2.bin + 5498
    frame #9: 0x0000000000482787 icinga2.bin`___lldb_unnamed_symbol1392$$icinga2.bin + 423
    frame #10: 0x000000000041ac1c icinga2.bin`___lldb_unnamed_symbol5$$icinga2.bin + 13436
    frame #11: 0x000000000041773a icinga2.bin`___lldb_unnamed_symbol4$$icinga2.bin + 202
    frame #12: 0x000000000041749d icinga2.bin`___lldb_unnamed_symbol1$$icinga2.bin + 141

I hope this has the information you seek. Feel free to request more detailed information and I will attempt to turn it around as quick as I can (which may be glacial). Thanks in advance for looking at this.

Is there any news yet? Recently my old install broke because boost-libs got updated and my old package wasn't compiled against it. It really seems that icinga2 breaks the moment a satellite tries to connect.

Both of our installations on Jessie broke with the same behaviour after upgrading to r2.11.2-1.
Absolutely no errors on both machines in the HA-Cluster.

One more datapoint here. I recently upgraded a satellite to 2.11.2_1 (from recent 2020Q1 quarterly). I had to restart the master node (which is at 2.10), but it works. I think this bug only happens on a master node.

I had the problem that the master node dies when a satellite tries to connect. When an agent is checked through a 2.11 satellite but the master is 2.10.5, the check results weren't handed over to the master node (I could see this only because the last check date didn't change... great when you notice after three days that your host hasn't been checked for several days). I also have one agent on 2.11 where it works, even though satellites and masters are only 2.10.5.

I do not have a master node running 32-bit FreeBSD, but it happens on all the satellites I have that are running on i386.
I will try 2.11.2_1.

Just to be clear, all my sites are 64 bit FreeBSD.

I see much the same on OpenBSD i386: a fairly current node runs icinga2 2.11.3, a somewhat older node runs 2.11.2.

I've an OpenBSD arm64 running 2.11.2, and a couple of amd64 running 2.11.3 and 2.11.2 without issues. My master runs 2.11.2 on amd64.

This should be fixed now when all nodes run 2.11.3. I just updated the FreeBSD port to 2.11.3, so please test it. :)

Thanks for the work @bsdlme 👍 Looking forward to the test feedback here.

I will probably do the update next week. The last time I tried it I spent two hours downgrading packages, thus I have to set a bit of time aside.

Thanks to boot environments I decided to do the test today. The problem still exists.

[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/raido.conf
[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/roehl.conf
[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/tis.conf.DISABLED
[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/unverricht.conf
[2020-04-30 10:15:50 +0200] notice/ApiListener: Updated meta data for cluster config sync. Checksum: '/var/lib/icinga2/api/zones/icinga2-global-templates/.checksums', timestamp: '/var/lib/icinga2/api/zones/icinga2-global-templates/.timestamp', auth: '/var/lib/icinga2/api/zones/icinga2-global-templates/.authoritative'.
[2020-04-30 10:15:50 +0200] information/ApiListener: Started new listener on '[::]:5665'                         
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Endpoint 'ic.ops.eusc.inter.net' because that's us.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'shaker.ops.eusc.inter.net' via host '195.201.164.235' and port '5665'
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'gin.inx.de' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'ic2-cloud-probe' via host 'ic2-cloud-probe' and port '5665'
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'saltmaster.interdotnet.de' via host 'saltmaster.interdotnet.de' and port '5665'
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'kitsune.psychedelicpirate.com' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'jake.psychedelicpirate.com' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'striper.psychedelicpirate.com' via host 'striper.psychedelicpirate.com' and port '5665'
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'kham.psychedelicpirate.com' via host 'kham.psychedelicpirate.com' and port '5665'
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'sally.psychedelicpirate.com' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'shopsatellite.ber.inx.de' via host 'shopsatellite.ber.inx.de' and port '5665'
[2020-04-30 10:15:50 +0200] notice/ApiListener: Current zone master: ic.ops.eusc.inter.net                       
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'n113h071.cloud.de.inter.net' via host '213.73.113.71' and port '5665'
[2020-04-30 10:15:50 +0200] notice/ApiListener: Connected endpoints:                                             
nanosleep({ 0.200000000 })                       ERR#4 'Interrupted system call'                                 
SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=7238 uid=183 status=11                                                   
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)         = 0 (0x0)                                                       
sigreturn(0x7fffffffca80)                        ERR#4 'Interrupted system call'                                 
wait4(7238,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 7238 (0x1c46)                                                 
[2020-04-30 10:15:50 +0200] notice/cli: Seemless worker (PID 7238) stopped, stopping as well                     
write(1,"[2020-04-30 10:15:50 +0200] \^[["...,102) = 102 (0x66)                                                  
unlink("/var/run/icinga2/icinga2.pid")           = 0 (0x0)                                                       
close(11)                                        = 0 (0x0)                                                       
_umtx_op(0x8010d0050,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d00c8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d0188,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d01a0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d01e8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)         
...                                   
icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.11.3-1)

Copyright (c) 2012-2020 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Unknown
  Platform version: Unknown
  Kernel: FreeBSD
  Kernel version: 11.3-RELEASE-p6
  Architecture: amd64

Build information:
  Compiler: Clang 8.0.0
  Build host: ic-11_3-RELEASE-HEAD-job-03

Application information:

General paths:
  Config directory: /usr/local/etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /var/run/icinga2

Old paths (deprecated):
  Installation root: /usr/local
  Sysconf directory: /usr/local/etc
  Run directory (base): /var/run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/local/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /var/run/icinga2/icinga2.pid

Do the endpoints run 2.11.3 as well?

No. They run Icinga 2.10.5.
According to the compatibility list it is master > satellite > client.
When I update the satellites to 2.11 while the master is at 2.10, check delivery stops working. When Icinga 2.11 was released and I updated the satellites, I suddenly got no check results anymore, so I had to downgrade my satellites again (interestingly, clients can be at 2.10).

master (2.11) >= satellite (2.10) >= agent (2.9)
https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#versions-and-upgrade
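The ordering rule quoted above can be written down as a quick sanity check before upgrading nodes. This is a minimal sketch under the assumption that the whole version, including the patch level, takes part in the comparison; `parse_version` and `hierarchy_ok` are hypothetical helpers, not part of Icinga.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from strings like 'r2.11.3-1' or '2.11.2_1'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def hierarchy_ok(master, satellite, agent):
    """True if master >= satellite >= agent, the order quoted from the docs."""
    return parse_version(master) >= parse_version(satellite) >= parse_version(agent)


if __name__ == "__main__":
    # Master upgraded first: allowed by the rule.
    print(hierarchy_ok("r2.11.3-1", "2.10.5", "2.9"))
    # Satellite newer than master: the combination that lost check results above.
    print(hierarchy_ok("2.10.5", "2.11.2_1", "2.10.5"))
```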

As far as I understand @bsdlme correctly, all nodes must run 2.11.3.

Where did he write this? What is the reasoning? I ask because I do not want to blindly run into this test because it means loads of more work if I update everywhere, the crash happens again and I have to downgrade everywhere again.

But we also had a follow-up bug with our big JSON-RPC issue which could have an influence here too. Is there any chance that you have nodes that initiate connections to other nodes, but those nodes don't have endpoint configuration for them?

Refs #7532

But we also had a follow-up bug with our big JSON-RPC issue which could have an influence here too. Is there any chance that you have nodes that initiate connections to other nodes, but those nodes don't have endpoint configuration for them?

I have one master, this master has configurations for its satellites, the satellites have configurations for the master. I have one pair of satellites that actually connect to endpoints with icinga2 instead of nrpe but there all endpoints are configured for the satellites as well.

As far as I understand @bsdlme correctly, all nodes must run 2.11.3.

Where did he write this? What is the reasoning? I ask because I do not want to blindly run into this test because it means loads of more work if I update everywhere, the crash happens again and I have to downgrade everywhere again.

I think that this is strongly related to the JSON-RPC or follow-up bug. So upgrading just your nodes that crash, should be sufficient.

@bsdlme Is there any chance to build some sort of snapshot packages? We could prepare a branch with the 2.11.3 as base and some patches on top.

But the nodes do not crash. The master, which I just upgraded to 2.11.3, crashes on start-up (btw, the satellites are running Linux, the endpoints Linux or Windows).

The master runs FreeBSD. I upgrade icinga from 2.10.5 to 2.11.3 (or before). I restart icinga2 and it crashes on start, apparently when the satellites try to reconnect.

I just read @bsdlme's comment and the 'when all nodes run 2.11.3' part.
I have to think about what to do about that...

I don't think that this is necessary. There may still be a problem with the JSON-RPC.

Please could you test this one?

https://github.com/Icinga/icinga2/tree/bugfix/freebsd-7539

How would I do that? Build it from source and install it over the installed package (which is built on my poudriere)?

Btw. I upgraded now all my endpoints to 2.11.3 and the master still crashes after I updated it to 2.11.3.

Please could you test this one?
https://github.com/Icinga/icinga2/tree/bugfix/freebsd-7539

How would I do that? Build it from source and install it over the installed package (which is built on my poudriere)?

Exactly.

TBH I relied on what was said in the bug report at https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=245985
But I'm glad that we're moving forward now. @nielsk Do you need any help building the patched version?

yes. I am reading the documentation but I have no clue what to do to build it on FreeBSD @bsdlme

Okay, give me a minute...

Just gunzip and put the attached patch from @Al2Klimov's last commit into net-mgmt/icinga2/files/
Then build Icinga as usual.

patch-lib_remote_jsonrpcconnection-heartbeat.cpp.gz

Thanks @bsdlme
I built it successfully, but the crash still persists.

@bsdlme Are you sure that the resulting extension (".cpp") gets picked up by the build... patch... thing?

@Al2Klimov Yes, it got picked up. From the build-log:

=======================<phase: patch-depends  >============================
===========================================================================
=======================<phase: patch          >============================
===>  Patching for icinga2-2.11.3_1
===>  Applying extra patch /distfiles/local-patches/icinga2/patch-lib_remote_jsonrpcconnection-heartbeat.cpp
===>  Applying FreeBSD patches for icinga2-2.11.3_1

Suppose I've got a fresh FreeBSD 11.3 VM. Could you reproduce this and provide step-by-step instructions for reproducing it from scratch?

I don't know to be honest since I don't know where it breaks. It seems that it breaks when it tries to connect to the satellites.
This is a configuration with multiple zones, custom checks, satellites etc. It is not a case of installing it, adding a satellite and seeing it crash; it is an installation that has been running for years and has been upgraded over time.

I gave the information I have. If someone can point me to what I can further do to provide more information I'd be happy to help.

Please could you generate a core dump of the crash, gzip it and the exact packages you have installed and drop it here?

https://nextcloud.icinga.com/index.php/s/XRFnAsFZDGeEc9H

Whatever I try there is no core dump generated. I can only offer the output of truss -f unfortunately.

Did you try attaching to the top three Icinga processes with a debugger, waiting until it crashes and letting the debugger generate the core dump?

It happens so fast that I cannot attach a debugger. I'll try something else now.
A complete truss-output can be downloaded here: https://nextcloud.kobschaetzki.net/index.php/s/CnQLtqo9HX7j4QE

Hm.. does Icinga crash if you lock out the port 5665 via firewall?

And I cannot start it apparently with gdb or lldb because they do not recognize the executable.

Hm.. does Icinga crash if you lock out the port 5665 via firewall?

yes
It only works iirc if I disable the API

Yes, disabling the API and it works. Re-enabling the API and it crashes

I will call it now a day, revert to the working boot environment and go into the weekend (yeah, May 1st)

Please try:

  1. Firewall the traffic
  2. Start Icinga 2
  3. Attach
  4. Unfirewall the traffic
  5. Generate the core dump

So, I rebuilt my packages with the new 2.11.3.

I updated my icinga2 master, and it still works just fine as it did before.

Then I upgraded a satellite where it was broken, still broken.

@bsdlme Are you sure that the resulting extension (".cpp") gets picked up by the build... patch... thing?

The framework picks up patch-* files. I know, I wrote that bit. (and rewrote it this afternoon)

I'm about to set up 2 Jails for you to debug, @Al2Klimov
@mat813 @nielsk, anything I need to do to make it crash?

Please don't forget to include the heartbeat/m_Endpoint patch.

Building boost with debug symbols takes some time...

In my case it is a master and there are satellites. API needs to be enabled. The moment the master connects it crashes in my case.

Kind regards

Niels

I now have two jails. One Icinga Master jail running 2.11.3 with the patch and one satellite with 2.11.2.
Unfortunately I could not make the master crash, yet.

the API is configured and activated?

Yes, but maybe in a wrong configuration. I can share it later.

Note that my problem is not master crashing, it is satellites crashing. They are all running on i386.

That's odd because I crash the master with or without satellites.

Briefly, I was forced to upgrade the master icinga2 machine at my site for various reasons and pitchforks. The timeline looks like this:

  1. Upgrade a satellite to 2.11.2 (not .3). Nothing crashes. Master is running 2.10.5.
  2. Upgrade master to 2.11.2. Master crashes, satellites do not.
  3. Experience frustration, realize it's only an emotion, come back to this thread, find the bugzilla entry at FreeBSD, upgrade port to 2.11.3 with that patch ( for reference https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=245985 )
  4. Build 2.11.3, install on master only. Crash.
  5. Stop icinga2 on all satellites. Try to start master. Crashes with signal 11, no entry in the log file, but a new message:
[2020-05-20 19:12:08 -0700] critical/cli: The daemon could not be started. See log output for details.

So with no satellites running, it crashes out of the box.

This output might be useful to someone:

# /usr/local/sbin/icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.11.3-1)

Copyright (c) 2012-2020 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Unknown
  Platform version: Unknown
  Kernel: FreeBSD
  Kernel version: 11.3-PRERELEASE
  Architecture: amd64

Build information:
  Compiler: Clang 8.0.0
  Build host: pkg.dream-tech.com
...
# ldd /usr/local/lib/icinga2/sbin/icinga2
/usr/local/lib/icinga2/sbin/icinga2:
    libexecinfo.so.1 => /usr/local/lib/libexecinfo.so.1 (0x801250000)
    libboost_context.so.1.72.0 => /usr/local/lib/libboost_context.so.1.72.0 (0x80145f000)
    libboost_coroutine.so.1.72.0 => /usr/local/lib/libboost_coroutine.so.1.72.0 (0x801661000)
    libboost_date_time.so.1.72.0 => /usr/local/lib/libboost_date_time.so.1.72.0 (0x801868000)
    libboost_filesystem.so.1.72.0 => /usr/local/lib/libboost_filesystem.so.1.72.0 (0x801a72000)
    libboost_thread.so.1.72.0 => /usr/local/lib/libboost_thread.so.1.72.0 (0x801c8d000)
    libboost_system.so.1.72.0 => /usr/local/lib/libboost_system.so.1.72.0 (0x801ea5000)
    libboost_program_options.so.1.72.0 => /usr/local/lib/libboost_program_options.so.1.72.0 (0x8020a6000)
    libboost_regex.so.1.72.0 => /usr/local/lib/libboost_regex.so.1.72.0 (0x802304000)
    libboost_chrono.so.1.72.0 => /usr/local/lib/libboost_chrono.so.1.72.0 (0x8025b5000)
    libboost_atomic.so.1.72.0 => /usr/local/lib/libboost_atomic.so.1.72.0 (0x8027bd000)
    libssl.so.47 => /usr/local/lib/libssl.so.47 (0x8029bf000)
    libcrypto.so.45 => /usr/local/lib/libcrypto.so.45 (0x802c1b000)
    libedit.so.0 => /usr/local/lib/libedit.so.0 (0x80300d000)
    libncurses.so.8 => /lib/libncurses.so.8 (0x803244000)
    libc++.so.1 => /usr/lib/libc++.so.1 (0x803499000)
    libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x803768000)
    libm.so.5 => /lib/libm.so.5 (0x803987000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x803bb7000)
    libthr.so.3 => /lib/libthr.so.3 (0x803dca000)
    libc.so.7 => /lib/libc.so.7 (0x803ff2000)
    libicudata.so.66 => /usr/local/lib/libicudata.so.66 (0x8043ad000)
    libicui18n.so.66 => /usr/local/lib/libicui18n.so.66 (0x804600000)
    libicuuc.so.66 => /usr/local/lib/libicuuc.so.66 (0x804b21000)
    librt.so.1 => /usr/lib/librt.so.1 (0x804f19000)
# /usr/local/bin/openssl version
LibreSSL 3.0.2

I don't think LibreSSL is the factor here just by observation, but I could be wrong.

While this particular installation of icinga2 is important to me, it's not really production. If you want to throw patches at me, please do; I am willing to do almost whatever it takes to get this running again. Thanks in advance.

So, updated to 2.12.0 and it still crashes on startup.

I'm running into an issue under OpenBSD 6.7-stable, which to me looks really similar to what is being described above. Setup is similar:
icinga2-2.11.5v0 (from packages) with the API feature enabled and set up as a satellite.

When sending updates from the master it crashes. In my particular case directly after /var/lib/icinga2/api/zones-stage/pub//_etc/generated_dbconfig_hosts.conf is copied into /var/lib/icinga2/api/zones/pub//_etc/generated_dbconfig_hosts.conf.

After creating a build with debug symbols I managed to get the following backtrace:

#0  thrkill () at -:3
#1  0x00000a94dd2ce2ae in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:61
#2  0x00000a94dd231e9c in _libc_pthread_mutex_unlock (mutexp=<optimized out>) at /usr/src/lib/libc/thread/rthread_mutex.c:265
#3  0x00000a9272acc510 in boost::posix::pthread_mutex_unlock (m=0xa927306f488 <icinga::ApiListener::m_ConfigSyncStageLock>) at /usr/local/include/boost/thread/pthread/mutex.hpp:71
#4  boost::mutex::unlock (this=0xa927306f488 <icinga::ApiListener::m_ConfigSyncStageLock>) at /usr/local/include/boost/thread/pthread/mutex.hpp:125
#5  boost::unique_lock<boost::mutex>::~unique_lock (this=<optimized out>) at /usr/local/include/boost/thread/lock_types.hpp:331
#6  icinga::intrusive_ptr_release<boost::unique_lock<boost::mutex> > (object=0xa952e854d60) at /usr/ports/pobj/icinga2-2.11.5/icinga2-2.11.5/lib/base/shared.hpp:27
#7  boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > >::~intrusive_ptr (this=<optimized out>) at /usr/local/include/boost/smart_ptr/intrusive_ptr.hpp:98
#8  icinga::ApiListener::AsyncTryActivateZonesStage(std::__1::vector<icinga::String, std::__1::allocator<icinga::String> > const&, boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > > const&)::$_34::~$_34() (this=0xa949f7244c8)
    at /usr/ports/pobj/icinga2-2.11.5/icinga2-2.11.5/lib/remote/apilistener-filesync.cpp:648
#9  std::__1::__compressed_pair_elem<icinga::ApiListener::AsyncTryActivateZonesStage(std::__1::vector<icinga::String, std::__1::allocator<icinga::String> > const&, boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > > const&)::$_34, 0, false>::~__compressed_pair_elem() (this=0xa949f7244c8)
    at /usr/include/c++/v1/memory:2134
#10 0x00000a9272acc423 in std::__1::__function::__alloc_func<icinga::ApiListener::AsyncTryActivateZonesStage(std::__1::vector<icinga::String, std::__1::allocator<icinga::String> > const&, boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > > const&)::$_34, std::__1::allocator<icinga::ApiListener::AsyncTryActivateZonesStage(std::__1::vector<icinga::String, std::__1::allocator<icinga::String> > const&, boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > > const&)::$_34>, void (icinga::ProcessResult const&)>::destroy() (this=<optimized out>) at /usr/include/c++/v1/functional:1546
#11 std::__1::__function::__func<icinga::ApiListener::AsyncTryActivateZonesStage(std::__1::vector<icinga::String, std::__1::allocator<icinga::String> > const&, boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > > const&)::$_34, std::__1::allocator<icinga::ApiListener::AsyncTryActivateZonesStage(std::__1::vector<icinga::String, std::__1::allocator<icinga::String> > const&, boost::intrusive_ptr<icinga::Shared<boost::unique_lock<boost::mutex> > > const&)::$_34>, void (icinga::ProcessResult const&)>::destroy_deallocate() (this=0xa949f7244c0) at /usr/include/c++/v1/functional:1643
#12 0x00000a92729bad13 in std::__1::__function::__value_func<void (icinga::ProcessResult const&)>::~__value_func() (this=<optimized out>) at /usr/include/c++/v1/functional:1758
#13 std::__1::function<void (icinga::ProcessResult const&)>::~function() (this=<optimized out>) at /usr/include/c++/v1/functional:2334
#14 std::__1::__bind<std::__1::function<void (icinga::ProcessResult const&)>&, icinga::ProcessResult&>::~__bind() (this=<optimized out>) at /usr/include/c++/v1/functional:2648
#15 std::__1::__compressed_pair_elem<std::__1::__bind<std::__1::function<void (icinga::ProcessResult const&)>&, icinga::ProcessResult&>, 0, false>::~__compressed_pair_elem() (this=<optimized out>) at /usr/include/c++/v1/memory:2134
#16 std::__1::__function::__alloc_func<std::__1::__bind<std::__1::function<void (icinga::ProcessResult const&)>&, icinga::ProcessResult&>, std::__1::allocator<std::__1::__bind<std::__1::function<void (icinga::ProcessResult const&)>&, icinga::ProcessResult&> >, void ()>::destroy() (this=<optimized out>)
    at /usr/include/c++/v1/functional:1546
#17 std::__1::__function::__func<std::__1::__bind<std::__1::function<void (icinga::ProcessResult const&)>&, icinga::ProcessResult&>, std::__1::allocator<std::__1::__bind<std::__1::function<void (icinga::ProcessResult const&)>&, icinga::ProcessResult&> >, void ()>::destroy_deallocate() (this=0xa94b97a1000)
    at /usr/include/c++/v1/functional:1643
#18 0x00000a92729d438d in std::__1::__function::__value_func<void ()>::~__value_func() (this=<optimized out>) at /usr/include/c++/v1/functional:1758
#19 std::__1::function<void ()>::~function() (this=<optimized out>) at /usr/include/c++/v1/functional:2334
#20 bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}::~SchedulerPolicy() (this=<optimized out>) at /usr/ports/pobj/icinga2-2.11.5/icinga2-2.11.5/lib/base/threadpool.hpp:59
#21 boost::asio::system_executor::dispatch<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}, std::__1::allocator<void> >(bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}&&, std::__1::allocator<void> const&) const (this=<optimized out>, f=<optimized out>) at /usr/local/include/boost/asio/impl/system_executor.hpp:40
#22 0x00000a92729d420c in boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>::operator()() (this=<optimized out>) at /usr/local/include/boost/asio/detail/work_dispatcher.hpp:58
#23 boost::asio::asio_handler_invoke<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}> >(boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>&, ...) (function=...) at /usr/local/include/boost/asio/handler_invoke_hook.hpp:69
#24 boost_asio_handler_invoke_helpers::invoke<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>, bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>(boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>&, bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}&) (function=..., context=...) at /usr/local/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#25 boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::__1::function<void ()> >(std::__1::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>, std::__1::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, std::__1::allocator<void>*, boost::system::error_code const&, unsigned long) (owner=0xa94be39e300, base=0xa94b97a1e00) at /usr/local/include/boost/asio/detail/executor_op.hpp:70
#26 0x00000a927292c9c8 in boost::asio::detail::scheduler_operation::complete (this=<optimized out>, owner=0xa94be39e300, ec=..., bytes_transferred=<optimized out>) at /usr/local/include/boost/asio/detail/scheduler_operation.hpp:40
#27 boost::asio::detail::scheduler::do_run_one (this=0xa94be39e300, lock=..., this_thread=..., ec=...) at /usr/local/include/boost/asio/detail/impl/scheduler.ipp:401
#28 0x00000a927292c492 in boost::asio::detail::scheduler::run (this=0xa94be39e300, ec=...) at /usr/local/include/boost/asio/detail/impl/scheduler.ipp:154
#29 0x00000a927293d867 in boost::asio::thread_pool::thread_function::operator() (this=<optimized out>) at /usr/local/include/boost/asio/impl/thread_pool.ipp:33
#30 boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run (this=0xa9505e37ac0) at /usr/local/include/boost/asio/detail/posix_thread.hpp:86
#31 0x00000a927293d7a5 in boost::asio::detail::boost_asio_detail_posix_thread_function (arg=0xa9505e37ac0) at /usr/local/include/boost/asio/detail/impl/posix_thread.ipp:74
#32 0x00000a95076c10d1 in _rthread_start (v=<optimized out>) at /usr/src/lib/librthread/rthread.c:96
#33 0x00000a94dd2c6c58 in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:77
#34 0x0000000000000000 in ?? ()

Where frame 2 points to the following snippet of code:

int
pthread_mutex_unlock(pthread_mutex_t *mutexp)
{
        pthread_t self = pthread_self();
        pthread_mutex_t mutex;

        if (mutexp == NULL)
                return (EINVAL);

        if (*mutexp == NULL)
#if PTHREAD_MUTEX_DEFAULT == PTHREAD_MUTEX_ERRORCHECK
                return (EPERM);
#elif PTHREAD_MUTEX_DEFAULT == PTHREAD_MUTEX_NORMAL
                return(0);
#else
                abort();
#endif

        mutex = *mutexp;
        _rthread_debug(5, "%p: mutex_unlock %p (%p)\n", self, (void *)mutex,
            (void *)mutex->owner);


        if (mutex->owner != self) {
        _rthread_debug(5, "%p: different owner %p (%p)\n", self, (void *)mutex,
            (void *)mutex->owner);
                if (mutex->type == PTHREAD_MUTEX_ERRORCHECK ||
                    mutex->type == PTHREAD_MUTEX_RECURSIVE) {
                        return (EPERM);
                } else {
                        /*
                         * For mutex type NORMAL our undefined behavior for
                         * unlocking an unlocked mutex is to succeed without
                         * error.  All other undefined behaviors are to
                         * abort() immediately.
                         */
                        if (mutex->owner == NULL &&
                            mutex->type == PTHREAD_MUTEX_NORMAL)
                                return (0);
                        else
                                abort(); /* line causing the crash */

                }
        }

Just for shits and giggles I enabled the threading debugging output and when filtering out the specific mutex I find the following:

# grep -F 0xe0c6dd24f00 /tmp/mutex_debug
0xe0ca5effa40: mutex_lock 0xe0c6dd24f00 (0x0)
0xe0ca5effa40: mutex_unlock 0xe0c6dd24f00 (0xe0ca5effa40)
0xe0be5accc40: mutex_lock 0xe0c6dd24f00 (0x0)
0xe0ca5eff640: mutex_unlock 0xe0c6dd24f00 (0xe0be5accc40)
0xe0ca5eff640: different owner 0xe0c6dd24f00 (0xe0be5accc40)

Which doesn't give a lot of extra information, but it confirms that the mutex is being unlocked by a thread other than the one that acquired the lock. With the owner check in libc removed for testing purposes (which I don't recommend anyone do), icinga keeps on running, which confirms that this issue is restricted to the wrong thread releasing the mutex.

Since I'm not a C++-programmer, let alone familiar with the boost and icinga paradigms, this is basically where I got stuck, but hopefully this helps someone with more in depth knowledge solve this issue.

confirms that this issue is restricted to the wrong thread releasing the mutex

Of course! We lock the mutex in one thread and hand it over to another one. And libc doesn't like it? Damn...
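For anyone who wants to see the ownership rule in isolation, here is a minimal sketch. It is not Icinga code: the function name `UnlockFromOtherThread` is made up for illustration, and it deliberately uses a `PTHREAD_MUTEX_ERRORCHECK` mutex, for which POSIX specifies that unlocking from a non-owning thread returns an error, so the violation shows up as a return code instead of the abort() seen on OpenBSD with the default mutex type.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <thread>

// Hypothetical helper: locks an error-checking mutex on the calling thread,
// then tries to unlock it from a second thread. Returns the result of that
// foreign unlock attempt.
int UnlockFromOtherThread()
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);

    // ERRORCHECK turns the ownership violation into an error code
    // rather than undefined behaviour.
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);

    pthread_mutex_t mutex;
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&mutex);   // owner: the calling thread
    assert(rc == 0);
    (void)rc;                          // silence unused warning under NDEBUG

    int foreignResult = 0;
    std::thread other([&mutex, &foreignResult] {
        // This thread never locked the mutex, so it is not the owner.
        foreignResult = pthread_mutex_unlock(&mutex);
    });
    other.join();

    pthread_mutex_unlock(&mutex);      // the real owner releases it
    pthread_mutex_destroy(&mutex);
    return foreignResult;
}
```

Calling `UnlockFromOtherThread()` should yield `EPERM`. With a default-type mutex the same sequence is undefined behaviour, which is why some platforms tolerate it and others abort.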

Please could you test #8308?

A quick test suggests that this fixes my issue. I'm going to leave it running over the weekend and report back sometime next week.

One minor sidenote, which probably doesn't apply to your implementation (I don't know what std::atomic_flag uses under the hood, but probably not the pthread_spin_* family): POSIX states that a pthread_spin_unlock called by a thread not owning the lock results in undefined behaviour[0] and could just as easily cause an abort, similar to what pthread_mutex_unlock does on OpenBSD.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_spin_unlock.html
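To illustrate why a `std::atomic_flag` is not subject to the same ownership rule, here is a rough sketch of a lock that can be taken on one thread and released on another. This is only an assumption about the general technique, not the code from #8308; the names `HandoverLock` and `LockHereUnlockElsewhere` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical ownerless lock, for illustration only.
class HandoverLock
{
public:
    // Returns true if the lock was free and is now held by the caller.
    bool TryLock()
    {
        return !m_Held.test_and_set(std::memory_order_acquire);
    }

    // Spins (yielding the CPU) until the lock becomes available.
    void Lock()
    {
        while (!TryLock())
            std::this_thread::yield();
    }

    // May be called from any thread: no owner is recorded, so there is
    // nothing that could reject the release.
    void Unlock()
    {
        m_Held.clear(std::memory_order_release);
    }

private:
    std::atomic_flag m_Held = ATOMIC_FLAG_INIT;
};

// Acquires on the calling thread, releases on a worker thread, and reports
// whether the lock can be taken again afterwards.
bool LockHereUnlockElsewhere()
{
    HandoverLock lock;
    lock.Lock();
    assert(!lock.TryLock());   // still held, so a second attempt fails

    std::thread worker([&lock] { lock.Unlock(); });
    worker.join();

    bool reacquired = lock.TryLock();
    if (reacquired)
        lock.Unlock();
    return reacquired;
}
```

The trade-off is that nothing stops an unrelated thread from calling `Unlock()` by mistake, so the hand-over discipline has to be enforced by the surrounding code.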

After a couple of days it still seems to run as expected.

There's one minor issue where after some time icinga fails to exit when sending it a SIGTERM via:

pkill -T "0" -xf "/usr/local/lib/icinga2/sbin/icinga2 daemon.*"

as per OpenBSD's rc-framework. This however seems not directly related to this diff, since I can restart icinga just after a config-update has been pushed. I'll investigate further and if I find something useful I'll place it on an appropriate ticket.

Is the patch included in the latest 2.12.1 release?

8308? Yes.

Great!
I just updated the FreeBSD port. @nielsk and @mat813, can you please confirm that your setup works now?

I will try to find time next week. If it doesn't work, the rollback can be quite cumbersome.
Thanks a lot already to all who worked on this.

@bsdlme With icinga2-2.12.1, it absolutely still crashes on startup on i386 boxes.

Did you try i386 or x64? I just want to be sure before I do my test.

Well, the answer is in the comment you are responding to, i386. I never had any problems on amd64.

Thanks. I just wanted to be sure because I have seen people using i386 and x64 interchangeably.

@bsdlme I tried to build it today on my poudriere (with a FreeBSD 11.4-jail) and it fails. I created a bug in the FreeBSD-bugzilla.

@nielsk Yes, but you seem to have a local patch that can't be applied correctly.

I could now update -- I had to upgrade to 11.4 because 11.3 is not supported anymore.
@bsdlme icinga2 r2.12.1-1 still crashes (x64)

icinga2 - The Icinga 2 network monitoring daemon (version: r2.12.1-1)

Copyright (c) 2012-2020 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Unknown
  Platform version: Unknown
  Kernel: FreeBSD
  Kernel version: 11.4-RELEASE-p3
  Architecture: amd64

Build information:
  Compiler: Clang 10.0.0
  Build host: ic-11_4-RELEASE-HEAD-job-01
  OpenSSL version: OpenSSL 1.1.1h  22 Sep 2020

Application information:

General paths:
  Config directory: /usr/local/etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /var/run/icinga2

Old paths (deprecated):
  Installation root: /usr/local
  Sysconf directory: /usr/local/etc
  Run directory (base): /var/run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/local/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /var/run/icinga2/icinga2.pid

Too bad. So we're back at the beginning.

Random thought I just had about what might be causing this (I did not investigate this further, just writing it down so I don't forget): Icinga 2.11 changed the network stack to use Boost.Asio and executes coroutines on multiple worker threads. AFAIK Boost.Asio may schedule these coroutines on arbitrary worker threads, so if a coroutine holds a mutex while it performs a yield operation, the mutex might be unlocked on a different thread.
