This issue has been migrated from Redmine: https://dev.icinga.com/issues/13907
Created by Nagodar on 2017-01-09 09:36:48 +00:00
Assignee: _(none)_
Status: _New_
Target Version: _2.6.2_
Last Update: _2017-01-16 15:01:46 +00:00 (in Redmine)_
Icinga Version: 2.6.0
Backport?: Not yet backported
Include in Changelog: 1
icinga2 crash report:
Application information:
Application version: r2.6.0-1
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
System information:
Platform: Ubuntu
Platform version: 14.04.5 LTS, Trusty Tahr
Kernel: Linux
Kernel version: 3.13.0-98-generic
Architecture: x86_64
Build information:
Compiler: GNU 4.8.4
Build host: lgw01-29
Stacktrace:
(0) libpthread.so.0:
(1) libc.so.6: gsignal (+0x37) [0x2b0b5bcffc37]
(2) libc.so.6: abort (+0x148) [0x2b0b5bd03028]
(3) libc.so.6:
(4) libc.so.6:
(5) libremote.so.2.6.0:
(6) libremote.so.2.6.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0xa05) [0x2b0b5d95f235]
(7) libbase.so.2.6.0: icinga::WorkQueue::WorkerThreadProc() (+0x483) [0x2b0b5af29653]
(8) libboost_thread.so.1.54.0:
(9) libpthread.so.0:
(10) libc.so.6: clone (+0x6d) [0x2b0b5bdc337d]
Updated by jflach on 2017-01-09 16:03:51 +00:00
That's not much to work with, can you provide a log from the time of the crash? Also please describe what you were trying to do, like deploying new config, restarting icinga 2, just letting it run, running api requests etc.
Updated by Nagodar on 2017-01-09 17:01:05 +00:00
Hello,
icinga2 crashes (more or less) once a day (we have multiple autoscaling groups that register/unregister servers constantly). The last log lines for different crashes:
[2017-01-07 22:05:53 +0100] information/ApiListener: Sending replay log for endpoint 'xxxxxx'.
[2017-01-07 22:05:54 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-07 22:05:54 +0100] information/ApiListener: New client connection from [127.0.0.1]:51673 (no client certificate)
[2017-01-07 22:05:54 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:51673, user: xxxx)
[2017-01-07 22:05:54 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-07 22:05:54 +0100] information/ApiListener: New client connection from [127.0.0.1]:51674 (no client certificate)
[2017-01-07 22:05:54 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:51674, user: xxxx)
[2017-01-07 22:05:54 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-07 22:05:54 +0100] information/ApiListener: New client connection from [127.0.0.1]:51675 (no client certificate)
[2017-01-07 22:05:54 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:51675, user: xxxx)
[2017-01-07 22:05:54 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-07 22:05:54 +0100] information/ApiListener: New client connection from [127.0.0.1]:51676 (no client certificate)
[2017-01-07 22:05:54 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:51676, user: xxxx)
[2017-01-02 19:02:29 +0100] information/ApiListener: Sending replay log for endpoint 'xxxxxxx'.
[2017-01-02 19:02:29 +0100] information/ApiListener: New client connection from [127.0.0.1]:49479 (no client certificate)
[2017-01-02 19:02:29 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:49479, user: xxxx)
[2017-01-02 19:02:30 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-02 19:02:30 +0100] information/ApiListener: New client connection from [127.0.0.1]:49480 (no client certificate)
[2017-01-02 19:02:30 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:49480, user: xxxx)
[2017-01-02 19:02:30 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-02 19:02:30 +0100] information/ApiListener: New client connection from [127.0.0.1]:49481 (no client certificate)
[2017-01-02 19:02:30 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:49481, user: xxxx)
[2017-01-02 19:02:30 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-02 19:02:30 +0100] information/ApiListener: New client connection from [127.0.0.1]:49482 (no client certificate)
[2017-01-02 19:02:30 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:49482, user: xxxx)
[2017-01-02 19:27:02 +0100] information/ApiListener: Sending replay log for endpoint 'xxxxx'.
[2017-01-02 19:27:02 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-02 19:27:02 +0100] information/ApiListener: New client connection from [127.0.0.1]:33654 (no client certificate)
[2017-01-02 19:27:02 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:33654, user: xxxx)
[2017-01-02 19:27:02 +0100] warning/TlsStream: TLS stream was disconnected.
[2017-01-02 19:27:02 +0100] information/ApiListener: New client connection from [127.0.0.1]:33655 (no client certificate)
[2017-01-02 19:27:02 +0100] information/HttpServerConnection: Request: GET /v1/config/packages (from [127.0.0.1]:33655, user: xxxx)
regards
Updated by denny on 2017-01-10 12:50:29 +00:00
hi,
had the same problem:
# cat report.1483880012.058164
Application information:
Application version: r2.6.0-1
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
System information:
Platform: Debian GNU/Linux
Platform version: 8 (jessie)
Kernel: Linux
Kernel version: 3.16.0-4-amd64
Architecture: x86_64
Build information:
Compiler: GNU 4.9.2
Build host: smithers
Stacktrace:
(0) libpthread.so.0: (+0xf8d0) [0x7fb3966dc8d0]
(1) libc.so.6: gsignal (+0x37) [0x7fb393bea067]
(2) libc.so.6: abort (+0x148) [0x7fb393beb448]
(3) libc.so.6: (+0x2e266) [0x7fb393be3266]
(4) libc.so.6: (+0x2e312) [0x7fb393be3312]
(5) libremote.so.2.6.0: (+0x42553) [0x7fb395a6f553]
(6) libremote.so.2.6.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0xbe5) [0x7fb395aa02c5]
(7) libbase.so.2.6.0: icinga::WorkQueue::WorkerThreadProc() (+0x4eb) [0x7fb3963a279b]
(8) libboost_thread.so.1.55.0: (+0xdaea) [0x7fb397087aea]
(9) libpthread.so.0: (+0x80a4) [0x7fb3966d50a4]
(10) libc.so.6: clone (+0x6d) [0x7fb393c9d62d]
In icinga2.err
icinga2: /usr/include/boost/smart_ptr/intrusive_ptr.hpp:162: T* boost::intrusive_ptr::operator->() const [with T = icinga::Socket]: Assertion `px != 0' failed.
Caught SIGABRT.
Current time: 2017-01-08 13:53:32 +0100
Updated by Nagodar on 2017-01-10 15:43:06 +00:00
My err log file:
icinga2: /usr/include/boost/smart_ptr/intrusive_ptr.hpp:162: T* boost::intrusive_ptr
Caught SIGABRT.
Current time: 2017-01-02 19:02:30 +0100
icinga2: /usr/include/boost/smart_ptr/intrusive_ptr.hpp:162: T* boost::intrusive_ptr
Caught SIGABRT.
Current time: 2017-01-02 19:27:02 +0100
icinga2: /usr/include/boost/smart_ptr/intrusive_ptr.hpp:162: T* boost::intrusive_ptr
Caught SIGABRT.
Current time: 2017-01-07 22:05:54 +0100
Updated by jflach on 2017-01-11 08:49:25 +00:00
So it crashes randomly during API connections... This will be hard to reproduce as I have not had that issue.
Updated by mfriedrich on 2017-01-11 13:19:12 +00:00
Updated by mfriedrich on 2017-01-11 13:19:39 +00:00
Updated by mfriedrich on 2017-01-11 13:29:29 +00:00
Updated by Nagodar on 2017-01-16 15:01:46 +00:00
Can I attach any other worth information?
We'll deal with it right after releasing v2.6.1 which addresses some major issues introduced with 2.6.0.
hi,
maybe it addresses the same bug, but adding it anyway:
My icinga2 crashed again: 2.6.0
[2017-01-19 16:03:02 +0100] warning/Process: Killing process group 24042 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-usage' '--name' '3' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.4') after timeout of 60 seconds
[2017-01-19 16:03:02 +0100] warning/Process: Killing process group 24035 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-discards' '--name' '3' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.4') after timeout of 60 seconds
[2017-01-19 16:03:06 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
[2017-01-19 16:03:06 +0100] information/HttpServerConnection: Request: GET /v1/status/IcingaApplication (from [130.180.66.250]:35834, user: dashing)
[2017-01-19 16:03:06 +0100] warning/Process: PID 24043 was terminated by signal 9 (Killed)
[2017-01-19 16:03:06 +0100] warning/Process: Killing process group 24053 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-errors' '--name' '2' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.4') after timeout of 60 seconds
[2017-01-19 16:03:06 +0100] information/HttpServerConnection: Request: GET /v1/status/IcingaApplication (from [130.180.66.250]:35836, user: dashing)
[2017-01-19 16:03:07 +0100] warning/Process: PID 24053 was terminated by signal 9 (Killed)
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35866 (no client certificate)
[2017-01-19 16:03:07 +0100] information/IdoMysqlConnection: Query queue items: 5, query rate: 4/s (240/min 1693/5min 5669/15min);
[2017-01-19 16:03:07 +0100] information/HttpServerConnection: Request: GET /v1/status/CIB (from [130.180.66.250]:35866, user: dashing)
[2017-01-19 16:03:07 +0100] warning/ApiListener: Removing API client for endpoint 'henrietta.4lin.net'. 0 API clients left.
[2017-01-19 16:03:07 +0100] information/JsonRpcConnection: No messages for identity 'jhell.org' have been received in the last 60 seconds.
[2017-01-19 16:03:07 +0100] warning/JsonRpcConnection: API client disconnected for identity 'jhell.org'
[2017-01-19 16:03:07 +0100] warning/JsonRpcConnection: API client disconnected for identity 'jhell.org'
[2017-01-19 16:03:07 +0100] warning/ApiListener: Removing API client for endpoint 'jhell.org'. 0 API clients left.
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35870 (no client certificate)
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35888 (no client certificate)
[2017-01-19 16:03:07 +0100] information/HttpServerConnection: Request: GET /v1/status/CIB (from [130.180.66.250]:35870, user: dashing)
[2017-01-19 16:03:07 +0100] information/HttpServerConnection: Request: GET /v1/status/CIB (from [130.180.66.250]:35888, user: dashing)
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35877 (no client certificate)
[2017-01-19 16:03:07 +0100] warning/Process: PID 24036 was terminated by signal 9 (Killed)
[2017-01-19 16:03:07 +0100] warning/Process: Killing process group 24037 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-status' '--name' '2' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.5') after timeout of 60 seconds
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
root@mon3:/var/log/icinga2# grep '16:03' icinga2.log -A 50
[2017-01-19 16:03:02 +0100] warning/Process: Killing process group 24042 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-usage' '--name' '3' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.4') after timeout of 60 seconds
[2017-01-19 16:03:02 +0100] warning/Process: Killing process group 24035 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-discards' '--name' '3' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.4') after timeout of 60 seconds
[2017-01-19 16:03:06 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
[2017-01-19 16:03:06 +0100] information/HttpServerConnection: Request: GET /v1/status/IcingaApplication (from [130.180.66.250]:35834, user: dashing)
[2017-01-19 16:03:06 +0100] warning/Process: PID 24043 was terminated by signal 9 (Killed)
[2017-01-19 16:03:06 +0100] warning/Process: Killing process group 24053 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-errors' '--name' '2' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.4') after timeout of 60 seconds
[2017-01-19 16:03:06 +0100] information/HttpServerConnection: Request: GET /v1/status/IcingaApplication (from [130.180.66.250]:35836, user: dashing)
[2017-01-19 16:03:07 +0100] warning/Process: PID 24053 was terminated by signal 9 (Killed)
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35866 (no client certificate)
[2017-01-19 16:03:07 +0100] information/IdoMysqlConnection: Query queue items: 5, query rate: 4/s (240/min 1693/5min 5669/15min);
[2017-01-19 16:03:07 +0100] information/HttpServerConnection: Request: GET /v1/status/CIB (from [130.180.66.250]:35866, user: dashing)
[2017-01-19 16:03:07 +0100] warning/ApiListener: Removing API client for endpoint 'henrietta.4lin.net'. 0 API clients left.
[2017-01-19 16:03:07 +0100] information/JsonRpcConnection: No messages for identity 'jhell.org' have been received in the last 60 seconds.
[2017-01-19 16:03:07 +0100] warning/JsonRpcConnection: API client disconnected for identity 'jhell.org'
[2017-01-19 16:03:07 +0100] warning/JsonRpcConnection: API client disconnected for identity 'jhell.org'
[2017-01-19 16:03:07 +0100] warning/ApiListener: Removing API client for endpoint 'jhell.org'. 0 API clients left.
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35870 (no client certificate)
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35888 (no client certificate)
[2017-01-19 16:03:07 +0100] information/HttpServerConnection: Request: GET /v1/status/CIB (from [130.180.66.250]:35870, user: dashing)
[2017-01-19 16:03:07 +0100] information/HttpServerConnection: Request: GET /v1/status/CIB (from [130.180.66.250]:35888, user: dashing)
[2017-01-19 16:03:07 +0100] information/ApiListener: New client connection from [130.180.66.250]:35877 (no client certificate)
[2017-01-19 16:03:07 +0100] warning/Process: PID 24036 was terminated by signal 9 (Killed)
[2017-01-19 16:03:07 +0100] warning/Process: Killing process group 24037 ('/usr/lib/nagios/plugins/check_nwc_health' '--authpassword' '' '--authprotocol' 'sha' '--mode' 'interface-status' '--name' '2' '--privpassword' 'secret' '--privprotocol' 'aes' '--protocol' '3' '--username' 'icinga' '-H' '1.2.3.5') after timeout of 60 seconds
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
(0) libbase.so.2.6.0: <unknown function> (+0xdb0af) [0x2b24490ef0af]
(1) libbase.so.2.6.0: <unknown function> (+0xdb147) [0x2b24490ef147]
(2) libbase.so.2.6.0: icinga::SocketEventEngineEpoll::ChangeEvents(icinga::SocketEvents*, int) (+0x164) [0x2b2449073164]
(3) libbase.so.2.6.0: icinga::TlsStream::Write(void const*, unsigned long) (+0x65) [0x2b2449072965]
(4) libremote.so.2.6.0: icinga::HttpResponse::SetStatus(int, icinga::String const&) (+0x223) [0x2b24499ab153]
(5) libremote.so.2.6.0: icinga::StatusHandler::HandleRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::Dictionary> const&) (+0x59a) [0x2b24499cacca]
(6) libremote.so.2.6.0: icinga::HttpHandler::ProcessRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&) (+0x5c2) [0x2b24499abcf2]
(7) libremote.so.2.6.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0x4f6) [0x2b24499aebd6]
(8) libbase.so.2.6.0: icinga::WorkQueue::WorkerThreadProc() (+0x4eb) [0x2b244909179b]
(9) libboost_thread.so.1.55.0: <unknown function> (+0xdaea) [0x2b2448459aea]
(10) libpthread.so.0: <unknown function> (+0x80a4) [0x2b2448dff0a4]
(11) libc.so.6: clone (+0x6d) [0x2b244b86962d]
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
(0) libbase.so.2.6.0: <unknown function> (+0xdb0af) [0x2b24490ef0af]
(1) libbase.so.2.6.0: <unknown function> (+0xdb147) [0x2b24490ef147]
(2) libbase.so.2.6.0: icinga::SocketEventEngineEpoll::ChangeEvents(icinga::SocketEvents*, int) (+0x164) [0x2b2449073164]
(3) libbase.so.2.6.0: icinga::TlsStream::Write(void const*, unsigned long) (+0x65) [0x2b2449072965]
(4) libremote.so.2.6.0: icinga::HttpResponse::SetStatus(int, icinga::String const&) (+0x223) [0x2b24499ab153]
(5) libremote.so.2.6.0: icinga::StatusHandler::HandleRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::Dictionary> const&) (+0x59a) [0x2b24499cacca]
(6) libremote.so.2.6.0: icinga::HttpHandler::ProcessRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&) (+0x5c2) [0x2b24499abcf2]
(7) libremote.so.2.6.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0x4f6) [0x2b24499aebd6]
(8) libbase.so.2.6.0: icinga::WorkQueue::WorkerThreadProc() (+0x4eb) [0x2b244909179b]
(9) libboost_thread.so.1.55.0: <unknown function> (+0xdaea) [0x2b2448459aea]
(10) libpthread.so.0: <unknown function> (+0x80a4) [0x2b2448dff0a4]
(11) libc.so.6: clone (+0x6d) [0x2b244b86962d]
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
(0) libbase.so.2.6.0: <unknown function> (+0xdb0af) [0x2b24490ef0af]
(1) libbase.so.2.6.0: <unknown function> (+0xdb147) [0x2b24490ef147]
(2) libbase.so.2.6.0: icinga::SocketEventEngineEpoll::ChangeEvents(icinga::SocketEvents*, int) (+0x164) [0x2b2449073164]
(3) libbase.so.2.6.0: icinga::TlsStream::Write(void const*, unsigned long) (+0x65) [0x2b2449072965]
(4) libremote.so.2.6.0: icinga::HttpResponse::SetStatus(int, icinga::String const&) (+0x223) [0x2b24499ab153]
(5) libremote.so.2.6.0: icinga::StatusHandler::HandleRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::Dictionary> const&) (+0x59a) [0x2b24499cacca]
(6) libremote.so.2.6.0: icinga::HttpHandler::ProcessRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&) (+0x5c2) [0x2b24499abcf2]
(7) libremote.so.2.6.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0x4f6) [0x2b24499aebd6]
(8) libbase.so.2.6.0: icinga::WorkQueue::WorkerThreadProc() (+0x4eb) [0x2b244909179b]
(9) libboost_thread.so.1.55.0: <unknown function> (+0xdaea) [0x2b2448459aea]
(10) libpthread.so.0: <unknown function> (+0x80a4) [0x2b2448dff0a4]
(11) libc.so.6: clone (+0x6d) [0x2b244b86962d]
[2017-01-19 16:03:07 +0100] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
(0) libbase.so.2.6.0: <unknown function> (+0xdb0af) [0x2b24490ef0af]
(1) libbase.so.2.6.0: <unknown function> (+0xdb147) [0x2b24490ef147]
(2) libbase.so.2.6.0: icinga::SocketEventEngineEpoll::ChangeEvents(icinga::SocketEvents*, int) (+0x164) [0x2b2449073164]
(3) libbase.so.2.6.0: icinga::TlsStream::Write(void const*, unsigned long) (+0x65) [0x2b2449072965]
(4) libremote.so.2.6.0: icinga::HttpResponse::SetStatus(int, icinga::String const&) (+0x223) [0x2b24499ab153]
(5) libremote.so.2.6.0: icinga::StatusHandler::HandleRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::Dictionary> const&) (+0x59a) [0x2b24499cacca]
(6) libremote.so.2.6.0: icinga::HttpHandler::ProcessRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&) (+0x5c2) [0x2b24499abcf2]
(7) libremote.so.2.6.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0x4f6) [0x2b24499aebd6]
(8) libbase.so.2.6.0: icinga::WorkQueue::WorkerThreadProc() (+0x4eb) [0x2b244909179b]
(9) libboost_thread.so.1.55.0: <unknown function> (+0xdaea) [0x2b2448459aea]
(10) libpthread.so.0: <unknown function> (+0x80a4) [0x2b2448dff0a4]
(11) libc.so.6: clone (+0x6d) [0x2b244b86962d]
Could be related: #5148
Guessing the <unknown function>s are locks;
Could you try #5416 it fixed my problem with HTTP requests to the API
In #5419 are even more lock fixes for the API
Hi,
I've got same error in logs on ubuntu 14.04 and icinga 2.8.0-187
I'm able to reproduce it by setting short downtimes(about 2 sec) in loop.
icinga error log is empty.
Configuration is 2 masters(in zone) and 4 satellites (2 per zone)
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.0-187-g025abc3)
Copyright (c) 2012-2017 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Application information:
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
System information:
Platform: Ubuntu
Platform version: 14.04.1 LTS, Trusty Tahr
Kernel: Linux
Kernel version: 3.13.0-45-generic
Architecture: x86_64
Build information:
Compiler: GNU 4.8.2
Build host: 024380082aea
[2018-01-17 16:40:26 +0000] warning/TlsStream: TLS stream was disconnected.
[2018-01-17 16:40:26 +0000] warning/TlsStream: TLS stream was disconnected.
[2018-01-17 16:40:26 +0000] information/ConfigItem: Committing config item(s).
[2018-01-17 16:40:26 +0000] warning/ApplyRule: Apply rule 'icinga-cluster' (in /etc/icinga2/conf.d/services-basic.conf: 1:0-1:29) for type 'Service' does not match anywhere!
[2018-01-17 16:40:26 +0000] warning/ApplyRule: Apply rule 'icinga-zone-caas' (in /etc/icinga2/conf.d/services-basic.conf: 8:1-8:32) for type 'Service' does not match anywhere!
[2018-01-17 16:40:26 +0000] information/ConfigItem: Instantiated 1 Downtime.
[2018-01-17 16:40:26 +0000] information/ConfigItem: Triggering Start signal for config items
[2018-01-17 16:40:26 +0000] information/ConfigItem: Activated all objects.
[2018-01-17 16:40:26 +0000] critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket.
--
聽
(0) libbase.so.2.8.0: <unknown function> (+0xeae9f) [0x2b1611f5de9f]
(1) libbase.so.2.8.0: <unknown function> (+0xeaf29) [0x2b1611f5df29]
(2) libbase.so.2.8.0: icinga::SocketEventEngineEpoll::ChangeEvents(icinga::SocketEvents*, int) (+0x147) [0x2b1611ee3267]
(3) libbase.so.2.8.0: icinga::TlsStream::Write(void const*, unsigned long) (+0x55) [0x2b1611eddf35]
(4) libremote.so.2.8.0: icinga::HttpResponse::SetStatus(int, icinga::String const&) (+0x1e5) [0x2b1614942aa5]
(5) libremote.so.2.8.0: icinga::ActionsHandler::HandleRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpRespo
(6) libremote.so.2.8.0: icinga::HttpHandler::ProcessRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpRespons
(7) libremote.so.2.8.0: icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&) (+0x747) [0x2b1614945967]
(8) libbase.so.2.8.0: icinga::WorkQueue::WorkerThreadProc() (+0x4ed) [0x2b1611ef767d]
(9) libboost_thread.so.1.54.0: <unknown function> (+0xba4a) [0x2b1613265a4a]
(10) libpthread.so.0: <unknown function> (+0x8182) [0x2b1611c5d182]
(11) libc.so.6: clone (+0x6d) [0x2b1612d8b00d]
@n0npax Could you provide a minimal configuration example/script for those downtimes?
@Crunsher , of course! Here additionals info and script:
Anyway, I just realized that haproxy may mess with traffic. Even if that's true, there's still:
critical/HttpServerConnection: Unhandled exception while processing Http request: Error: Tried to read/write from a closed socket
I'm not sure if You're still interested in following configuration.
Also sorry for no information about haproxy in previous comment.
please run 'pip install python-icinga2api' before running script and fill fields with _TODO_
Generating issue may take up to few hours, so just let this work during night or whatever.
Script:
#!/usr/bin/env python
import time
from icinga2api.client import Client
server_name = 'localhost' # haproxy to icinga-masters (roundrobin)
api_user = 'user' # TODO FILL
api_password = 'pass' # TODO FILL
api_port = 5665
mon_host = "https://{}:{}".format(server_name, api_port)
host = 'my_hostname.domain.net' # TODO FILL -> hostname which will be used for service downtime
service = 'service-name' # TODO FILL -> service name
client = Client(mon_host, api_user, api_password,
ca_certificate='/etc/icinga2/ca.crt')
def schedule_service_downtime(hostname, service_name, comment, duration,
start_time=None, filter=None, filter_vars=None,
fixed=None, trigger_name=None):
if not filter:
filter = 'match("{}", host.name) && match("{}", service.name)'.format(
hostname, service_name)
return schedule_downtime('Service', filter, comment, duration, start_time,
filter_vars, fixed, trigger_name)
def schedule_downtime(object_type, filter, comment, duration, start_time=None,
filter_vars=None, fixed=None, trigger_name=None):
if not start_time:
start_time = time.time()
end_time = start_time + duration
user = api_user
comment = "{}:{}".format(user, comment)
results = client.actions.schedule_downtime(
object_type, filter, user,
comment, start_time, end_time, duration, filter_vars=filter_vars,
fixed=fixed, trigger_name=trigger_name)
for result in results['results']:
if "Successfully scheduled downtime" not in result['status']:
raise Exception
return results
while True:
try:
schedule_service_downtime(host, service, "lets trigger the issue", 1)
except Exception as e:
print 'one of the issues generated'
print e
finally:
time.sleep(2)
So, I've got 4 satellites and 2 masters with classic configuration:
zone.conf
# master
object Endpoint "icinga-master-dev72.my_domain" {
host = "icinga-master-dev72.my_domain"
port = "5665"
}
object Endpoint "icinga-master-dev73.my_domain" {
host = "icinga-master-dev73.my_domain"
port = "5665"
}
object Zone "master" {
endpoints = ["icinga-master-dev72.my_domain", "icinga-master-dev73.my_domain"]
}
# glob
object Zone "global-templates" {
global = true
}
# sat 1
object Zone "sat1" {
endpoints = ["icinga-satellite-dev86.my_domain", "icinga-satellite-dev87.my_domain"]
parent = "master"
}
object Endpoint "icinga-satellite-dev86.my_domain" {
host = "icinga-satellite-dev86.my_domain"
port = "5665"
}
object Endpoint "icinga-satellite-dev87.my_domain" {
host = "icinga-satellite-dev87.my_domain"
port = "5665"
}
# sat 2
object Zone "sat2" {
endpoints = ["icinga-satellite-dev88.my_domain", "icinga-satellite-dev89.my_domain"]
parent = "master"
}
object Endpoint "icinga-satellite-dev88.my_domain" {
host = "icinga-satellite-dev88.my_domain"
port = "5665"
}
object Endpoint "icinga-satellite-dev89.my_domain" {
host = "icinga-satellite-dev89.my_domain"
port = "5665"
}
icinga-masters are available via haproxy in round-robin mode.
haproxy is used just on other host (without icinga)
Part of haproxy.conf:
global
chroot /var/lib/haproxy
daemon
group haproxy
log /dev/log local0
maxconn 8000
pidfile /var/run/haproxy.pid
stats socket /var/lib/haproxy/stats
user haproxy
defaults
log global
maxconn 8000
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
#### other services...
listen monitoring
bind 127.0.0.1:5665
balance roundrobin
log-format [%t]\ %f\ SRC:%ci:%cp\ ->\ (FE:%fi:%fp)\ ->\ (BE:%bi:%bp)\ ->\ DST:%si:%sp
option tcp-check
option logasap
timeout client 60
timeout server 60
server 149.202.176.75 149.202.176.75:5665
server 149.202.176.93 149.202.176.93:5665
Likely related to #6361.
A similar behaviour has been analysed and fixed with 2.9. Please test the new release.
Most helpful comment
We'll deal with it right after releasing v2.6.1 which addresses some major issues introduced with 2.6.0.