Icinga2: very large plugin output causes services to look overdue & time out database connection

Created on 16 Aug 2018 · 5 Comments · Source: Icinga/icinga2

Expected Behavior


The output of a check plugin should not affect the health of Icinga2's database connection; at present, such a problem makes all services look overdue in Icingaweb2. Icinga2 should make sure that the SQL queries it submits are compatible with the database, respecting the maximum length of table columns.

Current Behavior


As soon as Icinga2 starts handling a very large plugin output for one service, the list of overdue check results in Icingaweb2 grows until all active services appear overdue. The debug log shows database exceptions such as the following:

[2018-08-16 10:41:46 +0200] critical/IdoMysqlConnection: Error "Query was empty" when executing query ""
[2018-08-16 10:41:46 +0200] critical/IdoMysqlConnection: Exception during database operation: Verify that your database is operational!



[2018-08-16 10:41:46 +0200] debug/IdoMysqlConnection: Exception during database operation: Error: std::exception



[2018-08-16 10:41:46 +0200] critical/IdoMysqlConnection: Error "MySQL server has gone away" when executing query "DELETE FROM icinga_comments WHERE instance_id = 1 AND session_token <> 1534408869"



icinga2 - The Icinga 2 network monitoring daemon (version: r2.9.1-1)

Copyright (c) 2012-2018 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Application information:
  Installation root: /usr
  Sysconf directory: /etc
  Run directory: /run
  Local state directory: /var
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid

System information:
  Platform: Red Hat Enterprise Linux Server
  Platform version: 7.5 (Maipo)
  Kernel: Linux
  Kernel version: 3.10.0-862.9.1.el7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 4.8.5
  Build host: unknown



Enabled features: api checker command debuglog graphite ido-mysql mainlog notification



[2018-08-16 14:38:08 +0200] information/cli: Icinga application loader (version: r2.9.1-1)
[2018-08-16 14:38:08 +0200] information/cli: Loading configuration file(s).
[2018-08-16 14:38:08 +0200] information/ConfigItem: Committing config item(s).
[2018-08-16 14:38:09 +0200] information/ApiListener: My API identity: fravm007010.os-fra.local
[2018-08-16 14:38:18 +0200] information/WorkQueue: #4 (DaemonUtility::LoadConfigFiles) items: 0, rate: 3.25/s (195/min 195/5min 195/15min);
[2018-08-16 14:38:19 +0200] information/WorkQueue: #5 (IdoMysqlConnection, ido-mysql) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-08-16 14:38:19 +0200] information/WorkQueue: #6 (ApiListener, RelayQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-08-16 14:38:19 +0200] information/WorkQueue: #7 (ApiListener, SyncQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-08-16 14:38:19 +0200] information/WorkQueue: #8 (GraphiteWriter, graphite) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 8809 Services.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 37 ServiceGroups.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 70 HostGroups.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 EventCommand.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 2 FileLoggers.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 9 NotificationCommands.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 9421 Notifications.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 612 Hosts.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 407 Downtimes.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 140 Comments.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 499 Zones.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 501 Endpoints.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 7 ApiUsers.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 UserGroup.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 298 CheckCommands.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 19 TimePeriods.
[2018-08-16 14:38:25 +0200] information/ConfigItem: Instantiated 1 User.
[2018-08-16 14:38:26 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2018-08-16 14:38:26 +0200] information/cli: Finished validating the configuration file(s).
area/db-ido bug

All 5 comments

I'm fine with cutting off the plugin output, but cutting off the perfdata will cause problems for other tools that try to read it. Maybe an all-or-nothing solution is better in this case: if the perfdata is too long, we log a warning and write nothing to the DB.

Sounds good.
I would consider applying this to both columns, long_output and perfdata. I had ~936553 characters to be saved into long_output and ~14629 characters to be saved into perfdata. In this case, long_output is the one that did not fit.
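The policy discussed above (cut off an oversized long_output, but treat perfdata as all or nothing) could be sketched roughly as follows. This is only an illustration in Python, not Icinga2's actual code: the function names and both byte limits are assumptions made for the example, and the real limits depend on the column definitions and server settings of the database in use.

```python
import logging
from typing import Optional, Tuple

log = logging.getLogger("ido-sketch")

# Hypothetical limits, chosen for illustration only.
# Check the actual column definitions in your own schema.
MAX_LONG_OUTPUT_BYTES = 65535
MAX_PERFDATA_BYTES = 65535


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing, incomplete multi-byte sequence.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def prepare_for_db(long_output: str, perfdata: str) -> Tuple[str, Optional[str]]:
    """Return (long_output, perfdata) values that fit the assumed limits.

    long_output is truncated; perfdata is either kept whole or dropped (None),
    because a truncated perfdata string could mislead tools that parse it.
    """
    safe_output = truncate_utf8(long_output, MAX_LONG_OUTPUT_BYTES)
    if len(safe_output) != len(long_output):
        log.warning("long_output truncated from %d to %d characters",
                    len(long_output), len(safe_output))

    if len(perfdata.encode("utf-8")) > MAX_PERFDATA_BYTES:
        log.warning("perfdata too long (%d characters), not written",
                    len(perfdata))
        return safe_output, None

    return safe_output, perfdata
```

With the sizes reported above, and under these assumed limits, `prepare_for_db` would shorten the ~936553-character long_output and pass the ~14629-character perfdata through unchanged.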

I would fix the plugin in the first place. 900K characters is nothing you would want to render in Icinga Web 2 or read in a notification email in an emergency. I suppose it's a custom-written plugin, which can be modified. If you need inspiration for well-formatted plugin output, here's a talk from a past Icinga Camp: https://www.youtube.com/watch?v=Ey_APqSCoFQ
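On the plugin side, one way to keep the output bounded is to cap the number and length of detail lines before printing. The following Python sketch assumes a custom plugin that collects its detail lines in a list; the function name and the default limits are hypothetical and only meant to illustrate the idea.

```python
from typing import List


def format_plugin_output(summary: str,
                         details: List[str],
                         max_detail_lines: int = 20,
                         max_line_chars: int = 200) -> str:
    """Build plugin output with a one-line summary and a bounded detail section."""
    # Keep only the first max_detail_lines lines and cap each line's length.
    shown = [line[:max_line_chars] for line in details[:max_detail_lines]]
    omitted = len(details) - len(shown)
    if omitted > 0:
        # Tell the reader that something was left out instead of cutting silently.
        shown.append(f"... {omitted} more lines omitted")
    return "\n".join([summary] + shown)


if __name__ == "__main__":
    problems = [f"disk {i} has errors" for i in range(1000)]
    print(format_plugin_output("CRITICAL - 1000 disks with errors", problems))
```

The summary line stays intact, so the most important information is still what shows up first in the web interface and in notifications.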

Thanks for linking this valuable video!
Sure, already did so. My point is to harden Icinga2 against such an impact. It took me a while to identify the connection between a crappy plugin output and seemingly system-wide overdue checks in Icingaweb2.

This is taken into account in the new IcingaDB backend. Unfortunately, there won't be any effort to fix this for the IDO schema.
