Telegraf: MySQL - commands out of sync?

Created on 26 Jan 2020 · 5 Comments · Source: influxdata/telegraf

Relevant telegraf.conf:

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## For failed writes, telegraf will cache metric_buffer_limit metrics for each
  ## output, and will flush this buffer on a successful write. Oldest metrics
  ## are dropped first when this buffer fills.
  ## This buffer only fills when writes fail to output plugin(s).
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Logging configuration:
  ## Run telegraf with debug log messages.
  debug = false
  ## Run telegraf in quiet mode (error log messages only).
  quiet = false
  ## Specify the log file name. The empty string means to log to stderr.
  logfile = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ## urls exp: http://127.0.0.1:9999
  urls = ["http://172.16.0.2:9999"]

  ## Token for authentication.
  token = "$INFLUX_TOKEN"

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "<REDACT>"

  ## Destination bucket to write into.
  bucket = "<REDACT>"
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics.
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states.
  report_active = false
[[inputs.disk]]
  ## By default stats will be gathered for all mount points.
  ## Setting mount_points will restrict the stats to only the specified mount points.
  # mount_points = ["/"]
  ## Ignore mount points by filesystem type.
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.mem]]
[[inputs.net]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]
[[inputs.mysql]]
  servers = ["telegraf@tcp(localhost:3306)/"]
  metric_version = 2
  gather_table_schema = true
  gather_process_list = true
  gather_user_statistics = true
  gather_info_schema_auto_inc = true
  gather_innodb_metrics = true
  gather_slave_status = true
  gather_binary_logs = false
  gather_table_io_waits = true
  gather_table_lock_waits = true
  gather_index_io_waits = true
  gather_event_waits = true
  gather_file_events_stats = true
  gather_perf_events_statements = true
  interval_slow = "30m"

System info:

CentOS Linux release 7.7.1908 (Core)
Telegraf 1.13.2 (git: HEAD 6dad859d)
mysql Ver 15.1 Distrib 10.4.11-MariaDB, for Linux (x86_64) using readline 5.1

Steps to reproduce:

yum install https://dl.influxdata.com/telegraf/releases/telegraf-1.13.2-1.x86_64.rpm
Install config from InfluxDB 2.0
Use an existing MySQL user, or create a new one with PROCESS, SUPER, REPLICATION CLIENT, and SELECT on the mysql database
systemctl enable --now telegraf
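
The user-creation step above can be sketched as follows. This is a minimal sketch, not the exact commands used in the report: the helper name `telegraf_grants` is hypothetical, and the account name `telegraf`, the host `localhost`, and the absence of a password are assumptions taken from the DSN `telegraf@tcp(localhost:3306)/` in the config. The function only prints the statements so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: prints SQL that grants the privileges listed in the
# reproduction steps. User and host defaults are assumptions based on the DSN
# in the config above; adjust them to match your server.
telegraf_grants() {
  user="${1:-telegraf}"
  host="${2:-localhost}"
  cat <<SQL
CREATE USER IF NOT EXISTS '${user}'@'${host}';
GRANT PROCESS, SUPER, REPLICATION CLIENT ON *.* TO '${user}'@'${host}';
GRANT SELECT ON mysql.* TO '${user}'@'${host}';
SQL
}

# Print the statements for review.
telegraf_grants

# To apply them, pipe the output into the client as an administrative user:
#   telegraf_grants | mysql -u root -p
```

If the server sees the connection as coming from `127.0.0.1` rather than `localhost`, pass the host explicitly, for example `telegraf_grants telegraf 127.0.0.1`.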

Expected behavior:

Metrics in InfluxDB

Actual behavior:

telegraf[13621]: 2020-01-26T18:49:40Z E! [inputs.mysql] Error in plugin: commands out of sync. Did you run multiple statements at once?

Additional info:

Labels: area/mysql, bug, ready


MarcHagen

All 5 comments

I've never seen this one, will need to look into it.

danielnelson on 28 Jan 2020

I had to turn on mysql-error.log to debug another service, and I saw these lines:

2020-04-15 18:33:30 95032 [Warning] Aborted connection 95032 to db: 'unconnected' user: 'unauthenticated' host: '127.0.0.1' (This connection closed normally without authentication)
2020-04-15 18:33:40 95039 [Warning] Aborted connection 95039 to db: 'unconnected' user: 'unauthenticated' host: '127.0.0.1' (This connection closed normally without authentication)
2020-04-15 18:33:50 95041 [Warning] Aborted connection 95041 to db: 'unconnected' user: 'unauthenticated' host: '127.0.0.1' (This connection closed normally without authentication)

When I removed inputs.mysql from telegraf.conf, the warnings stopped (Telegraf is also the only service logging in from localhost...).
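
The warnings above are 10 seconds apart, which matches `interval = "10s"` in the agent config. A minimal sketch for pulling those timestamps out of an error log to check the cadence is shown here. The function name `aborted_times` is hypothetical, and the sample input is the three log lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: prints the time of day (second field) of every
# unauthenticated "Aborted connection" warning. Reads the files given as
# arguments, or standard input when called without arguments.
aborted_times() {
  awk '/\[Warning\] Aborted connection/ && /unauthenticated/ { print $2 }' "$@"
}

# Demo on the three lines quoted in this comment.
aborted_times <<'LOG'
2020-04-15 18:33:30 95032 [Warning] Aborted connection 95032 to db: 'unconnected' user: 'unauthenticated' host: '127.0.0.1' (This connection closed normally without authentication)
2020-04-15 18:33:40 95039 [Warning] Aborted connection 95039 to db: 'unconnected' user: 'unauthenticated' host: '127.0.0.1' (This connection closed normally without authentication)
2020-04-15 18:33:50 95041 [Warning] Aborted connection 95041 to db: 'unconnected' user: 'unauthenticated' host: '127.0.0.1' (This connection closed normally without authentication)
LOG
# prints:
# 18:33:30
# 18:33:40
# 18:33:50
```

Against a real log, pass the file path as an argument; the location of mysql-error.log depends on the server configuration.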

CentOS Linux release 7.7.1908 (Core)
Telegraf 1.13.2 (git: HEAD 6dad859d)
mariadb Ver 15.1 Distrib 10.4.12-MariaDB, for Linux (x86_64) using readline 5.1

MarcHagen on 15 Apr 2020


Looks like this might be your issue: https://github.com/go-sql-driver/mysql/issues/1038

ssoroka on 17 Jun 2020


So I have tested it, and it works. But I'm not quite sure whether it was the command (because I didn't test before running it ... doh) or maybe a MariaDB update.

Currently on:
CentOS Linux release 7.8.2003 (Core)
Telegraf 1.13.2 (git: HEAD 6dad859)
mariadb Ver 15.1 Distrib 10.4.13-MariaDB, for Linux (x86_64) using readline 5.1

I also saw that I hadn't added a SELECT permission on performance_schema, so that is also something to keep in mind.
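
For reference, a minimal sketch of adding that permission. The helper name `perf_schema_grant` is hypothetical, and the account `'telegraf'@'localhost'` is an assumption based on the DSN in the config. Several of the enabled options (for example gather_table_io_waits and gather_perf_events_statements) appear to read from performance_schema tables, which is why the grant is likely needed.

```shell
#!/bin/sh
# Hypothetical helper: prints SQL that grants read access to performance_schema
# and then lists the account's grants for verification.
# User and host defaults are assumptions; adjust them to match your server.
perf_schema_grant() {
  user="${1:-telegraf}"
  host="${2:-localhost}"
  printf "GRANT SELECT ON performance_schema.* TO '%s'@'%s';\n" "$user" "$host"
  printf "SHOW GRANTS FOR '%s'@'%s';\n" "$user" "$host"
}

# Print the statements for review.
perf_schema_grant

# To apply them, pipe the output into the client as an administrative user:
#   perf_schema_grant | mysql -u root -p
```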

A lot changed in the meantime; the VM also got moved to another cluster, though that shouldn't affect anything.

But I'm glad it works now :)

MarcHagen on 18 Jun 2020

Great, let us know if you have any more issues. I'm going to close this out for now.

ssoroka on 18 Jun 2020
