InfluxDB: InfluxDB 1.5.0 not listening on HTTP after starting the process

Created on 12 Mar 2018 · 25 comments · Source: influxdata/influxdb

Bug report

__System info:__ InfluxDB 1.5.0, Ubuntu 16.04.4 (tested with the same data in ArchLinux also)

__Steps to reproduce:__

  1. Start InfluxDB service
  2. Try to connect to the HTTP service or check the bound port (8086); a quick check is sketched below
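
A quick sanity check for step 2, assuming the default port and a local instance (both are assumptions; adjust if your config differs):

    # Check whether anything is listening on the default InfluxDB HTTP port
    ss -tlnp | grep 8086
    # Or probe the health endpoint directly; a healthy 1.x instance answers with HTTP 204
    curl -i http://localhost:8086/ping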

__Expected behavior:__ 8086 port listening

__Actual behavior:__ No open port, no http service

__Additional info:__ systemd log:

Mar 12 09:10:03 userver systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Mar 12 09:10:03 userver influxd[3483]: ts=2018-03-12T12:10:03.657940Z lvl=info msg="InfluxDB starting" log_id=06nXj02G000 version=1.5.0 branch=1.5 commit=6ac835404e7e64ea7299a6eebcce1ab1ef15fe3c
Mar 12 09:10:03 userver influxd[3483]: ts=2018-03-12T12:10:03.657982Z lvl=info msg="Go runtime" log_id=06nXj02G000 version=go1.9.2 maxprocs=4
Mar 12 09:10:03 userver influxd[3483]: ts=2018-03-12T12:10:03.759523Z lvl=info msg="Using data dir" log_id=06nXj02G000 service=store path=/var/lib/influxdb/data
Mar 12 09:10:03 userver influxd[3483]: ts=2018-03-12T12:10:03.759568Z lvl=info msg="Open store (start)" log_id=06nXj02G000 service=store trace_id=06nXj0Rl000 op_name=tsdb_open op_event=start

The process is running, but no connections are accepted. The configuration file is the default one.

It was running correctly (for around 2 months) until yesterday.

Only by deleting the "data" folder can I get it to start running correctly again.

1.x kind/bug wontfix

All 25 comments

How much data do you have? It might be spending a long time opening the data store.

Was this an upgrade from 1.4 to 1.5?

@e-dard could this be related to any changes to the index or to the index migration?

@nettosama is this reproducible? If so, could you provide steps?

Is the log you provided the entire log?

That's the entire log I get out of influxd. There's no log in /var/log/influxdb/.

The only way to get it to start again is by deleting the /var/lib/influxdb/data/home_assistant/ folder.

This is the log when it runs after deleting that folder:

Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.038145Z lvl=info msg="InfluxDB starting" log_id=06naMOcW000 version=1.5.0 branch=1.5 commit=6ac835404e7e64ea7299a6eebcce1ab1ef15fe3c
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.038175Z lvl=info msg="Go runtime" log_id=06naMOcW000 version=go1.9.2 maxprocs=4
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138673Z lvl=info msg="Using data dir" log_id=06naMOcW000 service=store path=/var/lib/influxdb/data
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138772Z lvl=info msg="Open store (start)" log_id=06naMOcW000 service=store trace_id=06naMP0W001 op_name=tsdb_open op_event=start
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138809Z lvl=info msg="Open store (end)" log_id=06naMOcW000 service=store trace_id=06naMP0W001 op_name=tsdb_open op_event=end op_elapsed=0.039ms
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138832Z lvl=info msg="Opened service" log_id=06naMOcW000 service=subscriber
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138838Z lvl=info msg="Starting monitor service" log_id=06naMOcW000 service=monitor
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138842Z lvl=info msg="Registered diagnostics client" log_id=06naMOcW000 service=monitor name=build
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138846Z lvl=info msg="Registered diagnostics client" log_id=06naMOcW000 service=monitor name=runtime
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138850Z lvl=info msg="Registered diagnostics client" log_id=06naMOcW000 service=monitor name=network
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138860Z lvl=info msg="Registered diagnostics client" log_id=06naMOcW000 service=monitor name=system
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138889Z lvl=info msg="Starting precreation service" log_id=06naMOcW000 service=shard-precreation check_interval=10m advance_period=30m
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138897Z lvl=info msg="Starting snapshot service" log_id=06naMOcW000 service=snapshot
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138901Z lvl=info msg="Starting continuous query service" log_id=06naMOcW000 service=continuous_querier
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.138905Z lvl=info msg="Starting HTTP service" log_id=06naMOcW000 service=httpd authentication=false
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.139090Z lvl=info msg="Listening on HTTP" log_id=06naMOcW000 service=httpd addr=[::]:8086 https=false
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.139126Z lvl=info msg="Starting retention policy enforcement service" log_id=06naMOcW000 service=retention check_interval=30m
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.139138Z lvl=info msg="Storing statistics" log_id=06naMOcW000 service=monitor db_instance=_internal db_rp=monitor interval=10s
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.139767Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=06naMOcW000
Mar 12 10:13:31 userver influxd[10973]: ts=2018-03-12T13:13:31.140205Z lvl=info msg="Listening for signals" log_id=06naMOcW000

It gets to "Open store (end)".

I don't really know if upgrading from an earlier version was what caused this, but this is the latest upgrade:

Upgrade: python-twisted-core:amd64 (16.0.0-1, 16.0.0-1ubuntu0.2), friendly-recovery:amd64 (0.2.31, 0.2.31ubuntu1), python-twisted-bin:amd64 (16.0.0-1, 16.0.0-1ubuntu0.2), gitlab-ce:amd64 (10.5.2-ce.0, 10.5.3-ce.0), nodejs:amd64 (8.9.4-1nodesource1, 8.10.0-1nodesource1), libpq5:amd64 (9.5.11-0ubuntu0.16.04, 9.5.12-0ubuntu0.16.04), libpq-dev:amd64 (9.5.11-0ubuntu0.16.04, 9.5.12-0ubuntu0.16.04), chronograf:amd64 (1.4.1.3-1, 1.4.2.2-1), grafana:amd64 (5.0.0, 5.0.1), influxdb:amd64 (1.4.2-1, 1.5.0-1)

Update: I reverted to the influxdb_1.4.2_amd64.deb package, and everything works correctly:

Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Starting snapshot service service=snapshot
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Starting continuous query service service=continuous_querier
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Starting HTTP service service=httpd
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Authentication enabled:false service=httpd
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Listening on HTTP:[::]:8086 service=httpd
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Starting retention policy enforcement service with check interval of 30m0s service=retention
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Listening for signals
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Storing statistics in database '_internal' retention policy 'monitor', at interval 10s service=monitor
Mar 12 15:05:26 userver influxd[32297]: [I] 2018-03-12T18:05:26Z Sending usage statistics to usage.influxdata.com
Mar 12 15:07:19 userver influxd[32297]: [I] 2018-03-12T18:07:19Z SELECT "key" FROM home_assistant.autogen._series LIMIT 1 service=query

I will keep this version until a fix for this is released.

@nettosama does this happen every time you try to run 1.5.0 with the /var/lib/influxdb/data/home_assistant/ folder in place?

Yes, even on a second machine (OS: Arch Linux). The biggest problem, I'd say, is that there isn't any log output pointing at possible problems. Maybe it's a migration issue?

For now it's running (Ubuntu 16.04.4, 1.4.2).

I'm getting this on a 1.3.5 -> 1.5.0 upgrade as well. Rolled the package back to 1.4.2 and it started right up, even with no changes to index-version (I switched to tsi1 during the 1.5.0 upgrade).

Edit:
1.4.2 definitely looks to be using the TSI indices; lsof shows influxd holding the index/*.tsl files open.
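
A rough way to confirm which index files the running daemon holds open, assuming influxd runs as a single process under the default paths:

    # List TSI index (.tsi) and TSI log (.tsl) files held open by influxd
    sudo lsof -p "$(pidof influxd)" | grep -E '\.(tsi|tsl)$'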

Could these be related to #9534 / #9612 and the madvise() issue?

Having exactly the same issue after upgrading to 1.5 on Ubuntu 16.04 (AWS) from a previous version. I did a second upgrade to 1.5.1 this morning, but the results are the same. The config file is also the same for all installations. Influx does not bind to the HTTP port, starts to consume 50 GB of memory, and puts a heavy load on the machine.

@csari-chimera are you running inmem or tsi1?

@e-dard it's inmem

Can anyone reproducing this share the database folder with us? We have a private method of sharing that data vs. uploading it here. We would really appreciate a way to reproduce.

@rbetts Hey, we have three TICK instances. One of them died yesterday, another died this morning. I have one more instance that will hit this issue as soon as I restart the InfluxDB service on it. Just to clarify what I have and what the steps were:

  • Machine: 2 vCPU and 8GB of RAM (AWS EC2 instance)

  • OS:
    # uname -a
    Linux ip-10-20-1-36 4.4.0-1052-aws #61-Ubuntu SMP Mon Feb 12 23:05:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Upgrade with apt:
    Start-Date: 2018-03-16 13:10:26
    Upgrade: influxdb:amd64 (1.4.2-1, 1.5.0-1)

  • Issue: after upgrading from 1.4.2-1 to 1.5.0-1 (or to 1.5.1-1), everything seems fine, except that after 3-4 days the machine generates huge I/O, and the only way to resolve it is to restart InfluxDB. But as soon as I restart it, it no longer binds to the HTTP port (I set the debug log level in the conf, but I never managed to get useful logs from Influx; no matter what I do, everything goes to syslog — see the journalctl sketch below). To "repair" the issue I delete the contents of the data folder, restart InfluxDB, and it works again.
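
Since everything ends up in syslog/journald here, one way to pull just the influxd messages (assuming the unit is named influxdb.service, as in the stock packages) is:

    # Follow InfluxDB's log output via journald
    journalctl -u influxdb.service -f
    # Or review recent output, e.g. everything from the last hour
    journalctl -u influxdb.service --since "1 hour ago"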

Config file for InfluxDB is attached (as txt...): influxdb.conf.txt

What other information do you need?

EDIT: fixing typos, grammar, removing shared secret from the conf file.

I ran into this exact same issue, with startup "hanging" after "Open store (start)". The problem ended up being that I naively ran the influx_inspect buildtsi command as root rather than as the influxdb user. This resulted in the new tsi1 index files being owned by root. Although they were readable, they were not writable.

YMMV, but recursively chowning my data directory resolved this for me and allowed InfluxDB 1.5.0 to start.
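
For reference, a sketch of both steps; the paths, unit name, and influxdb user/group are assumptions based on a standard package install:

    # Stop the service before touching the data directory
    sudo systemctl stop influxdb
    # Fix ownership after accidentally running maintenance tools as root
    sudo chown -R influxdb:influxdb /var/lib/influxdb
    # If the TSI index needs rebuilding, run influx_inspect as the influxdb user
    sudo -u influxdb influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal
    sudo systemctl start influxdb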

@spolyack thanks for the info. What did the influxdb log show in that case?

/cc @e-dard

Updated
Same issue here. I updated from 1.1.0 with TSI enabled. I had the "running influx_inspect buildtsi as root" problem, so I changed the owner, which fixed that issue.
The last message was:

Apr 13 07:42:41 ------- influxd[26803]: ts=2018-04-13T07:42:41.651852Z lvl=info msg="Open store (start)" log_id=07RV8sxG000 service=store trace_id=07RV8sxl000 op_name=tsdb_open op_event=start

A "permission denied" message in the log, or some startup permission check, would help.

Now InfluxDB starts, but it's in a restart loop; it keeps repeating this block:

Apr 13 08:17:51 influxhost systemd[1]: influxdb.service: main process exited, code=exited, status=1/FAILURE
Apr 13 08:17:51 influxhost systemd[1]: Unit influxdb.service entered failed state.
Apr 13 08:17:51 influxhost systemd[1]: influxdb.service failed.
Apr 13 08:17:52 influxhost systemd[1]: influxdb.service holdoff time over, scheduling restart.
Apr 13 08:17:52 influxhost systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Apr 13 08:17:52 influxhost systemd[1]: Starting InfluxDB is an open-source, distributed, time series database...
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.143740Z lvl=info msg="InfluxDB starting" log_id=07RX9h3l000 version=1.5.1 branch=1.5 commit=cdae4ccde4c67c3390d8ae8a1a06bd3b4cdce5c5
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.143778Z lvl=info msg="Go runtime" log_id=07RX9h3l000 version=go1.9.2 maxprocs=8
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.145250Z lvl=info msg="Using data dir" log_id=07RX9h3l000 service=store path=/var/lib/influxdb/data
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.145287Z lvl=info msg="Open store (start)" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open op_event=start
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159036Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159079Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1316
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159091Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159107Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1315
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159738Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159757Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1318
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.159745Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1315/000000005-000000002.tsm id=0 duration=0.577ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160320Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1315 duration=0.164ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160349Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1316/000000005-000000002.tsm id=0 duration=1.183ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160369Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160383Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1313
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160424Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1316 duration=0.038ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160484Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160502Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1314
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160882Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1318/000000005-000000002.tsm id=0 duration=0.729ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.160965Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1318 duration=0.034ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.161259Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1314/000000005-000000002.tsm id=0 duration=0.618ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.161337Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1314 duration=0.039ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.161767Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.161790Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1319
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162105Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162118Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1320
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162207Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162223Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/_internal/monitor/1317
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162283Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1313/000000005-000000002.tsm id=0 duration=1.813ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162449Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1313 duration=0.122ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.162945Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1319/000000006-000000002.tsm id=0 duration=1.075ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.163167Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1319 duration=0.186ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.163177Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1320/000000001-000000001.tsm id=0 duration=0.943ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.163262Z lvl=info msg="Reading file" log_id=07RX9h3l000 engine=tsm1 service=cacheloader path=/var/lib/influxdb/wal/_internal/monitor/1320/_00005.wal size=10226279
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.166402Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1317/000000005-000000002.tsm id=0 duration=3.871ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.166490Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1317 duration=0.035ms
Apr 13 08:17:52 influxhost influxd[32500]: ts=2018-04-13T08:17:52.850889Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/_internal/monitor/1320 duration=687.666ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640322Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1320 duration=1490.239ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640437Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1317 duration=1490.426ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640487Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1313 duration=1493.130ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640477Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1315 duration=1491.979ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640427Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1316 duration=1493.664ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.641693Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1318 duration=1491.742ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640397Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1314 duration=1493.297ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.643018Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.643035Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/999
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.643476Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/999/000000002-000000002.tsm id=0 duration=0.272ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.643595Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/999 duration=0.039ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.643636Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/999 duration=3.255ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.640388Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/_internal/monitor/1319 duration=1490.384ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.678537Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.678566Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/968
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.680394Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/968/000000393-000000004.tsm id=0 duration=1.719ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.680631Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/968 duration=0.182ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.680654Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/968 duration=40.143ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.681376Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.681402Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/977
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.683553Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/977/000000374-000000004.tsm id=0 duration=1.959ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.683672Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/977 duration=0.041ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.683714Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/977 duration=43.188ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.719234Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.719265Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/959
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.722061Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/959/000000395-000000007.tsm id=0 duration=2.717ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.722154Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/959 duration=0.034ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.722173Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/959 duration=81.669ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.851976Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.852017Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/936
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.852799Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/936/000000964-000000007.tsm id=2 duration=0.653ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.855835Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/936/000000964-000000005.tsm id=0 duration=2.983ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.858786Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/936/000000964-000000006.tsm id=1 duration=2.912ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.858857Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/936 duration=0.030ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.858878Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/936 duration=218.080ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.983137Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.983179Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/922
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.984094Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/922/000001503-000000012.tsm id=4 duration=0.795ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.987969Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/922/000001503-000000008.tsm id=0 duration=3.816ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.992120Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/922/000001503-000000009.tsm id=1 duration=4.106ms
Apr 13 08:17:53 influxhost influxd[32500]: ts=2018-04-13T08:17:53.996108Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/922/000001503-000000010.tsm id=2 duration=3.944ms
Apr 13 08:17:54 influxhost influxd[32500]: ts=2018-04-13T08:17:53.999680Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/922/000001503-000000011.tsm id=3 duration=3.525ms
Apr 13 08:17:54 influxhost influxd[32500]: ts=2018-04-13T08:17:53.999773Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/922 duration=0.046ms
Apr 13 08:17:54 influxhost influxd[32500]: ts=2018-04-13T08:17:53.999796Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/922 duration=357.420ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.358154Z lvl=info msg="tsm1 WAL starting" log_id=07RX9h3l000 engine=tsm1 service=wal segment_size=10485760
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.358197Z lvl=info msg="tsm1 WAL writing" log_id=07RX9h3l000 engine=tsm1 service=wal path=/var/lib/influxdb/wal/cobra/autogen/921
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.364566Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000031.tsm id=14 duration=6.104ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.368528Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000021.tsm id=4 duration=9.966ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.372097Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000017.tsm id=0 duration=13.679ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.373603Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000022.tsm id=5 duration=14.938ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.378384Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000018.tsm id=1 duration=19.779ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.380141Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000019.tsm id=2 duration=21.507ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.386476Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000027.tsm id=10 duration=17.845ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.387623Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000020.tsm id=3 duration=29.129ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.387749Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000024.tsm id=7 duration=23.067ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.387887Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000025.tsm id=8 duration=15.715ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.388280Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000029.tsm id=12 duration=14.622ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.396658Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000028.tsm id=11 duration=18.182ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.398693Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000030.tsm id=13 duration=18.472ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.398956Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000026.tsm id=9 duration=15.415ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.401195Z lvl=info msg="Opened file" log_id=07RX9h3l000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/cobra/autogen/921/000004729-000000023.tsm id=6 duration=42.725ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.401318Z lvl=info msg="Reloaded WAL cache" log_id=07RX9h3l000 engine=tsm1 path=/var/lib/influxdb/wal/cobra/autogen/921 duration=0.059ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.401340Z lvl=info msg="Opened shard" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open path=/var/lib/influxdb/data/cobra/autogen/921 duration=2759.625ms
Apr 13 08:17:56 influxhost influxd[32500]: ts=2018-04-13T08:17:56.413173Z lvl=info msg="Open store (end)" log_id=07RX9h3l000 service=store trace_id=07RX9h4G000 op_name=tsdb_open op_event=end op_elapsed=4267.885ms
Apr 13 08:17:56 influxhost systemd[1]: influxdb.service: main process exited, code=exited, status=1/FAILURE
Apr 13 08:17:56 influxhost systemd[1]: Unit influxdb.service entered failed state.
Apr 13 08:17:56 influxhost systemd[1]: influxdb.service failed.
Apr 13 08:17:56 influxhost systemd[1]: influxdb.service holdoff time over, scheduling restart.
Apr 13 08:17:56 influxhost systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Apr 13 08:17:56 influxhost systemd[1]: Starting InfluxDB is an open-source, distributed, time series database...

I don't see anything obvious here, but I vaguely remember having this issue before with a much older version of Influx.

journalctl did not have the answer, but /var/log/messages did. It seems /usr/share/collectd/types.db was somehow missing from my setup; I added it and now everything seems to work.
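
For anyone hitting the same symptom, that path is configured in the collectd listener section of influxdb.conf; a minimal sketch with the usual defaults (values are assumptions, adjust to your setup):

    [[collectd]]
      enabled = true
      bind-address = ":25826"
      database = "collectd"
      typesdb = "/usr/share/collectd/types.db"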

@csari-chimera did recursively chowning the data directory to the correct user (I guess influxdb:influxdb, or whatever user runs the DB) fix the problem?

@e-dard Nope, since I only did an upgrade; nothing else was executed, e.g. no inmem-to-tsi1 migration.

I'm getting this bug with InfluxDB 1.2.4 on CentOS 6.
It was running correctly (for around a year) until last week.
Only by deleting my data folder can I get it to start running correctly again.
And it doesn't work when I upgrade InfluxDB to 1.5.2.

We're also getting this bug with CentOS 7.5 and InfluxDB 1.5.2.

  • InfluxDB is hammering the CPU sending info level logs to journald.
  • All files in /var/lib/influxdb are owned by the influxdb user.
  • InfluxDB has not created a log in the /var/log/influxdb directory despite having this specified as the log location.
  • There are no errors in journald for influxdb.

@sammcj you mention it's hammering the CPU by sending lots of logs, but you don't see any of the logs? Did you maybe take profiles to determine this?

I'm seeing this bug with InfluxDB 1.7.2.

Any news on this? We have the same problem as of today (as far as we noticed). RHEL 7, InfluxDB 1.6.3. Updated to 1.7.3 with no effect. I have errors in the journal here and there, though they don't seem reproducible after a restart:
Jan 14 14:30:55 ... influxd[11452]: ts=2019-01-14T13:30:55.051470Z lvl=info msg="Series partition compaction (start)" log_id=0C~6qFul000 partition=4 trace_id=0C~8lrnl000 op_name=series_partition_compaction path=/web/databases/influxdb/data/icinga2/_series/04 op_event=start
Jan 14 14:30:55 ... influxd[11452]: fatal error: runtime: out of memory
Jan 14 14:30:55 ... influxd[11452]: runtime stack:
Jan 14 14:30:55 ... influxd[11452]: runtime.throw(0x123d974, 0x16)
Jan 14 14:30:55 ... influxd[11452]: /usr/local/go/src/runtime/panic.go:608 +0x72
Jan 14 14:30:55 ... influxd[11452]: runtime.sysMap(0xe664000000, 0x154000000, 0x1da9518)
Jan 14 14:30:55 ... influxd[11452]: /usr/local/go/src/runtime/mem_linux.go:156 +0xc7
Jan 14 14:30:55 ... influxd[11452]: runtime.(*mheap).sysAlloc(0x1d8f780, 0x154000000, 0x0, 0x40e1d7)
Jan 14 14:30:55 ... influxd[11452]: /usr/local/go/src/runtime/malloc.go:619 +0x1c7
Jan 14 14:30:55 ... influxd[11452]: runtime.(*mheap).grow(0x1d8f780, 0xaa000, 0x0)
Jan 14 14:30:55 ... influxd[11452]: /usr/local/go/src/runtime/mheap.go:920 +0x42
Jan 14 14:30:55 ... influxd[11452]: runtime.(*mheap).allocSpanLocked(0x1d8f780, 0xaa000, 0x1da9528, 0x0)
Jan 14 14:30:55 ... influxd[11452]: /usr/local/go/src/runtime/mheap.go:848 +0x337
Jan 14 14:30:55 ... influxd[11452]: runtime.(*mheap).alloc_m(0x1d8f780, 0xaa000, 0x101, 0x149ea88)

So far the only solution seems to be deleting all data and starting fresh. Could this be size related? The data dir is now ~260 GB.
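
One general mitigation for memory blowups on large datasets, already discussed earlier in this thread, is switching from the inmem index to tsi1. This is only a sketch of the relevant [data] settings, not a confirmed fix for this particular crash, and existing shards still need their index rebuilt with influx_inspect buildtsi:

    [data]
      dir = "/var/lib/influxdb/data"
      wal-dir = "/var/lib/influxdb/wal"
      # Use the disk-based TSI index instead of the in-memory index
      index-version = "tsi1"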

As a follow-up, I tried updating once again after a year, now to version 1.7.4, but I got the same error, so I'm still running version 1.4.2.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.

