Version: 6.7.0
Operating System: CentOS 7.6.1810
Configuration:
journalbeat.inputs:
- paths: []
  seek: cursor

output.redis:
  enabled: true
  hosts:
    - my-redis:6379
  ssl.enabled: false

fields:
  product: foo
  ecosystem: bar
  env: baz
  role: mysystem
It seems that when journalbeat is restarted, instead of picking up where it left off, it goes back much further (by tens of thousands of events), so you get a spike in traffic as well as many duplicated logs. If you are using Redis as the output, as we are, this can also overwhelm your Redis box.
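A quick way to see the spike (not from the original report, just a diagnostic sketch; the key name "journalbeat" is an assumption, it defaults to the beat name but check the redis output's key setting) is to watch the length of the Redis list while restarting the service:
# redis-cli -h my-redis -p 6379 llen journalbeat
Run it repeatedly (or under watch) while restarting journalbeat in another terminal; when the bug triggers, the list length jumps by tens of thousands of entries:
# systemctl restart journalbeat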
To reproduce: restart journalbeat.
If started interactively, it seems to work OK:
# /usr/share/journalbeat/bin/journalbeat -c /etc/journalbeat/journalbeat.yml -path.home /usr/share/journalbeat -path.config /etc/journalbeat -path.data /var/lib/journalbeat -path.logs /var/log/journalbeat -d beat,input
2019-03-28T02:12:10.381Z INFO instance/beat.go:280 Setup Beat: journalbeat; Version: 6.7.0
2019-03-28T02:12:10.381Z DEBUG [beat] instance/beat.go:301 Initializing output plugins
2019-03-28T02:12:10.394Z INFO [publisher] pipeline/module.go:110 Beat name: myhostname
2019-03-28T02:12:10.394Z WARN [cfgwarn] beater/journalbeat.go:50 EXPERIMENTAL: Journalbeat is experimental.
2019-03-28T02:12:10.418Z DEBUG [input] reader/journal.go:87 New local journal is opened for reading {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b", "path": "local"}
2019-03-28T02:12:10.418Z DEBUG [input] reader/journal.go:169 Seeked to position defined in cursor {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b", "path": "local"}
2019-03-28T02:12:10.418Z DEBUG [input] input/input.go:108 New input is created for paths [] {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b"}
2019-03-28T02:12:10.419Z INFO instance/beat.go:402 journalbeat start running.
2019-03-28T02:12:10.419Z INFO [journalbeat] beater/journalbeat.go:104 journalbeat is running! Hit CTRL-C to stop it.
2019-03-28T02:12:10.434Z INFO [monitoring] log/log.go:117 Starting metrics logging every 30s
2019-03-28T02:12:11.443Z INFO pipeline/output.go:95 Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:12:11.491Z INFO pipeline/output.go:105 Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:12:11.492Z INFO [input] input/input.go:133 journalbeat successfully published 4 events {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b"}
2019-03-28T02:12:13.295Z INFO [input] input/input.go:133 journalbeat successfully published 12 events {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b"}
2019-03-28T02:12:13.419Z INFO [journalbeat] beater/journalbeat.go:129 journalbeat is stopping
The problem only seems to appear when daemonizing (either via daemonize or via systemd):
# daemonize /usr/share/journalbeat/bin/journalbeat -c /etc/journalbeat/journalbeat.yml -path.home /usr/share/journalbeat -path.config /etc/journalbeat -path.data /var/lib/journalbeat -path.logs /var/log/journalbeat -d beat,input
2019-03-28T02:13:59.379Z INFO instance/beat.go:280 Setup Beat: journalbeat; Version: 6.7.0
2019-03-28T02:13:59.379Z DEBUG [beat] instance/beat.go:301 Initializing output plugins
2019-03-28T02:13:59.399Z INFO [publisher] pipeline/module.go:110 Beat name: myhostname
2019-03-28T02:13:59.399Z WARN [cfgwarn] beater/journalbeat.go:50 EXPERIMENTAL: Journalbeat is experimental.
2019-03-28T02:13:59.409Z DEBUG [input] reader/journal.go:87 New local journal is opened for reading {"id": "fa917877-4e80-48fd-931f-d54054c7f61e", "path": "local"}
2019-03-28T02:13:59.460Z DEBUG [input] reader/journal.go:169 Seeked to position defined in cursor {"id": "fa917877-4e80-48fd-931f-d54054c7f61e", "path": "local"}
2019-03-28T02:13:59.460Z DEBUG [input] input/input.go:108 New input is created for paths [] {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:13:59.467Z INFO [monitoring] log/log.go:117 Starting metrics logging every 30s
2019-03-28T02:13:59.467Z INFO instance/beat.go:402 journalbeat start running.
2019-03-28T02:13:59.467Z INFO [journalbeat] beater/journalbeat.go:104 journalbeat is running! Hit CTRL-C to stop it.
2019-03-28T02:13:59.827Z INFO pipeline/output.go:95 Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:13:59.908Z INFO pipeline/output.go:105 Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:14:00.113Z INFO [input] input/input.go:133 journalbeat successfully published 2048 events {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:00.212Z INFO [input] input/input.go:133 journalbeat successfully published 2048 events {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:00.434Z INFO [input] input/input.go:133 journalbeat successfully published 2048 events {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:00.617Z ERROR redis/client.go:235 Failed to RPUSH to redis list with: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:02.070Z ERROR pipeline/output.go:121 Failed to publish events: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:02.070Z INFO pipeline/output.go:95 Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:14:02.091Z INFO pipeline/output.go:105 Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:14:02.216Z ERROR redis/client.go:235 Failed to RPUSH to redis list with: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:03.518Z ERROR pipeline/output.go:121 Failed to publish events: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:03.518Z INFO pipeline/output.go:95 Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:14:03.570Z INFO pipeline/output.go:105 Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:14:03.686Z INFO [input] input/input.go:133 journalbeat successfully published 4096 events {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:03.898Z INFO [input] input/input.go:133 journalbeat successfully published 2048 events {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
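The OOM errors above indicate the Redis instance hit its configured maxmemory while the re-published burst was being RPUSHed. A couple of diagnostic commands (not part of the original report) to confirm the memory pressure on the Redis side:
# redis-cli -h my-redis -p 6379 info memory | grep -E 'used_memory_human|maxmemory_human'
# redis-cli -h my-redis -p 6379 config get maxmemory
# redis-cli -h my-redis -p 6379 config get maxmemory-policy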
I've played with this a little more over the weekend, thinking it might be a config issue and that I had picked the wrong seek option, but cursor seems to be the correct setting for the behaviour we are expecting.
I also confirmed that the systemd unit file uses the same command as the interactive invocation above, which I wasn't expecting given the difference in behaviour.
I wasn't able to find any documentation on this exactly (and I'm not a Go developer), but the file that appears to record journalbeat's last position in the journald logs is /var/lib/journalbeat/registry, and it does seem to be updated when new events are sent, as expected.
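As a sanity check (a diagnostic sketch, not from the original report; the registry layout may differ between versions), the cursor stored in that file can be compared against the cursor of the newest journal entry:
# cat /var/lib/journalbeat/registry
# journalctl -n 1 --show-cursor --no-pager
If the saved cursor matches the one printed by journalctl, the registry itself looks up to date and the problem would be in how it is read back on start.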
Also, the systemd unit file isn't set up to accept the reload command (which some of the config docs seem to imply it should be), and indeed sending SIGHUP to the process just restarts it, with the same undesirable effect seen above.
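To check whether the shipped unit actually defines a reload action (again, just a diagnostic sketch):
# systemctl cat journalbeat
# systemctl show -p ExecReload journalbeat
An empty ExecReload= value means the unit defines no reload action, consistent with the observation that reload isn't supported.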
I'm finding that messages are duplicated at random irrespective of restarts, i.e. even after it has processed all messages from the configured seek/cursor position it duplicates some, but not all, messages. Please see the discussion at https://discuss.elastic.co/t/duplicate-messages-created-by-journalbeat-6-7-1-1/175930, which shows that others are seeing the same thing.
I've not opened another bug on this since this MAY or MAY NOT be the same bug/issue.
I examined one server running Fedora and journalbeat 6.7.1 and found that the systemd journal contained 534 messages, while my Graylog/Elasticsearch cluster had 667, i.e. 133 duplicated messages.
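For reference, a rough way to produce those counts (a sketch; the Elasticsearch address and index pattern are assumptions, adjust for Graylog as needed):
# journalctl -o json --no-pager | wc -l
# curl -s 'http://localhost:9200/journalbeat-*/_count?pretty'
The first command prints one JSON object per journal entry, so the line count is the number of entries; the second asks Elasticsearch how many documents landed in the matching indices.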
Looks like there is a pull request that fixes this: #12479