Beats: Journalbeat sends duplicate log entries on restart

Created on 28 Mar 2019 · 3 comments · Source: elastic/beats

  • Version: 6.7.0
  • Operating System: CentOS 7.6.1810
  • Config:
journalbeat.inputs:
  - paths: []
    seek: cursor

output.redis:
  enabled: true
  hosts:
    - my-redis:6379
  ssl.enabled: false

fields:
  product: foo
  ecosystem: bar
  env: baz
  role: mysystem

It seems that when journalbeat is restarted, instead of picking up where it left off, it goes back much further (by tens of thousands of events), so you get a spike in traffic as well as MANY duplicated logs. If you are using Redis like we are, this can also overwhelm your Redis box.
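For context, the replay-on-restart symptom is what you would expect from cursor-based, at-least-once delivery when the persisted cursor lags the true read position. Here is a minimal, hypothetical sketch (not Journalbeat's actual code; the journal, registry format, and function names are invented for illustration) of how a checkpointed reader resumes, and why a stale checkpoint re-publishes everything after it:

```python
import json
import os
import tempfile

# Hypothetical journal: a list of (cursor, message) entries.
JOURNAL = [(i, f"event {i}") for i in range(10)]

def load_cursor(path):
    """Return the persisted cursor, or None on first run."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)["cursor"]

def save_cursor(path, cursor):
    """Persist the cursor of the last acknowledged entry."""
    with open(path, "w") as f:
        json.dump({"cursor": cursor}, f)

def run(registry_path, stop_after=None):
    """Publish entries after the saved cursor; return what was published."""
    cursor = load_cursor(registry_path)
    published = []
    for c, msg in JOURNAL:
        if cursor is not None and c <= cursor:
            continue  # already acknowledged; skip on resume
        published.append(msg)
        save_cursor(registry_path, c)  # checkpoint after each publish
        if stop_after is not None and len(published) >= stop_after:
            break  # simulate a restart mid-stream
    return published

registry = os.path.join(tempfile.mkdtemp(), "registry")
first = run(registry, stop_after=6)   # publishes events 0-5, checkpoint ends at 5
second = run(registry)                # resumes cleanly at event 6
```

If the checkpoint were written late, lost, or read from the wrong path by the daemonized start, `load_cursor` would return a much older cursor (or `None`) and the second run would re-publish everything after it, matching the spike and duplication described above.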

Reproduce:

  1. Restart journalbeat.

If started interactively it seems to work OK:

# /usr/share/journalbeat/bin/journalbeat -c /etc/journalbeat/journalbeat.yml -path.home /usr/share/journalbeat -path.config /etc/journalbeat -path.data /var/lib/journalbeat -path.logs /var/log/journalbeat -d beat,input
2019-03-28T02:12:10.381Z    INFO    instance/beat.go:280    Setup Beat: journalbeat; Version: 6.7.0
2019-03-28T02:12:10.381Z    DEBUG    [beat]    instance/beat.go:301    Initializing output plugins
2019-03-28T02:12:10.394Z    INFO    [publisher]    pipeline/module.go:110    Beat name: myhostname
2019-03-28T02:12:10.394Z    WARN    [cfgwarn]    beater/journalbeat.go:50    EXPERIMENTAL: Journalbeat is experimental.
2019-03-28T02:12:10.418Z    DEBUG    [input]    reader/journal.go:87    New local journal is opened for reading    {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b", "path": "local"}
2019-03-28T02:12:10.418Z    DEBUG    [input]    reader/journal.go:169    Seeked to position defined in cursor    {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b", "path": "local"}
2019-03-28T02:12:10.418Z    DEBUG    [input]    input/input.go:108    New input is created for paths []    {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b"}
2019-03-28T02:12:10.419Z    INFO    instance/beat.go:402    journalbeat start running.
2019-03-28T02:12:10.419Z    INFO    [journalbeat]    beater/journalbeat.go:104    journalbeat is running! Hit CTRL-C to stop it.
2019-03-28T02:12:10.434Z    INFO    [monitoring]    log/log.go:117    Starting metrics logging every 30s
2019-03-28T02:12:11.443Z    INFO    pipeline/output.go:95    Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:12:11.491Z    INFO    pipeline/output.go:105    Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:12:11.492Z    INFO    [input]    input/input.go:133    journalbeat successfully published 4 events    {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b"}
2019-03-28T02:12:13.295Z    INFO    [input]    input/input.go:133    journalbeat successfully published 12 events    {"id": "afbf471a-6ceb-469b-bd76-1b3d1eb49a8b"}
2019-03-28T02:12:13.419Z    INFO    [journalbeat]    beater/journalbeat.go:129    journalbeat is stopping

The problem only seems to appear when daemonizing (either via daemonize or via systemd):

# daemonize /usr/share/journalbeat/bin/journalbeat -c /etc/journalbeat/journalbeat.yml -path.home /usr/share/journalbeat -path.config /etc/journalbeat -path.data /var/lib/journalbeat -path.logs /var/log/journalbeat -d beat,input
2019-03-28T02:13:59.379Z    INFO    instance/beat.go:280    Setup Beat: journalbeat; Version: 6.7.0
2019-03-28T02:13:59.379Z    DEBUG    [beat]    instance/beat.go:301    Initializing output plugins
2019-03-28T02:13:59.399Z    INFO    [publisher]    pipeline/module.go:110    Beat name: myhostname
2019-03-28T02:13:59.399Z    WARN    [cfgwarn]    beater/journalbeat.go:50    EXPERIMENTAL: Journalbeat is experimental.
2019-03-28T02:13:59.409Z    DEBUG    [input]    reader/journal.go:87    New local journal is opened for reading    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e", "path": "local"}
2019-03-28T02:13:59.460Z    DEBUG    [input]    reader/journal.go:169    Seeked to position defined in cursor    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e", "path": "local"}
2019-03-28T02:13:59.460Z    DEBUG    [input]    input/input.go:108    New input is created for paths []    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:13:59.467Z    INFO    [monitoring]    log/log.go:117    Starting metrics logging every 30s
2019-03-28T02:13:59.467Z    INFO    instance/beat.go:402    journalbeat start running.
2019-03-28T02:13:59.467Z    INFO    [journalbeat]    beater/journalbeat.go:104    journalbeat is running! Hit CTRL-C to stop it.
2019-03-28T02:13:59.827Z    INFO    pipeline/output.go:95    Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:13:59.908Z    INFO    pipeline/output.go:105    Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:14:00.113Z    INFO    [input]    input/input.go:133    journalbeat successfully published 2048 events    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:00.212Z    INFO    [input]    input/input.go:133    journalbeat successfully published 2048 events    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:00.434Z    INFO    [input]    input/input.go:133    journalbeat successfully published 2048 events    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:00.617Z    ERROR    redis/client.go:235    Failed to RPUSH to redis list with: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:02.070Z    ERROR    pipeline/output.go:121    Failed to publish events: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:02.070Z    INFO    pipeline/output.go:95    Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:14:02.091Z    INFO    pipeline/output.go:105    Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:14:02.216Z    ERROR    redis/client.go:235    Failed to RPUSH to redis list with: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:03.518Z    ERROR    pipeline/output.go:121    Failed to publish events: OOM command not allowed when used memory > 'maxmemory'.
2019-03-28T02:14:03.518Z    INFO    pipeline/output.go:95    Connecting to redis(tcp://my-redis:6379)
2019-03-28T02:14:03.570Z    INFO    pipeline/output.go:105    Connection to redis(tcp://my-redis:6379) established
2019-03-28T02:14:03.686Z    INFO    [input]    input/input.go:133    journalbeat successfully published 4096 events    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
2019-03-28T02:14:03.898Z    INFO    [input]    input/input.go:133    journalbeat successfully published 2048 events    {"id": "fa917877-4e80-48fd-931f-d54054c7f61e"}
Journalbeat bug

All 3 comments

I've played with this a little more over the weekend, thinking it might be a config issue and that I'd selected the wrong seek option, but cursor seems to be correct for the behaviour we are expecting.

I also confirmed that the systemd unit file uses the same command line as the interactive run above, which I wasn't expecting given the difference in behaviour.

I wasn't able to find any documentation on this exactly (and I'm not a Go developer), but the file that appears to record Journalbeat's last position in the journald logs is /var/lib/journalbeat/registry, and it seems to be updated when new events are sent, as expected.

Also, the systemd unit file isn't set up to accept the reload command (which some of the config docs seem to imply it should be), and indeed sending SIGHUP to the process just restarts it with the same undesirable effect we have seen above.
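For reference, a unit normally opts in to `systemctl reload` with something along these lines (an illustrative sketch of a typical drop-in, not the unit Journalbeat ships; the drop-in path is an assumption), though given the SIGHUP behaviour above, a reload wired this way would currently trigger the same replay:

```ini
# /etc/systemd/system/journalbeat.service.d/reload.conf (hypothetical drop-in)
[Service]
# Have `systemctl reload journalbeat` deliver SIGHUP to the main process
ExecReload=/bin/kill -HUP $MAINPID
```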

I'm finding that messages are duplicated at random, irrespective of restarts; i.e. even after it has processed all messages from seek or cursor, it duplicates some but not all messages. Please see the discussion https://discuss.elastic.co/t/duplicate-messages-created-by-journalbeat-6-7-1-1/175930

It shows others are seeing the same thing.

I've not opened another bug on this since it MAY or MAY NOT be the same bug/issue.

I examined one server running Fedora and journalbeat 6.7.1 and found that the systemd journal had 534 messages, but my Graylog/Elasticsearch cluster had 667 messages, i.e. 133 duplicates.

Looks like there is a pull request that fixes this: #12479
