Beats: Auditbeat 7.7.x Poor Performance: 100%+ CPU Usage with System Module Socket Dataset Enabled

Created on 11 Jun 2020 · 16 Comments · Source: elastic/beats

With the System module's socket dataset enabled, Auditbeat 7.7.x randomly starts using 100%+ CPU on some servers. This was not an issue prior to 7.7.x.

Restarting the Auditbeat service brings CPU usage back to normal for a while, but the problem eventually returns.

The issue doesn't seem to happen on every server. I run Auditbeat on ~100 servers with the same config (below), and it appears on 10-15% of them. I see it on both OpenSUSE and CentOS servers, across multiple kernels and with different apps running.

Screenshot showing the issue. Percentages on the graph are of total CPU, not of individual cores; this example server has 4 cores, meaning Auditbeat is using one of them completely for itself:

image

Version Output:

auditbeat version
auditbeat version 7.7.1 (amd64), libbeat 7.7.1 [932b273e8940575e15f10390882be205bad29e1f built 2020-05-28 15:20:33 +0000 UTC]

System versions:

# uname -a
Linux server 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Configuration:

###################### Auditbeat Configuration #########################

#==========================  Modules configuration =============================
auditbeat.modules:

- module: auditd
  resolve_ids: true
  failure_mode: silent
  backlog_limit: 8192
  rate_limit: 0
  include_raw_message: false
  include_warnings: false
  backpressure_strategy: auto
  # Load audit rules from separate files. Same format as audit.rules(7).
  audit_rule_files: [ '${path.config}/audit.rules.d/*.conf' ]
  audit_rules: |
    ## Define audit rules here.
    ## Create file watches (-w) or syscall audits (-a or -A). Uncomment these
    ## examples or add your own rules.

    ## If you are on a 64 bit platform, everything should be running
    ## in 64 bit mode. This rule will detect any use of the 32 bit syscalls
    ## because this might be a sign of someone exploiting a hole in the 32
    ## bit API.
    -a always,exit -F arch=b32 -S all -F key=32bit-abi

    ## Executions.
    -a always,exit -F arch=b64 -S execve,execveat -k exec

    ## External access (warning: these can be expensive to audit).
    -a always,exit -F arch=b64 -S accept,bind,connect -F key=external-access

    ## Identity changes.
    -w /etc/group -p wa -k identity
    -w /etc/passwd -p wa -k identity
    -w /etc/gshadow -p wa -k identity
    -w /etc/shadow -p wa -k identity

    ## Unauthorized access attempts.
    -a always,exit -F arch=b32 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EACCES -k access
    -a always,exit -F arch=b32 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EPERM -k access
    -a always,exit -F arch=b64 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EACCES -k access
    -a always,exit -F arch=b64 -S open,creat,truncate,ftruncate,openat,open_by_handle_at -F exit=-EPERM -k access

- module: file_integrity
  paths:
  - /bin
  - /usr/bin
  - /sbin
  - /usr/sbin
  - /etc
  - /root
  - /usr/local/bin
  - /home
  exclude_files:
  - '(?i)\.sw[nop]$'
  - '~$'
  - '/\.git($|/)'
  - '\.rrd$'
  include_files: []
  scan_at_start: true
  scan_rate_per_sec: 50 MiB
  max_file_size: 100 MiB
  hash_types: [md5,sha256]
  recursive: true

- module: system
  datasets:
    - host    # General host information, e.g. uptime, IPs
    - login   # User logins, logouts, and system boots.
    - package # Installed, updated, and removed packages
    - process # Started and stopped processes
    - socket  # Opened and closed sockets
    - user    # User information

  # How often datasets send state updates with the
  # current state of the system (e.g. all currently
  # running processes, all open sockets).
  state.period: 12h

  # Enabled by default. Auditbeat will read password fields in
  # /etc/passwd and /etc/shadow and store a hash locally to
  # detect any changes.
  user.detect_password_changes: true

  # File patterns of the login record files.
  login.wtmp_file_pattern: /var/log/wtmp*
  login.btmp_file_pattern: /var/log/btmp*

#================================ Outputs =====================================

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["<snipped>"]
  loadbalance: true

#================================ Processors =====================================

processors:
  - add_host_metadata: ~
  - add_tags:
      tags: [auditbeat]
  - dns:
      type: reverse
      fields:
        server.ip: server.hostname
        client.ip: client.hostname
        source.ip: source.hostname
        destination.ip: destination.hostname
      nameservers: ['<snipped>']
      tag_on_failure: [_dns_reverse_lookup_failed]

#================================ Logging =====================================

logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/auditbeat
  name: auditbeat
  keepfiles: 2
  permissions: 0600
  rotateeverybytes: 5242880

#============================== X-Pack Monitoring ===============================
monitoring.enabled: true
monitoring.elasticsearch:
  hosts: ["<snipped>"]
  protocol: "https"
  username: "<snipped>"
  password: "<snipped>"
  ssl.enabled: true
  ssl.verification_mode: full
  ssl.certificate_authorities: ["<snipped>"]
monitoring.cluster_uuid: "<snipped>"

For confirmed bugs, please report:

  • Version: 7.7.0 + 7.7.1
  • Operating System: OpenSUSE 15.0, OpenSUSE 15.1, CentOS 7, CentOS 8
  • Discuss Forum URL: https://discuss.elastic.co/t/auditbeat-120-cpu/234909
  • Steps to Reproduce:
      1. Install Auditbeat 7.7.x (I installed from the RPM package)
      2. Configure Auditbeat to use the System module with the socket dataset enabled
      3. Start Auditbeat
      4. Wait for Auditbeat to start consuming more CPU than it should
      5. Stop Auditbeat
      6. Remove the socket dataset from the System module
      7. Start Auditbeat
      8. Auditbeat will no longer use more CPU than it should
Auditbeat SIEM bug
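
For the "wait for Auditbeat to start consuming more CPU" step, a small script can take the place of watching a graph. The following is only a minimal sketch, assuming a Linux host with /proc mounted and a single auditbeat process; the function names are made up for illustration and are not part of Auditbeat. It also shows the conversion between "percent of one core" and the "percent of total CPU" that the screenshot uses.

```shell
# Sum utime + stime (fields 14 and 15 of /proc/<pid>/stat), in clock ticks.
# Assumes the process name contains no spaces, which holds for "auditbeat".
proc_cpu_ticks() {
    awk '{ print $14 + $15 }' "/proc/$1/stat"
}

# Convert a tick delta over an interval into percent of ONE core.
# Args: ticks_before ticks_after ticks_per_second interval_seconds
cpu_percent_of_core() {
    echo $(( ($2 - $1) * 100 / ($3 * $4) ))
}

# Convert percent of one core into percent of total CPU (whole machine).
# Args: percent_of_core core_count
cpu_percent_of_total() {
    echo $(( $1 / $2 ))
}

# Sample a PID for N seconds and print its CPU use as percent of one core.
# Args: pid seconds
sample_cpu() {
    pid=$1
    seconds=$2
    before=$(proc_cpu_ticks "$pid")
    sleep "$seconds"
    after=$(proc_cpu_ticks "$pid")
    cpu_percent_of_core "$before" "$after" "$(getconf CLK_TCK)" "$seconds"
}

# Example (not run here):
#   sample_cpu "$(pidof auditbeat)" 10
```

On the 4-core example server, a process that saturates one core gives `cpu_percent_of_core` = 100 and `cpu_percent_of_total 100 4` = 25. A value that stays near 100 across several samples matches the behaviour described in this report.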

All 16 comments

Pinging @elastic/siem (Team:SIEM)

I'm running into the exact same issue too with Auditbeat 7.7.1 - running on Ubuntu 16.04.

It looks like you're running into the issue fixed by https://github.com/elastic/beats/pull/19033.

The fix was too late for 7.7.1, but it will make it into 7.8.0.

Fix available in 7.8.0

I have that version installed and I am still seeing this problem:

$ auditbeat version
auditbeat version 7.8.0 (amd64), libbeat 7.8.0 [f79387d32717d79f689d94fda1ec80b2cf285d30 built 2020-06-14 18:11:10 +0000 UTC]

According to perf top, this is where the CPU time goes:

  42,62%  auditbeat [.] runtime.mapaccess2_fast64
  15,19%  auditbeat [.] github.com/elastic/beats/v7/x-pack/auditbeat/module/system/socket.(*state).ExpireOlder
  10,50%  auditbeat [.] runtime.aeshash64
   7,76%  auditbeat [.] github.com/elastic/beats/v7/x-pack/auditbeat/module/system/socket.(*state).onSockDestroyed
   3,57%  auditbeat [.] time.Time.Before
   2,92%  auditbeat [.] github.com/elastic/beats/v7/x-pack/auditbeat/module/system/socket.(*socket).Timestamp

As this call stack suggests, removing the socket dataset from the system module makes this problem go away:

--- auditbeat.yml.cpuhog    2020-06-23 09:22:49.122378568 +0200
+++ auditbeat.yml   2020-06-23 09:22:58.938317272 +0200
@@ -59,7 +59,7 @@
     - host    # General host information, e.g. uptime, IPs
     - login   # User logins, logouts, and system boots.
     - process # Started and stopped processes
-    - socket  # Opened and closed sockets
+    # - socket  # Opened and closed sockets
     - user    # User information

   # How often datasets send state updates with the
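
To put a number on the perf top output above, the percentages for the socket dataset's own symbols can be added up. This is a rough sketch, assuming the perf lines have been saved to a file or are piped in; `perf_share` is a hypothetical helper name. It counts only self time, so runtime functions such as the map lookups are not attributed to the socket code even if they are called from it.

```shell
# Sum the percentages of perf top lines whose text matches a pattern.
# perf prints decimal commas in some locales ("15,19%"), so the first
# digit,digit pair on each line is rewritten to use a decimal point.
# Usage: perf_share PATTERN < perf_output.txt
perf_share() {
    pattern=$1
    sed 's/\([0-9]\),\([0-9]\)/\1.\2/' | LC_ALL=C awk -v pat="$pattern" '
        $0 ~ pat { sub(/%/, "", $1); total += $1 }
        END { printf "%.2f\n", total }'
}

# Example (not run here):
#   perf_share 'module/system/socket' < perf_top.txt
```

For the six lines posted above, the three `module/system/socket` symbols add up to 25.87% (15.19 + 7.76 + 2.92).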

Fix available in 7.8.0

I have upgraded a client to Auditbeat 7.8.0 and am still experiencing the same issue (on Ubuntu 16.04.6 LTS). One client upgraded from 7.6.1 (which did not have the socket issue) to 7.8.0 and is now seeing high CPU usage. The workaround is still to comment out the socket dataset.

@adriansr could this issue be reopened as the issue does not appear to be fixed in 7.8.0?

Reopening.

Can someone please run Auditbeat with -httpprof :8080 and, once it's using 100% CPU, run curl 'http://localhost:8080/debug/pprof/profile?seconds=30' -o profile.prof and share the resulting profile.prof binary file?
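
For anyone collecting profiles from several hosts, the request above can be wrapped in two small functions. This is only a sketch: it assumes Auditbeat was started with -httpprof on the given host and port as described, and the function names are hypothetical.

```shell
# Build the CPU profile URL for a Beat started with -httpprof HOST:PORT.
# Args: host port seconds
pprof_profile_url() {
    echo "http://$1:$2/debug/pprof/profile?seconds=$3"
}

# Download a CPU profile to a file. The curl timeout is set a little
# longer than the sampling window so the request is not cut short.
# Args: host port seconds output_file
capture_profile() {
    curl --silent --show-error --fail --max-time $(( $3 + 30 )) \
        --output "$4" "$(pprof_profile_url "$1" "$2" "$3")"
}

# Example (not run here):
#   capture_profile localhost 8080 30 profile.prof
```

The resulting file is binary. On a machine with a Go toolchain installed, `go tool pprof -top profile.prof` should print the hottest functions as text, which is easier to paste into a comment.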

@adriansr Here are 3 servers with the issue. Attached zip file contains the 3 profiles:

Server A:

#uname -a
Linux assetmgmt01 4.12.14-lp150.12.82-default #1 SMP Tue Nov 12 16:32:38 UTC 2019 (c939e24) x86_64 x86_64 x86_64 GNU/Linux

#cat /etc/os-release
NAME="openSUSE Leap"
VERSION="15.0"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.0"
PRETTY_NAME="openSUSE Leap 15.0"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.0"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"

Server B:

#uname -a
Linux dmiml01-stg 4.12.14-lp150.12.82-default #1 SMP Tue Nov 12 16:32:38 UTC 2019 (c939e24) x86_64 x86_64 x86_64 GNU/Linux

#cat /etc/os-release
NAME="openSUSE Leap"
VERSION="15.0"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.0"
PRETTY_NAME="openSUSE Leap 15.0"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.0"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"

Server C:

#uname -a
Linux dnsdist 4.18.0-147.5.1.el8_1.x86_64 #1 SMP Wed Feb 5 02:00:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

#cat /etc/os-release
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="8"

auditbeat_profiles.zip

Hello
7.8 doesn't fix this issue for me.
I'd also like to mention that commenting out "socket" reduced the CPU usage, but after a while it increased again.
With 7.5, by contrast, Auditbeat had no noticeable impact on my servers.
Now it only misbehaves on servers running Apache.
I'm still using the original configuration file.

image

Thanks

same.

@wixaw & @vinnytroia what versions of Auditbeat are you running? The fix for the bug I found shipped in 7.8.1, which was released on July 27th. I'm trying to determine whether this is a different issue or whether you just need to upgrade the patch version.

Oh. I don't have 7.8.1. Let me try it and I'll get back to you. Thanks.

Hello
I had missed the information about the 7.8.1 release.
I installed 7.8.1 on my servers and have no more CPU issues.
Thank you

I still see this problem in version 7.9.3

I still have the problem too (with version 7.9.1), on machines with a lot of network traffic (e.g. Squid, web servers).
