Datadog-agent: ubuntu arm64 agent reporting SUM system.net.bytes_sent and bytes_received

Created on 28 Dec 2020  ·  8 Comments  ·  Source: DataDog/datadog-agent

Output of the info page (if this is a bug)

● datadog-agent.service - Datadog Agent
     Loaded: loaded (/lib/systemd/system/datadog-agent.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2020-12-28 16:38:29 EST; 1min 1s ago
   Main PID: 41518 (agent)
      Tasks: 11 (limit: 2102)
     CGroup: /system.slice/datadog-agent.service
             └─41518 /opt/datadog-agent/bin/agent/agent run -p /opt/datadog-agent/run/agent.pid

Dec 28 16:39:21 --masked.domain-- agent[41518]: 2020-12-28 16:39:21 EST | CORE | INFO | (pkg/collector/runner/runner.go:261 in work) | check:disk | Running check
Dec 28 16:39:21 --masked.domain-- agent[41518]: 2020-12-28 16:39:21 EST | CORE | INFO | (pkg/collector/runner/runner.go:327 in work) | check:disk | Done running check
Dec 28 16:39:25 --masked.domain-- agent[41518]: 2020-12-28 16:39:25 EST | CORE | INFO | (pkg/collector/runner/runner.go:261 in work) | check:memory | Running check
Dec 28 16:39:25 --masked.domain-- agent[41518]: 2020-12-28 16:39:25 EST | CORE | INFO | (pkg/collector/runner/runner.go:327 in work) | check:memory | Done running check
Dec 28 16:39:26 --masked.domain-- agent[41518]: 2020-12-28 16:39:26 EST | CORE | INFO | (pkg/collector/runner/runner.go:261 in work) | check:io | Running check
Dec 28 16:39:26 --masked.domain-- agent[41518]: 2020-12-28 16:39:26 EST | CORE | INFO | (pkg/collector/runner/runner.go:327 in work) | check:io | Done running check
Dec 28 16:39:27 --masked.domain-- agent[41518]: 2020-12-28 16:39:27 EST | CORE | INFO | (pkg/collector/runner/runner.go:261 in work) | check:cpu | Running check
Dec 28 16:39:27 --masked.domain-- agent[41518]: 2020-12-28 16:39:27 EST | CORE | INFO | (pkg/collector/runner/runner.go:327 in work) | check:cpu | Done running check
Dec 28 16:39:28 --masked.domain-- agent[41518]: 2020-12-28 16:39:28 EST | CORE | INFO | (pkg/collector/runner/runner.go:261 in work) | check:network | Running check
Dec 28 16:39:28 --masked.domain-- agent[41518]: 2020-12-28 16:39:28 EST | CORE | INFO | (pkg/collector/runner/runner.go:327 in work) | check:network | Done running check

Describe what happened:
I have a monitor set up to trigger when system.net.bytes_sent exceeds 500000 within five minutes. However, since I installed the agent on my Raspberry Pi running Ubuntu, the reported network traffic has climbed steadily.

[Image: Network traffic (bytes per sec)]

The scale in the image is MiB/s.

Describe what you expected:
I think the agent is summing the bytes instead of reporting the average. I would expect the traffic to rise and fall, but it only keeps increasing. Restarting the Pi does not reset the number.

Steps to reproduce the issue:

  • Edit the /etc/datadog-agent/datadog.yaml file and force a new hostname
  • View the host's dashboard in datadog
  • Watch bytes_received slowly go higher and higher.

Additional environment details (Operating System, Cloud provider, etc):

  • Raspberry Pi 4
  • Ubuntu 64-bit

EDIT ~24 hours later: The Raspberry Pi has a 100MiB/s network port. The total "sent" is well over that at 129MiB/s.
[Image: Network traffic (bytes per sec)]

Labels: kind/bug, team/agent-core, team/agent-platform

Most helpful comment

Hey @sohmc and @thebarbershop,

Thanks for the report! There indeed is an issue with metrics in the ARM64 Agent since 6.24.0 and 7.24.0, where non-gauge metrics are processed as gauge metrics. This is especially visible for rate metrics, as the "sum" is sent instead of diffs.

There is a PR on the way to fix this: https://github.com/DataDog/datadog-agent/pull/7125, and the current plan is to get it included in Agent 6.25.1 and 7.25.1. In the meantime, I recommend downgrading to a previous version of the Agent (6.23.1 and 7.23.1 are the latest unaffected versions).
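To illustrate the difference the comment describes, here is a minimal Python sketch. It is not the Agent's actual code: the function names and the sample readings are hypothetical, and it simply assumes the check reads a cumulative byte counter at fixed intervals. Treated as a gauge, the raw counter is reported and only ever climbs. Treated as a rate, the per-second difference between consecutive readings is reported.

```python
def as_gauge(samples):
    """Report each raw counter reading unchanged (the symptom seen in this issue)."""
    return [value for _, value in samples]


def as_rate(samples):
    """Report the per-second difference between consecutive counter readings."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


# Hypothetical (timestamp_seconds, cumulative_bytes_sent) readings, 15 s apart.
samples = [(0, 1_000_000), (15, 1_150_000), (30, 1_225_000), (45, 1_525_000)]

print(as_gauge(samples))  # [1000000, 1150000, 1225000, 1525000]
print(as_rate(samples))   # [10000.0, 5000.0, 20000.0]
```

The gauge series can never decrease while the counter keeps growing, which matches the steadily rising graph in the report. The rate series rises and falls with actual traffic.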

All 8 comments

Looks like I've reached the limit of the metric:

  • 538.97 MiB/s received
  • 1.1 GiB/s Sent

[Image: Network traffic (bytes per sec)]

Scale in this image is GiB/s

I can reproduce this with datadog-agent 6.24.1, Ubuntu 20.04, ARM64 but it doesn't happen if I change the CPU architecture to x86_64.

How can I install a specific version? I tried the command below, but 6.24.1 was installed.

DD_API_KEY=XXXXXXXXXXXX bash -c "$(curl -kL https://raw.githubusercontent.com/DataDog/datadog-agent/6.23.1/cmd/agent/install_script.sh)"

I installed datadog-agent 6.23.1 as shown below. Is this the best way to install it?

DD_AGENT_FLAVOR="datadog-agent=1:6.23.1-1" DD_AGENT_MAJOR_VERSION=6 DD_API_KEY=$DD_API_KEY DD_SITE="datadoghq.com" bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script.sh)"

If you already have the datadog-agent package installed on your host, the recommended way to downgrade it from the CLI is described in our public documentation: https://docs.datadoghq.com/agent/faq/agent-downgrade-minor/#cli, in this case:

sudo apt-get update && sudo apt-get install --allow-downgrades datadog-agent=1:6.23.1-1
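When deciding whether a host needs this downgrade, the version range mentioned in this thread can be expressed as a small check. The following Python sketch is only an illustration: `is_affected` and `parse_version` are hypothetical helpers, the architecture strings are assumptions, and the upper bound assumes the fix shipped in both 6.25.1 and 7.25.1 as planned.

```python
def parse_version(text):
    """Turn a plain 'major.minor.patch' string such as '6.24.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version, arch):
    """Return True if this Agent version and architecture fall in the range named in this thread.

    Assumption: affected builds are ARM64 Agents from x.24.0 up to, but not
    including, x.25.1, for major versions 6 and 7.
    """
    if arch not in ("arm64", "aarch64"):  # assumed spellings of the architecture
        return False
    major, minor, patch = parse_version(version)
    if major not in (6, 7):
        return False
    return (24, 0) <= (minor, patch) < (25, 1)


print(is_affected("6.24.1", "arm64"))   # True
print(is_affected("6.23.1", "arm64"))   # False
print(is_affected("6.24.1", "x86_64"))  # False
```

The sketch expects the bare version number, so an apt package version such as 1:6.23.1-1 would need its epoch and revision stripped first.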

Darn, here I was hoping that I'd pass enough data to exceed the speed of light!

I look forward to the update!

Hi all, the latest version of the Datadog Agent (7.25.1) includes a fix for this issue. You can update the datadog-agent package to get this version. I am closing this issue given the fix, but please let us know if you encounter it again.
