Vector: New `datadog_metrics` sink

Created on 4 Jul 2019 · 3 Comments · Source: timberio/vector

It would be nice to have the ability to stream metrics to Datadog.

In conjunction with the log_to_metric transform, this would make it possible to, for example, ingest a single data stream, store it to S3, and simultaneously report the number of ingested lines to Datadog as a metric.
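A minimal sketch of that pipeline, assuming the option names documented for Vector's log_to_metric transform at the time (the bucket, field, and metric names here are purely illustrative):

```toml
[sources.in]
  type = "stdin"

# Archive the raw stream to S3.
[sinks.archive]
  type = "aws_s3"
  inputs = ["in"]
  bucket = "my-logs"        # hypothetical bucket name
  region = "us-east-1"

# Count every ingested line as a counter metric.
[transforms.count_lines]
  type = "log_to_metric"
  inputs = ["in"]

  [[transforms.count_lines.metrics]]
    type = "counter"
    field = "message"
    name = "lines_ingested"

# The counter would then feed the sink this issue proposes:
# [sinks.dd]
#   type = "datadog_metrics"   # does not exist yet
#   inputs = ["count_lines"]
```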

Labels: data model, metrics, feature

All 3 comments

@loony-bean I've assigned this to you. I think this would be a good sink to add now that the metrics data model is mostly finalized. What do you think?

One thing worth discussing -- Datadog accepts logs in addition to metrics. To start, we should only support the metrics use case but I would like to design it in such a way that it would forward log events as well. Do you see any reason a single datadog sink couldn't handle both? I'm mostly thinking about configuration and data flow. Ex: it might be worth having a datadog_metrics and datadog_logs sink? This would follow the same pattern we set for the aws_cloudwatch_logs and aws_cloudwatch_metrics sinks.

I need to read the docs more thoroughly to understand possible corner cases (for example, DD does not support relative gauges or native StatsD timers), but in general I think we are good to go. We can start by simply sending UDP datagrams to the DD Agent.

I believe datadog_metrics is a fine name, but adding this will probably require us to rename our statsd source for consistency.

In the meantime it is possible to send metrics to Datadog using a stateful JavaScript transform (#721) combined with an http sink.

Example:

vector.toml

[sources.events-stream]
  type = "stdin"

[transforms.datadog-batch-metrics]
  type = "javascript"
  inputs = ["events-stream"]
  path = "datadog.js"
  handler = "handler"

[sinks.datadog-send-metrics]
  type = "http"
  inputs = ["datadog-batch-metrics"]
  encoding = "text"
  compression = "gzip"
  uri = "https://api.datadoghq.com/api/v1/series?api_key=${DATADOG_API_KEY}"

datadog.js

const METRIC_NAME = 'vector.metric.test'
const MAX_TIMEOUT_MS = 5000
const MAX_POINTS = 100

const points = new Map()
let prevTs = Date.now()

const handler = event => {
  // increment counter corresponding to the given second
  const key = Math.floor(event.timestamp / 1000)
  points.set(key, (points.get(key) || 0) + 1)

  const ts = Date.now()
  if (ts - prevTs >= MAX_TIMEOUT_MS || points.size >= MAX_POINTS) {
    // prepare request payload
    // see https://docs.datadoghq.com/api/?lang=bash#post-timeseries-points
    const payload = {
      series: [{
        points: Array.from(points.entries()),
        metric: METRIC_NAME,
        type: 'count',
        host: event.host // for simplicity assume that all events have the same host
      }]
    }
    points.clear()
    prevTs = ts
    return {
      message: JSON.stringify(payload)
    }
  } else {
    return null
  }
}

The environment variable DATADOG_API_KEY should contain the corresponding API key.
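To see the batching behavior of datadog.js outside of Vector, here is a standalone sketch that reuses the same handler logic, with MAX_POINTS lowered to 3 so the flush triggers after three events (the timestamps and host are made up):

```javascript
// Standalone sketch of the handler's batching logic from datadog.js above,
// trimmed so it can run directly under node.
const METRIC_NAME = 'vector.metric.test'
const MAX_TIMEOUT_MS = 5000
const MAX_POINTS = 3 // lowered from 100 so the flush is easy to trigger

const points = new Map()
let prevTs = Date.now()

const handler = event => {
  // increment the counter for the second this event falls into
  const key = Math.floor(event.timestamp / 1000)
  points.set(key, (points.get(key) || 0) + 1)

  const ts = Date.now()
  if (ts - prevTs >= MAX_TIMEOUT_MS || points.size >= MAX_POINTS) {
    // flush: build a Datadog series payload from the buffered buckets
    const payload = {
      series: [{
        points: Array.from(points.entries()),
        metric: METRIC_NAME,
        type: 'count',
        host: event.host
      }]
    }
    points.clear()
    prevTs = ts
    return { message: JSON.stringify(payload) }
  }
  return null
}

// Simulate three events landing in three distinct seconds.
const base = 1562025600000 // arbitrary epoch time in ms
console.log(handler({ timestamp: base, host: 'web-1' }))        // null (1 bucket buffered)
console.log(handler({ timestamp: base + 1000, host: 'web-1' })) // null (2 buckets buffered)
const out = handler({ timestamp: base + 2000, host: 'web-1' })  // 3 buckets -> flush
console.log(out.message)
```

Each entry in `points` is a `[unix_seconds, count]` pair, which is the shape the Datadog series endpoint expects for count metrics.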

An issue with this approach is that metrics buffered in the last, partially filled batch are never sent, since the transform only flushes when a new event arrives.
