It would be nice to have the ability to stream metrics to Datadog.
In conjunction with the `log_to_metric` transform, this would make it possible to, for example, ingest a single data stream, simultaneously store it in S3, and report the number of ingested lines to Datadog as a metric.
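A rough sketch of such a pipeline (the `datadog_metrics` sink type is hypothetical here, and the `log_to_metric` options shown are illustrative):

```toml
[sources.in]
type = "stdin"

# store the raw stream in S3
[sinks.archive]
type = "aws_s3"
inputs = ["in"]
bucket = "my-bucket"  # illustrative
region = "us-east-1"  # illustrative

# count ingested lines as a metric
[transforms.count_lines]
type = "log_to_metric"
inputs = ["in"]

[[transforms.count_lines.metrics]]
type = "counter"
field = "message"
name = "lines_ingested"

# hypothetical sink proposed in this issue
[sinks.datadog]
type = "datadog_metrics"
inputs = ["count_lines"]
```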
@loony-bean I've assigned this to you. I think this would be a good sink to add now that the metrics data model is mostly finalized. What do you think?
One thing worth discussing -- Datadog accepts logs in addition to metrics. To start, we should only support the metrics use case, but I would like to design it in such a way that it could forward log events as well. Do you see any reason a single `datadog` sink couldn't handle both? I'm mostly thinking about configuration and data flow. For example, it might be worth having separate `datadog_metrics` and `datadog_logs` sinks? This would follow the same pattern we set for the `aws_cloudwatch_logs` and `aws_cloudwatch_metrics` sinks.
I need to read the docs more thoroughly to understand possible corner cases (for example, DD does not support relative gauges or native StatsD timers), but in general I think we are good to go. We can start by simply sending UDP datagrams to the DD Agent.
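For reference, a minimal Node.js sketch of what sending a datagram to the Agent looks like (the metric name and tag are made up; DogStatsD listens on UDP port 8125 by default):

```js
const dgram = require('dgram')

// DogStatsD wire format: <metric>:<value>|<type>[|#<tag>:<value>,...]
const datagram = Buffer.from('vector.lines.ingested:1|c|#source:stdin')

const socket = dgram.createSocket('udp4')
socket.send(datagram, 8125, '127.0.0.1', err => {
  if (err) console.error(err)
  socket.close()
})
```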
I believe `datadog_metrics` is a fine name, but adding it will probably require us to rename our `statsd` source for consistency.
In the meantime, it is possible to send metrics to Datadog using the stateful `javascript` transform (#721) combined with the `http` sink.
Example:
`vector.toml`:

```toml
[sources.events-stream]
type = "stdin"

[transforms.datadog-batch-metrics]
type = "javascript"
inputs = ["events-stream"]
path = "datadog.js"
handler = "handler"

[sinks.datadog-send-metrics]
type = "http"
inputs = ["datadog-batch-metrics"]
encoding = "text"
compression = "gzip"
uri = "https://api.datadoghq.com/api/v1/series?api_key=${DATADOG_API_KEY}"
```
`datadog.js`:

```js
const METRIC_NAME = 'vector.metric.test'
const MAX_TIMEOUT_MS = 5000
const MAX_POINTS = 100

const points = new Map()
let prevTs = Date.now()

const handler = event => {
  // increment the counter corresponding to the given second
  const key = Math.floor(event.timestamp / 1000)
  points.set(key, (points.get(key) || 0) + 1)

  const ts = Date.now()
  if (ts - prevTs >= MAX_TIMEOUT_MS || points.size >= MAX_POINTS) {
    // prepare the request payload, see
    // https://docs.datadoghq.com/api/?lang=bash#post-timeseries-points
    const payload = {
      series: [{
        points: Array.from(points.entries()),
        metric: METRIC_NAME,
        type: 'count',
        host: event.host // for simplicity, assume all events have the same host
      }]
    }
    points.clear()
    prevTs = ts
    return {
      message: JSON.stringify(payload)
    }
  } else {
    return null
  }
}
```
The environment variable `DATADOG_API_KEY` should contain the corresponding API key.
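For reference, the request body produced by the handler looks roughly like this (timestamps and counts are illustrative):

```json
{
  "series": [{
    "points": [[1565000000, 3], [1565000001, 5]],
    "metric": "vector.metric.test",
    "type": "count",
    "host": "my-host"
  }]
}
```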
One issue with this approach is that metrics for events from the last batch would never be sent, since the handler only flushes when a new event arrives after the timeout has elapsed or the batch limit is reached.