VictoriaMetrics: Create `vmstatsd` app for collecting data in StatsD and DogStatsD formats

Created on 9 Oct 2019 · 6 Comments · Source: VictoriaMetrics/VictoriaMetrics

Is your feature request related to a problem? Please describe.
The StatsD metrics ingestion format is in wide use now. It would be great if VictoriaMetrics could accept data in this format. Several third-party StatsD server implementations exist:

  • The original StatsD server written in node.js. This server can write data directly to VictoriaMetrics via the Graphite protocol. However, it has the following issues: it depends on node.js and it is slow.
  • statsd_exporter. Prometheus can scrape data from it and then push the scraped data to VictoriaMetrics via the remote_write API. The main issue with this implementation is operational complexity: you need to bring Prometheus into the architecture.
  • DogStatsD. It supports popular extensions to StatsD such as tags and new data types. But it is unclear whether it can push data to third-party storage systems such as VictoriaMetrics, or whether it can send data only to DataDog.

Such a diversity of possible solutions, each with its own pros and cons, complicates the use of VictoriaMetrics for collecting StatsD metrics.

Describe the solution you'd like
It would be great to create a standalone, officially supported vmstatsd app for collecting StatsD metrics and pushing them into VictoriaMetrics. The app must have the following properties:

  • It must be written in Go. This would simplify further maintenance.
  • It must be optimized for speed and low memory usage.
  • It must support input data types and formats from the original StatsD server.
  • It must support DogStatsD extensions such as tags and additional data types (except strings).
  • It must be standalone, so it could be used with arbitrary time series databases that support Graphite plaintext protocol.
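
To make the requirements above more concrete, here is a minimal sketch of parsing one DogStatsD-style line (`name:value|type|@rate|#tag:val,...`) and rendering it as a tagged Graphite plaintext line. The `Sample` type and the `parseLine` and `graphiteLine` functions are hypothetical names invented for this illustration; they are not part of VictoriaMetrics. The sketch forwards the raw value and ignores aggregation and gauge delta semantics, both of which a real server would need to handle.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Sample is one parsed StatsD/DogStatsD line.
// The type and its field names are hypothetical, chosen for this sketch.
type Sample struct {
	Name  string
	Value float64
	Type  string            // "c", "g", "ms", "h", ...
	Rate  float64           // sample rate; 1 when the "@rate" section is absent
	Tags  map[string]string // from the DogStatsD "#k:v,k2:v2" section
}

// parseLine parses "name:value|type[|@rate][|#tag:val,...]".
func parseLine(line string) (Sample, error) {
	s := Sample{Rate: 1, Tags: map[string]string{}}
	parts := strings.Split(strings.TrimSpace(line), "|")
	if len(parts) < 2 {
		return s, fmt.Errorf("missing type in %q", line)
	}
	name, val, ok := strings.Cut(parts[0], ":")
	if !ok || name == "" {
		return s, fmt.Errorf("missing name or value in %q", line)
	}
	v, err := strconv.ParseFloat(val, 64)
	if err != nil {
		return s, fmt.Errorf("bad value in %q: %w", line, err)
	}
	s.Name, s.Value, s.Type = name, v, parts[1]
	for _, p := range parts[2:] {
		switch {
		case strings.HasPrefix(p, "@"): // sample rate section
			r, err := strconv.ParseFloat(p[1:], 64)
			if err != nil || !(r > 0 && r <= 1) {
				return s, fmt.Errorf("bad sample rate in %q", line)
			}
			s.Rate = r
		case strings.HasPrefix(p, "#"): // tag section
			for _, kv := range strings.Split(p[1:], ",") {
				if kv == "" {
					continue
				}
				k, tv, _ := strings.Cut(kv, ":")
				s.Tags[k] = tv
			}
		}
	}
	return s, nil
}

// graphiteLine renders the sample as "name;k=v;... value timestamp\n",
// the tagged form of the Graphite plaintext protocol. Tags are sorted
// so the output is deterministic.
func graphiteLine(s Sample, unixSec int64) string {
	keys := make([]string, 0, len(s.Tags))
	for k := range s.Tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(s.Name)
	for _, k := range keys {
		fmt.Fprintf(&b, ";%s=%s", k, s.Tags[k])
	}
	b.WriteString(" " + strconv.FormatFloat(s.Value, 'f', -1, 64))
	fmt.Fprintf(&b, " %d\n", unixSec)
	return b.String()
}

func main() {
	s, err := parseLine("page.views:1|c|@0.5|#env:prod,region:eu")
	if err != nil {
		panic(err)
	}
	fmt.Print(graphiteLine(s, 1570579200))
}
```

Rendering to the Graphite plaintext format is what would let such an app stay usable with any time series database that accepts that protocol, as the last requirement asks.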
enhancement

All 6 comments

FYI, this is quite a promising project: https://github.com/atlassian/gostatsd . It is written in Go and it supports tags.

Update: vmstatsd should be a part of vmagent in the future.

Another one, very lightweight and quite powerful: https://github.com/smira/go-statsd

@cristaloleg it is only a client, not a server ...

I don't think statsd_exporter is very complicated; we can use vmagent to scrape statsd_exporter. I think the main problem is that its histogram is too hard to use.

Another problem is that the original statsd does not support tagging, and the tagging format differs between implementations. I prefer the datadog implementation because they wrote SDKs in more languages!

Based on my conclusion above, I plan to implement statsd in vmagent like this:

  1. support datadog's statsd format
  2. the four metric types of count/gauge/timing/histogram are supported, with the histogram type using the VictoriaMetrics histogram.
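
As a rough illustration of how the four types in the plan above might be handled, here is a minimal sketch using only the Go standard library. It is not the code from the linked branch. The `aggregator` type and its `add` method are hypothetical names. Timing and histogram values are only collected here; in the plan above they would be fed into a VictoriaMetrics histogram instead. The type codes `c`, `g`, `ms` and `h` are the ones used by the DogStatsD format.

```go
package main

import "fmt"

// aggregator keeps in-memory state per metric name.
// All names here are hypothetical, chosen for this sketch.
type aggregator struct {
	counters     map[string]float64   // running sums
	gauges       map[string]float64   // last value seen
	observations map[string][]float64 // raw timing/histogram values
}

func newAggregator() *aggregator {
	return &aggregator{
		counters:     map[string]float64{},
		gauges:       map[string]float64{},
		observations: map[string][]float64{},
	}
}

// add applies one sample. rate is the client-side sample rate in (0, 1].
func (a *aggregator) add(name, typ string, value, rate float64) error {
	if !(rate > 0 && rate <= 1) {
		return fmt.Errorf("bad sample rate %v for %q", rate, name)
	}
	switch typ {
	case "c": // count: scale by the sample rate to estimate the true total
		a.counters[name] += value / rate
	case "g": // gauge: keep the most recent value
		a.gauges[name] = value
	case "ms", "h": // timing and histogram: record the observation
		a.observations[name] = append(a.observations[name], value)
	default:
		return fmt.Errorf("unsupported metric type %q", typ)
	}
	return nil
}

func main() {
	a := newAggregator()
	_ = a.add("hits", "c", 1, 0.5)
	_ = a.add("hits", "c", 1, 0.5)
	_ = a.add("queue.size", "g", 7, 1)
	_ = a.add("request.duration", "ms", 320, 1)
	fmt.Println(a.counters["hits"], a.gauges["queue.size"], a.observations["request.duration"])
}
```

Dividing counter increments by the sample rate is the usual way to compensate for client-side sampling; whether timings should be weighted the same way is a design decision this sketch leaves open.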

We don't need to refer to any other statsd server implementations; we can just fork a copy of the metrics and change it.

I'm trying to implement it this way, feel free to discuss with me if you have more ideas! https://github.com/faceair/VictoriaMetrics/commits/vmstatsd

When statsd traffic is particularly high, we also need a pre-aggregation and routing component that forwards metrics with the same name to the same vmagent instance for final aggregation.
The good news is that it looks like most of the code for this component should be shared with the statsd part of the vmagent.
So can this component be named vmstatsd? This should be a separate process that is deployed before vmagent.
This approach increases the complexity of the architecture.
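
The routing idea can be sketched as hashing the metric name to pick a vmagent instance, so that every sample for a given name lands on the same instance. This is only an illustration under assumptions: `pickBackend` and the backend addresses are hypothetical, and plain modulo hashing is used for brevity.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBackend maps a metric name to one of the backends.
// The same name always maps to the same backend as long as
// the backends slice does not change.
func pickBackend(metricName string, backends []string) string {
	if len(backends) == 0 {
		return ""
	}
	h := fnv.New32a()
	h.Write([]byte(metricName)) // Write on a hash.Hash never returns an error
	return backends[h.Sum32()%uint32(len(backends))]
}

func main() {
	// Hypothetical vmagent addresses, for illustration only.
	backends := []string{"vmagent-0:8125", "vmagent-1:8125", "vmagent-2:8125"}
	for _, name := range []string{"page.views", "request.duration", "page.views"} {
		fmt.Printf("%s -> %s\n", name, pickBackend(name, backends))
	}
}
```

With modulo hashing, adding or removing an instance remaps most metric names; a consistent-hashing scheme would reduce that churn at the cost of more code.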

Alternatively, we can add a label to the data from each vmagent, similar to the instance label. As with Prometheus metrics today, the query would then aggregate the counters from multiple vmagents in the PromQL query engine.
This approach multiplies the number of metric writes several times over and increases the query overhead.
