Is your feature request related to a problem? Please describe.
The StatsD metrics ingestion format is in wide use now. It would be great if VictoriaMetrics could accept data in this format. Several third-party StatsD server implementations exist:
node.js and slowness.remote_write API. The main issue with this implementation is operational complexity: you need to bring Prometheus into the architecture. Such a diversity of possible solutions, each with its own pros and cons, complicates using VictoriaMetrics for collecting StatsD metrics.
Describe the solution you'd like
It would be great to create a standalone, officially supported vmstatsd app for collecting StatsD metrics and pushing them into VictoriaMetrics. The app must have the following properties:
FYI, https://github.com/atlassian/gostatsd is quite a promising project. It is written in Go and supports tags.
Update: vmstatsd should be a part of vmagent in the future.
Another one, very lightweight and quite powerful: https://github.com/smira/go-statsd
@cristaloleg it is only a client, not a server ...
I don't think statsd_exporter is very complicated; we can use vmagent to scrape statsd_exporter. I think the main problem is that its histograms are too hard to use.
Another problem is that the original StatsD does not support tagging, and the tagging format differs between implementations. I prefer the Datadog implementation because they provide SDKs for more languages!
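To make the tagging point concrete, here is a minimal sketch of parsing a Datadog-style (DogStatsD) line of the form `name:value|type|@sample_rate|#tag1:value1,tag2:value2`. The `Sample` type and `parseLine` function are hypothetical names invented for this example, not part of vmagent or any existing library, and the sketch assumes a single numeric value per line.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Sample is a hypothetical container for one parsed line.
type Sample struct {
	Name       string
	Value      float64
	Type       string            // e.g. "c" (counter), "g" (gauge), "ms" (timer)
	SampleRate float64           // defaults to 1 when "@rate" is absent
	Tags       map[string]string // from the "#k:v,k:v" section
}

// parseLine parses "name:value|type[|@rate][|#k1:v1,k2:v2]".
// Assumption: one numeric value per line; other variants are rejected.
func parseLine(line string) (Sample, error) {
	s := Sample{SampleRate: 1, Tags: map[string]string{}}
	parts := strings.Split(line, "|")
	if len(parts) < 2 {
		return s, fmt.Errorf("missing metric type in %q", line)
	}
	name, rawValue, ok := strings.Cut(parts[0], ":")
	if !ok || name == "" {
		return s, fmt.Errorf("missing name or value in %q", line)
	}
	value, err := strconv.ParseFloat(rawValue, 64)
	if err != nil {
		return s, fmt.Errorf("bad value in %q: %w", line, err)
	}
	s.Name, s.Value, s.Type = name, value, parts[1]

	// Optional sections may appear in any order after the type.
	for _, p := range parts[2:] {
		switch {
		case strings.HasPrefix(p, "@"):
			rate, err := strconv.ParseFloat(p[1:], 64)
			if err != nil {
				return s, fmt.Errorf("bad sample rate in %q: %w", line, err)
			}
			s.SampleRate = rate
		case strings.HasPrefix(p, "#"):
			for _, tag := range strings.Split(p[1:], ",") {
				if tag == "" {
					continue
				}
				// A tag without ":" is stored with an empty value.
				key, val, _ := strings.Cut(tag, ":")
				s.Tags[key] = val
			}
		}
	}
	return s, nil
}

func main() {
	s, err := parseLine("page.views:3|c|@0.5|#env:prod,region:eu")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```

The tags map could then be turned into Prometheus-style labels; how names and label keys should be sanitized is left open here.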
Based on my conclusion above, I plan to implement statsd in vmagent like this:
We don't need to refer to any more statsd server implementations, just fork a copy of the metrics and change it.
I'm trying to implement it this way, feel free to discuss with me if you have more ideas! https://github.com/faceair/VictoriaMetrics/commits/vmstatsd
At times when StatsD traffic is particularly high, we also need a pre-aggregation and routing component that forwards metrics with the same name to the same vmagent instance for final aggregation.
The good news is that it looks like most of the code for this component should be shared with the statsd part of the vmagent.
So can this component be named vmstatsd? This should be a separate process that is deployed before vmagent.
This approach increases the complexity of the architecture.
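As a rough illustration of the routing idea (same metric name always goes to the same vmagent instance), here is a minimal sketch that hashes the metric name and picks a backend by modulo. The `pickBackend` function and the backend addresses are hypothetical; this is not an existing vmagent or vmstatsd API.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBackend returns the index of the backend that should receive
// every sample carrying the given metric name.
// Assumption: backends > 0 and the backend list is stable.
func pickBackend(metricName string, backends int) int {
	h := fnv.New32a()
	h.Write([]byte(metricName)) // hash.Hash writes never return an error
	return int(h.Sum32() % uint32(backends))
}

func main() {
	// Hypothetical vmagent addresses, for illustration only.
	backends := []string{"vmagent-0:8125", "vmagent-1:8125", "vmagent-2:8125"}
	for _, name := range []string{"page.views", "db.latency", "page.views"} {
		fmt.Printf("%s -> %s\n", name, backends[pickBackend(name, len(backends))])
	}
}
```

Plain modulo hashing remaps most names whenever the number of backends changes; a consistent-hashing scheme would reduce that churn at the cost of extra code.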
Alternatively, we can add a label to the data from multiple vmagents, similar to the instance label. As with Prometheus metrics today, the query would aggregate the counter metrics from multiple vmagents in the PromQL query engine.
This approach multiplies the number of metric writes several times over and increases the query overhead.