Vector: Define how metrics are converted to logs

Created on 9 Sep 2020  ·  3 Comments  ·  Source: timberio/vector

#3552 introduces a new metric_to_log transform (a very welcome addition). An issue was raised in that PR about the shape of log events as they are created from metric events (https://github.com/timberio/vector/pull/3552#issuecomment-684039516). Currently, that PR blindly converts internal metrics to logs in a way that inherits our internal metrics schema. There are a few problems with this:

  1. The coupling means that any change to our metrics data model will be a breaking change for this use case.
  2. I'm not sure our internal metrics data model, in its exact form, is the best log-form representation of this data.
  3. We should consider converting all metrics to absolute values since relative values are not useful in the context of logging (ex: increments and decrements for counters). Humio recommended the Carbon 2.0 format for this.
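
To illustrate point 3, here is a minimal sketch of folding relative counter events into absolute totals before they are emitted as logs. It assumes events shaped like the serialized example later in this thread, and it assumes relative events carry `"kind": "incremental"`; that value and the `to_absolute` helper are hypothetical names used only for illustration, not the transform's actual behavior.

```python
from collections import defaultdict


def to_absolute(events):
    """Rewrite counter events so every emitted value is a running total.

    Assumption: relative events are marked with kind == "incremental".
    A series is identified by its name plus its sorted tag pairs.
    """
    totals = defaultdict(float)
    converted = []
    for event in events:
        series = (event["name"], tuple(sorted(event.get("tags", {}).items())))
        value = event["counter"]["value"]
        if event["kind"] == "incremental":
            totals[series] += value   # accumulate the delta
        else:
            totals[series] = value    # an absolute value resets the total
        converted.append(
            {**event, "kind": "absolute", "counter": {"value": totals[series]}}
        )
    return converted


increments = [
    {"name": "login.count", "kind": "incremental", "tags": {"host": "my.host.com"},
     "counter": {"value": v}}
    for v in (1, 2, 3)
]
absolute = to_absolute(increments)
```

Note that this requires keeping state per series, which is one cost of emitting only absolute values.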

This is blocking progress for a few large Vector users.

data model metrics must rfc task

All 3 comments

This was discussed very briefly in #2021, but the scope was pretty much limited to console output. We can stick with the current serialization structure, but it's worth discussing if there are any implementation details that would be better hidden or warts that should be smoothed out.

The current serialization structure is a pretty direct translation of our internal model:

{
  "name": "login.count",
  "timestamp": "2020-11-07T21:15:47+00:00",
  "kind": "absolute",
  "tags": {
    "host": "my.host.com"
  },
  "counter": {
    "value": 24.2
  }
}

Metric types other than counters will likewise be embedded directly, matching their structure here.

One thing that's a little non-standard is the way that we embed a StatisticKind in the Distribution data type that specifies whether it is intended to be rolled up into a histogram or a summary. An alternative would be to split these into two different types (e.g. HistogramSample and SummarySample), but the structure is truly the same and the name Distribution is relatively good.

Another is the way we expose the concept of relative vs absolute measurements. Again, an alternative could be to push that down into different types (e.g. Counter vs CounterDiff) but the naming could be confusing and certain concepts may overlap (e.g. are absolute counters just gauges?).

Overall, I think our current format is a reasonably good one for converting metrics into a structured log message. I'd be curious to hear any feedback from people with experience using tools like Humio to ingest metrics, but I assume any structured format like this would work reasonably well.

Hey guys 👋 was just perusing your repo and saw this. At Logflare we're trying to advocate for the same approach as Humio of "log your metrics" because "metrics are logs", so I think this is awesome. Some vendors may not be able to take nested data. Honeycomb, for example, can't take nested data. So I think you're either going to be fine with this proposed format as a normal JSON object, or you'll need to flatten it for people. But I do know you have a Honeycomb sink, right? So idk, you probably know all this. Anyways, thanks for being awesome. ✌️
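
For sinks that cannot accept nested data, the flattening mentioned above could look roughly like this. It is a sketch only: the `flatten` helper and the dot separator are assumptions made for illustration, not what any Vector sink actually does.

```python
def flatten(event, parent_key="", sep="."):
    """Collapse nested dicts into a single level with joined key paths."""
    flat = {}
    for key, value in event.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects such as "tags" and "counter".
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


event = {
    "name": "login.count",
    "timestamp": "2020-11-07T21:15:47+00:00",
    "kind": "absolute",
    "tags": {"host": "my.host.com"},
    "counter": {"value": 24.2},
}
flat_event = flatten(event)
```

With the example event from this thread, the nested fields become `tags.host` and `counter.value`, while the top-level fields are unchanged.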

Thanks @lukesteensen

> An alternative would be to split these into two different types (e.g. HistogramSample and SummarySample), but the structure is truly the same and the name Distribution is relatively good.

I agree, I think Distribution is fine.

> Again, an alternative could be to push that down into different types (e.g. Counter vs CounterDiff) but the naming could be confusing and certain concepts may overlap (e.g. are absolute counters just gauges?).

Agree again. Counter is fine.

And to clarify my thoughts here. Relative values aren't necessarily useless in the logging context. I can still compute the rate of change for that value, just not the absolute total without the entire data set. I was originally thinking we would drop them entirely because of this.
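
As a rough illustration of that point, a rate of change can be derived from relative values alone. The sketch below assumes each sample is a `(unix_seconds, increment)` pair sorted by time, and that each increment covers the interval since the previous sample; both the sample shape and the `rates_per_second` helper are hypothetical.

```python
def rates_per_second(samples):
    """Compute a per-second rate for each sample after the first.

    Assumption: the increment reported at time t covers the interval
    since the previous sample, so rate = increment / elapsed seconds.
    """
    rates = []
    for (prev_ts, _), (ts, delta) in zip(samples, samples[1:]):
        elapsed = ts - prev_ts
        if elapsed > 0:  # skip duplicate or out-of-order timestamps
            rates.append((ts, delta / elapsed))
    return rates


samples = [(0, 5), (10, 20), (20, 10)]
rates = rates_per_second(samples)
```

The absolute total, by contrast, cannot be recovered this way unless every increment since the start of the series is available.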

> Overall, I think our current format is a reasonably good one for converting metrics into a structured log message.

Sounds good. I wanted to make sure we paused to think about this before rolling it out everywhere.


Thanks @chasers.

> Honeycomb for example can't take nested data.

Yep, I've opened #3792 to double check that.
