Opentelemetry-specification: ValueRecorder - default aggregation

Created on 5 Jun 2020 · 45 Comments · Source: open-telemetry/opentelemetry-specification

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/metrics/api.md#interpretation

Is it correct that ValueObserver uses the MinMaxSumCount aggregator instead of LastValue?
If this is not a bug, can you please explain how to track a metric when you are only interested in the last value? If I understand correctly, there is currently no way to do that.

api p1 required-for-ga metrics

All 45 comments

There's a slight distinction to make between the aggregation that's selected and whether it's possible to know the last value. If you use an observer instrument, there can be at most one value per label set per instrument. Therefore, the Max equals the Min value equals the Last value. I would stipulate that the appropriate representation for MinMaxSumCount when Count=1 is as a single value.
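
To make the Count=1 case concrete, here is a minimal Python sketch. The helper `min_max_sum_count` and its dictionary keys are hypothetical illustrations, not the SDK's actual aggregator. It shows that with one observation per label set every field collapses to the same value, while several recordings do not.

```python
def min_max_sum_count(values):
    """Hypothetical MMSC summary of the values seen in one collection interval."""
    return {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": len(values),
    }

# An observer reports at most one value per label set per interval.
observer_point = min_max_sum_count([42.0])

# A synchronous instrument may record many values in the same interval.
recorder_point = min_max_sum_count([3.0, 9.0, 4.0])

# With count == 1, min, max and sum all equal the single (last) value.
is_single_value = (
    observer_point["count"] == 1
    and observer_point["min"] == observer_point["max"] == observer_point["sum"]
)
```

In the second case the last value (4.0) cannot be recovered from the four fields, which is the gap the rest of this thread discusses.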

When aggregations are applied locally to an asynchronous instrument, then MinMaxSumCount produces data other than the LastValue aggregator would, because Count can be greater than 1. In these cases, I suspect LastValue is not what the user wants. I believe conceptually MinMaxSumCount is always a good choice.

I believe the real issue that you're raising is that users want a mapping from OTel metric instrument to a Gauge exporter in a legacy system. One approach that I've considered is to simply state that when exporting ValueObserver directly into a legacy metrics system from a client library, choose a Gauge instrument. This is probably correct in most cases. A slight modification would be to state that Gauge should be selected only when there is no local aggregation, that otherwise MinMaxSumCount should be chosen. In these situations, I would still specify that MinMaxSumCount is the correct default aggregation for the collector to use when receiving data via OTLP and there are no other configurations.

The other approach that's been discussed came up in #608. If we create two new Grouping instruments, one synchronous and one asynchronous (#625), then users will have a choice of whether they want their Grouping instrument to have MinMaxSumCount or LastValue aggregation by default. I think #608 is unlikely to pass given the current feedback, so this probably won't happen.

@bogdandrutu I'm interested in your thoughts here.

There was agreement in the SIG call today that we should change defaults to use the LastValue aggregation for ValueRecorder and ValueObserver. (I may propose an alternative, which would be called RandomValue, but either way I encourage you to go ahead and implement LastValue, since users are expecting Gauges in Prometheus and Statsd and it is the least expensive option.)

I will file an issue stating this later today.

Why would ValueRecorder have LastValue as the default aggregation? ValueObserver I understand, but what is the rationale for ValueRecorder not having MMSC as the default?

I'm struggling with this issue for ValueRecorder. The issue is that there are three conventional ways to expose data from a synchronous grouping instrument in Prometheus: as a Gauge, a Histogram, or a Summary. As it stands, ValueRecorder generates MMSC which looks like a Summary, so otel-collector exports a Prometheus summary which users are not enjoying.

I'm against creating new instruments that have identical semantics and only are distinguished by their default aggregator. If it were not for OTLP and the collector, I would try to change the spec to say that exporters will have a say in the default aggregator: if the Exporter has a good way to represent MMSC, then it should take that option, but if you're Prometheus, using a Gauge is the only "Good, Cheap" option. With OTLP and the collector involved, I am struggling to find a default behavior that will result in "Good, Cheap" behavior for Prometheus.

One idea I had was to change the default to MinMaxSumCountLast, i.e., add the last value to MMSC so that it can generate a Gauge or an MMSC, but that idea is bad from a statistical point of view. Plus, it's five numbers, not one, so it loses some of the advantages of LastValue.

A better idea I had is to introduce a RandomValue aggregator to use as the default for ValueRecorder. It will select a random item from the stream, uniformly, and record it as a single exemplar for the distribution. This qualifies as cheap, but is it good? I think it is, because aggregating this data gives us the distribution--with enough hosts or collection intervals the empirical distribution emerges. I'm afraid LastValue won't do in this case, because it is biased to the moment of collection.
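
One way such a RandomValue aggregator could be sketched is with reservoir sampling of size one. The class name, its fields, and the sample latencies below are assumptions made for illustration; nothing like this is defined in the spec.

```python
import random

class RandomValue:
    """Hypothetical aggregator: keeps one uniformly chosen value per interval."""

    def __init__(self, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self.count = 0
        self.sample = None

    def update(self, value):
        self.count += 1
        # Reservoir sampling with k=1: the n-th value replaces the kept one
        # with probability 1/n, so every value is equally likely to survive.
        if self._rng.randrange(self.count) == 0:
            self.sample = value

latencies_ms = [12.0, 15.0, 11.0, 250.0, 13.0]
agg = RandomValue(random.Random(0))  # seeded only to make the sketch repeatable
for latency in latencies_ms:
    agg.update(latency)
```

The state is one value plus a counter, which is what makes it cheap. Unlike LastValue, the kept value does not depend on which measurement happened to arrive just before collection.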

To summarize, I'm interested in changing the spec to say that the operator (i.e., installer of the SDK) and/or Exporters have a say in what the default aggregation should be, since even at the cheap end of the scale, systems differ in how they want to treat these measurements. Prometheus would prescribe Gauge for ValueRecorder. NewRelic would prescribe MinMaxSumCount for ValueRecorder, but this is not very satisfactory because OTLP has to make a decision. If OTLP used RandomValue as the default ValueRecorder aggregator, then exporters on the collector side will have to work with it. Prometheus can export the random value as a gauge value, but now we have a situation where Prometheus acts somewhat differently if exported locally vs through the collector. Maybe Prometheus should always export a random value as a gauge.

As it stands, ValueRecorder generates MMSC which looks like a Summary, so otel-collector exports a Prometheus summary which users are not enjoying.

I don't know much about Prometheus. Why is this such a bad experience?

I'd like to dig deeper into this, but my understanding is that Summary types have traditionally been used for quantiles, but the quantile summaries cannot be merged and are therefore generally discouraged. If Prometheus were changed to support a mergeable Summary this might not be a problem. Of course MMSC is trivially mergeable, but it's a special case from Prometheus' point of view.
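
As a small illustration of why MMSC is trivially mergeable, the sketch below combines two summaries field by field. The `MMSC` dataclass and the `merge` helper are hypothetical names, and the per-host numbers are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MMSC:
    """Hypothetical MinMaxSumCount data point."""
    minimum: float
    maximum: float
    total: float
    count: int

def merge(a: MMSC, b: MMSC) -> MMSC:
    # Each field combines independently, so the merge is exact.
    return MMSC(
        minimum=min(a.minimum, b.minimum),
        maximum=max(a.maximum, b.maximum),
        total=a.total + b.total,
        count=a.count + b.count,
    )

host_a = MMSC(minimum=1.0, maximum=9.0, total=20.0, count=4)
host_b = MMSC(minimum=0.5, maximum=7.0, total=15.0, count=3)
merged = merge(host_a, host_b)
```

Precomputed quantiles have no equivalent rule: the median of two hosts cannot be derived from each host's median alone.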

Another "solution" here would be to use MMSC to compute the average value, and let Prometheus report a Gauge with the average. (I still favor this RandomValue idea.)

I guess my concern is that if we're making a fairly draconian decision based solely on a prometheus limitation, that might not be the best reason to make that decision.

Having only a last-or-random value aggregation as the default for a ValueRecorder would be a really non-optimal out-of-the-box experience for our customers, for example. If I'm recording latency timings for something of medium throughput and only get a random value each reporting period, that seems like a terrible default compared with providing some kind of statistically interesting result. If we could add a mean to MMSC (making it MMMSC?), that seems much more generally useful out of the box across a wide spectrum of backends.

If we can find support for exporting a Gauge of the mean value in Prometheus, then I think it's a good solution. MMSC already has the average value (i.e., Sum / Count) so we don't have to change much other than specify how to export MMSC values to Prometheus.

I am not sure if "draconian" is entirely fair. :-) I think "frugal" or "thrifty" could also apply.

ah, yes, of course, you can compute the mean from the MMSC. foggy meeting brain got in the way of thinking. ;).

If we can find support for exporting a Gauge of the mean value in Prometheus, then I think it's a good solution. MMSC already has the average value (i.e., Sum / Count) so we don't have to change much other than specify how to export MMSC values to Prometheus.

Exporting the mean value could be very misleading. For long-running metrics that have collected a lot of measurements, recent measurements that are far from the mean will have a hard time triggering alerts which would be based off of mean instead of LastValue. For this same reason, RandomValue is also not what people will expect. Most of the open-source world is coming from Prometheus where gauges provide a LastValue snapshot in time. I truly believe LastValue is what people expect.
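
A worked example of that concern, using made-up numbers: 999 measurements near 10 followed by a spike to 500, with a hypothetical alert threshold of 100. The mean is computed as Sum / Count, as it would be from a cumulative MMSC.

```python
# Hypothetical long-running series: steady values, then one large spike.
values = [10.0] * 999 + [500.0]

total = sum(values)
count = len(values)

mean_gauge = total / count   # what a "mean" gauge would export
last_gauge = values[-1]      # what a LastValue gauge would export

threshold = 100.0            # hypothetical alerting threshold
mean_alert_fires = mean_gauge > threshold
last_alert_fires = last_gauge > threshold
```

Here the mean stays at about 10.49, so an alert on the mean does not fire, while an alert on the last value does.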

Why are we thinking that a ValueRecorder should map to a Gauge? This seems extremely counterintuitive to me. I really want a distribution as the output, and MMSC is the cheapest distribution money can buy.

I think we've been thinking too much about distributions as being stable with respect to time, and/or we've been assuming relatively short collection intervals. I've come around on this: knowing the last value tells us something time-dependent whereas otherwise the distribution statistics do not.

I will propose that we add a Last value to MMSC, making it something like MinMaxLastSumCount, and let it be the new default. This will allow exporting either value on the other side of an OTLP connection. In a local configuration, probably the extra work can be avoided.

In particular, I'd like it if the OTLP protocol specified that MMLSC of a single data point can be encoded as a single data point where Min == Max == Last == Sum and Count == 1.
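
A minimal sketch of that encoding rule follows. The `MinMaxLastSumCount` class and the dictionary returned by `encode` are hypothetical and do not reflect the actual OTLP message layout.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MinMaxLastSumCount:
    """Hypothetical MMLSC aggregator state."""
    minimum: float = float("inf")
    maximum: float = float("-inf")
    last: Optional[float] = None
    total: float = 0.0
    count: int = 0

    def update(self, value: float) -> None:
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        self.last = value
        self.total += value
        self.count += 1

def encode(agg: MinMaxLastSumCount) -> dict:
    # With one measurement, Min == Max == Last == Sum, so one number suffices.
    if agg.count == 1:
        return {"value": agg.last}
    return {
        "min": agg.minimum,
        "max": agg.maximum,
        "last": agg.last,
        "sum": agg.total,
        "count": agg.count,
    }

single = MinMaxLastSumCount()
single.update(7.5)

many = MinMaxLastSumCount()
for v in (3.0, 9.0, 4.0):
    many.update(v)
```

A receiver can then treat the single-value form as a plain gauge point and the five-field form as a summary with a last value attached.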

I think you lost me at "being stable with respect to time". Can you explain, or point this non-statistics-guru at some description of what this means?

If the MMSC has a start and end time on it, that seems to pretty well define the time period for what it is describing. I'm guessing you mean something else, though.

@jkwatson I was thinking of a "stable" distribution as one where there is no change in the distribution with respect to time, i.e., nothing time-dependent. We'll know the start and end time of the observation window, and knowing the last timestamped event value inside the window tells us still more about the events. (Knowing the first value is similar; knowing both is still more information.)

Comparing the last value with the mean value could tell you something about a tendency in the data, for example a rising mean with respect to time.

The last timestamp can be useful on its own. If there is a large gap between the last timestamp and the end of the window (compared with the expected arrival time), it might reveal that something bursty was happening earlier in the window.

I like the idea of including the "last" recording exemplar. 👍

I would like to ask again whether you have managed to decide. Currently we are using LastValue for ValueObserver and ValueRecorder. We are going to make a release soon, so if that is wrong it would be good to know and change it to whatever is the most correct one, thx :)

The decision is to use MinMaxLastSumCount. I don't see that it will matter for you at this point, since OTLP isn't ready for this change. A built-in Prometheus exporter should export the LastValue either way.

The decision is to use MinMaxLastSumCount. I don't see that it will matter for you at this point, since OTLP isn't ready for this change. A built-in Prometheus exporter should export the LastValue either way.

MinMaxLastSumCount? We have MinMaxSumCount, so there is no way of getting the LastValue from it. Is MinMaxLastSumCount a new aggregator?

MMLSC will replace MMSC and include a LastValue in addition to the other fields, this is to ensure that remote Statsd/Prometheus exporters have access to the last value.

See https://github.com/open-telemetry/oteps/pull/117 and https://github.com/open-telemetry/oteps/pull/118.

MMLSC will replace MMSC and include a LastValue in addition to the other fields, this is to ensure that remote Statsd/Prometheus exporters have access to the last value.

See open-telemetry/oteps#117 and open-telemetry/oteps#118.

works for me, thx

I think "default aggregation" for any observer should be "passthrough". All observers are aggregations calculated by a third-party aggregator (not in our SDK), and we should simply just report the data to the exporter:

  • SumObserver - is a monotonic cumulative Sum aggregation already.
  • UpDownSumObserver - is a non-monotonic cumulative Sum aggregation already.
  • ValueObserver - is a Gauge (unknown aggregation).

The SDK should simply configure a "passthrough" aggregation as default for all observers which will not do anything more than just pass data from the Observer instrument to the exporter.

I think we may want to interpret "default aggregation" differently when there are multiple points produced. For example, if a ValueObserver is configured with dimensionality reduction, then it may output multiple points with the same label set. This is why I've advocated using MinMaxLastSumCount as the default for ValueObserver: https://github.com/open-telemetry/oteps/pull/118

But I agree that when there is no dimensionality reduction, i.e., exactly one value per collection period, that Gauge is a good default.

"Dimensionality reduction" is something someone needs to configure; it is not the default. So the user should then configure "MinMaxLastSumCount", or whatever they believe makes more sense, when setting up the aggregation that reduces dimensions.

@jmacd @jkwatson I think we drifted away from the initial issue and went too far in proposing "MinMaxLastSumCount" or other things.

Is it correct that ValueObserver uses the MinMaxSumCount aggregator instead of LastValue? If this is not a bug, can you please explain how to track a metric when you are only interested in the last value? If I understand correctly, there is currently no way to do that.

As pointed out in the description (@obecny has not responded since then), I believe the issue was that under the current specification we exported "MinMaxSumCount" for "ValueObserver". We fixed that in OTLP, and we will export a "Gauge" for "ValueObserver".

From there we drifted to different topics, including "dimensionality reduction". If a user wants to define a "dimensionality reduction", they also need to say which aggregation to use, since we don't know the aggregation used by the third-party aggregator; in general they should use "MinMaxSumCount" or "Histogram" when defining "dimensionality reduction". There may be some confusion that a user can define "dimensionality reduction" without specifying an aggregation, which I think causes this problem. For me, the only config that a user can change without specifying an aggregation is to enable or disable the default "aggregation" for an instrument. Anything else requires the user to specify which aggregation they want.

We don't have MinMaxSumCount for ValueObserver but MinMaxLastSumCount, for the reasons you mentioned yourself. That was my exact concern when there was a change from LastValue to MinMaxSumCount, and we then agreed to change it to MinMaxLastSumCount, as I already mentioned. This is also what @jmacd mentioned here:
https://github.com/open-telemetry/opentelemetry-specification/issues/636#issuecomment-653238817

@obecny I would suggest that for ValueObserver we don't have any "MinMaxSumCount" just the "LastValue" that will be a "Gauge" in OTLP because this is exactly what we get from the Observer

@bogdandrutu
Currently we have MinMaxLastSumCount for VALUE_RECORDER and VALUE_OBSERVER. Should this be aligned so that VALUE_RECORDER also has LastValue, in which case we would have neither MinMaxLastSumCount nor MinMaxSumCount at all?

@obecny I think ValueObserver and ValueRecorder will be different (I think renaming ValueObserver to GaugeObserver may be needed to better signal what it does):

  • ValueObserver -> LastValue
  • ValueRecorder -> MinMaxSumCount
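
Expressed as a lookup, the proposal above might look like the following sketch. The table and the `default_aggregation` helper are hypothetical; they describe the proposal under discussion, not a settled spec or an SDK API.

```python
# Proposed defaults from this comment, keyed by instrument kind.
DEFAULT_AGGREGATION = {
    "ValueObserver": "LastValue",
    "ValueRecorder": "MinMaxSumCount",
}

def default_aggregation(instrument_kind, overrides=None):
    """Return the aggregation name, letting an explicit user choice win."""
    if overrides and instrument_kind in overrides:
        return overrides[instrument_kind]
    return DEFAULT_AGGREGATION[instrument_kind]

observer_default = default_aggregation("ValueObserver")
recorder_default = default_aggregation("ValueRecorder")
recorder_custom = default_aggregation(
    "ValueRecorder", overrides={"ValueRecorder": "Histogram"}
)
```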

@bogdandrutu is there already an updated spec for that, or at least a draft?

No, this is what I am trying to discuss here. Is this proposal good? Should we move forward and propose this in the specs?

If we are discussing this, then I would always add lastValue to MinMaxSumCount, making it MinMaxLastSumCount. Why? There are many reasons, but for one, both instruments have "Value" in their name, so I would always be interested in knowing the last value for both. With MinMaxSumCount I can't get the last value, which is weird. I'm fine with LastValue or MinMaxLastSumCount;
otherwise I cannot even display the metric easily to see whether it is changing. Just try to make a simple line graph showing how the value changed over time: you can't do it.

Ok, so if I understand correctly, you are saying that even for ValueRecorder (where the name Value may be confusing) you would want to see the LastValue. This may be arguable, but I can see some value in it.

But for the ValueObserver/GaugeObserver why do you need MinMaxSumCount?

Also if I have to always export the LastValue and the MinMaxSumCount I would do it in 2 different metrics, because the first is a Gauge and the second is trying to represent a distribution of values.

Ok, so if I understand correctly, you are saying that even for ValueRecorder (where the name Value may be confusing) you would want to see the LastValue. This may be arguable, but I can see some value in it.

But for the ValueObserver/GaugeObserver why do you need MinMaxSumCount?

Bogdan, I'm fine with either LastValue or MinMaxLastSumCount. Having only LastValue is much more valuable than MinMaxSumCount, as I can then compute the rest myself (min, max etc.), but without lastValue this is not possible and I cannot do any calculation.

I can see some value in having already the basic aggregators (min, max etc.) but only when I also have the lastValue, otherwise I would get rid of MinMaxSumCount in favor of LastValue only.

I'm a bit confused @obecny. Are you suggesting that if there are 1000 values recorded with a ValueRecorder over an interval, that we should still only report the LastValue? If so, you can't compute the min or the max or the count, which are all relevant and useful for ValueRecorder data points.

Also

Also if I have to always export the LastValue and the MinMaxSumCount I would do it in 2 different metrics, because the first is a Gauge and the second is trying to represent a distribution of values.

This is not a conclusion that I reach. LastValue does represent a distribution of values. It's the value with the maximum timestamp that was present in the data set. It's also a gauge, but you can think of Min and Max as gauges too. The Summary type as we had it (IMO) was able to represent N > 1 values in a single data point.

I'm a bit confused @obecny. Are you suggesting that if there are 1000 values recorded with a ValueRecorder over an interval, that we should still only report the LastValue? If so, you can't compute the min or the max or the count, which are all relevant and useful for ValueRecorder data points.

Imagine a simple situation: you want to create a CPU usage graph that shows the current value (last value), the min, and the max.
How would you do it?

  1. If I only have lastValue, I can display it and add some calculation. I would choose ValueObserver with the LastValue aggregator and then do the rest of the calculation myself.

  2. Now let's improve on point 1: I don't want to do any calculation myself. How can I achieve that? I would choose something that can keep more points and do some aggregation for me, which is where the ValueRecorder comes in. But because you want the ValueRecorder not to have lastValue, I need to use 2 metrics. Which is cheaper: using 2 metrics with 2 aggregators, or just one ValueRecorder with MinMaxLastSumCount?

If ValueRecorder has the MinMaxSumCount aggregator, why can't it simply hold the lastValue as well? What is the cost of that?

If ValueRecorder won't have MinMaxLastSumCount and I still don't want to use 2 metrics, I would rather create my own batcher in which I change the aggregator for ValueRecorder so that I can have it. But why would I do that if the ValueRecorder can already support it?

And just to be clear I'm only talking about ValueRecorder to have MinMaxLastSumCount aggregator, for ValueObserver LastValue is perfectly fine - originally it had MinMaxSumCount which was the reason why all of this conversation started.

@obecny very nice example, I would personally solve it differently:

  1. A lot of backends can calculate Min/Max on demand for a Gauge/LastValue metric, so you don't actually need to calculate anything yourself.
  2. If you want to report more granular measurements (more often), you can still use ValueObserver and configure a shorter export interval for this metric (not yet supported in our SDK, but it will be). Once you do that, you not only can see min/max for the more granular metric but also see spikes when displaying the gauge metric.
  3. I would never use ValueRecorder for CPU (or for any metric that is calculated outside of our library). If solution 2 is not acceptable because of the storage cost on the backend you choose, then with the "to be fully specified" views you will be able to set the "scrape time" to every 1 second for this ValueObserver and the export time to every 10 seconds, and produce a MinMaxSumCount + LastValue (I explain next why I am using 2 different "views").

If ValueRecorder has the MinMaxSumCount aggregator, why can't it simply hold the lastValue as well? What is the cost of that?

Here you are talking about an "available" Aggregator that we can offer in the SDK, and I agree that we can easily build an aggregator called MinMaxLastSumCount. If you don't mind, can you tell me which backend you have in mind that supports reporting MinMaxSumCount and LastValue/Gauge in only one metric? I don't know of any that supports this, so most likely you will end up with 2 metrics on the backend side.

When it comes to the "implementation" detail, I would say that users should be aware that they configure 2 different "views/metrics" (based on my observation that in all the backends these will be at least 2 different metrics), but internally we can "merge" these 2 metrics requested by the user into only one "Aggregator". A similar argument can be made if the user asks for "LastValue" + "Histogram": we can still have smart logic and a "combined" Aggregator. So I don't see why "MinMaxLastSumCount" is a special case compared to any other combination of 2 views/metrics.
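
A rough sketch of that idea: one internal state updated once per measurement, exported as two separate metrics. The `CombinedAggregator` class, the metric naming, and the CPU samples are all hypothetical and only illustrate the shape of the design.

```python
class CombinedAggregator:
    """Hypothetical single aggregator backing two user-configured views."""

    def __init__(self):
        self.minimum = float("inf")
        self.maximum = float("-inf")
        self.total = 0.0
        self.count = 0
        self.last = None

    def update(self, value):
        # One pass over the data serves both views.
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        self.total += value
        self.count += 1
        self.last = value

    def export(self, name):
        # Two metrics on the backend side: a gauge and a summary.
        gauge = {"name": name, "type": "gauge", "value": self.last}
        summary = {
            "name": name + "_summary",
            "type": "summary",
            "min": self.minimum,
            "max": self.maximum,
            "sum": self.total,
            "count": self.count,
        }
        return [gauge, summary]

cpu = CombinedAggregator()
for sample in [0.20, 0.35, 0.90, 0.40]:
    cpu.update(sample)

gauge_metric, summary_metric = cpu.export("cpu_usage")
```

The user still sees two metrics, matching how backends store them, while the SDK pays for only one update per measurement.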

And just to be clear I'm only talking about ValueRecorder to have MinMaxLastSumCount aggregator, for ValueObserver LastValue is perfectly fine - originally it had MinMaxSumCount which was the reason why all of this conversation started.

Happy that we agree on the ValueObserver.

Given the proposal in https://github.com/open-telemetry/opentelemetry-proto/pull/199, I'm getting confused about how we could represent Min, Max, Sum, and Count fields. There has been a motion to consider Histogram or DDSketch as defaults for ValueRecorder, to which I'd add that an Exact aggregation (i.e., exporting raw data points) would also work, but that's a performance problem we'd like to avoid as a default.

@bogdandrutu thx for detailed explanation it makes sense and I'm glad that ValueObserver will have LastValue only :)

I retitled this because we have wide agreement that the default aggregator for ValueObserver should be LastValue (i.e., produce Gauge data points).
