These points are a regression vs. Prometheus:
inf value in a metric, neighboring values will be truncated to an integer (presumably due to the compression algorithm).+/-inf in the storage with the corresponding max or min integerEspecially (1) is a critical bug which has caused severe precision loss in our data corpus.
Write these metric points:
foo 1.5
foo 2.33
Confirm raw points with a query:
http://localhost:8428/api/v1/query_range?query=foo[2h]&start=2020-09-08T03:00:00Z
{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"foo"},"values":[[1599540900,"1.5"],[1599541200,"2.33"]]}]}}
Write inf value to series:
foo inf
Note that inf is represented by max integer, and that previous datapoints have been truncated to integer:
{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"foo"},"values":[[1599540900,"1"],[1599541200,"2"],[1599541500,"2"],[1599541800,"2"],[1599542100,"9223372036854776000"]]}]}}
inf should be preserved as inf in the storage and queriesinf shouldn't affect dynamic range of compressionIn Prometheus case, the inf datapoint seems to become NaN, so it isn't graphed.
Perhaps the compression algorithm should have some sanity bounds on minimum precision.
prometheus does not treat inf / NaN specially: https://www.robustperception.io/get-thee-to-a-nannary
victoria-metrics-20200815-125320-tags-v1.40.0-0-ged00eb3f3
@belm0 , could you try VictoriaMetrics from the latest commit in master branch? ( currently 62fde80490af6d4ab4acee7ce75e098334f37ae3 ). It should ignore newly ingested inf values. This should prevent from precision loss for already stored data after writing inf values.
See build instructions.
FYI, the commit mentioned above has been included in v1.41.0.
It should ignore newly ingested inf values
@valyala respectfully, I disagree with VM ignoring Inf and NaN.
These values are useful for representing exceptional states within a series, and are supported in Prometheus as first class values for TSDB storage and query operators. They are standardized sentinel values that propagate through math operators in important ways that ad-hoc sentinels cannot (like using 9,999,999 to mean "big number", or -1 to mean "error").
I understand that a Grafana graph does not display these values, but that doesn't imply they are useless. Simple examples:
NaN, and we can graph the count or rate of NaN values in the series. And malfunctioning is distinct from the host device not being online for scraping, so NaN should not be conflated with the absence of a sample in the TSDB.Inf if there is no reflection. We don't want to represent this case as "no data" because that would drop important information, nor as an arbitrarily large number, because that would skew a naive statistics query in subtle ways. (In contrast, the output of a query operating on Inf values will be obviously wrong and prompt the author to correctly cover the Inf case.)VM is advertised as working with existing Prometheus data and queries. That should apply beyond the superficial. There are Prometheus users employing NaN and Inf, and it will be a sore spot for their migrations to VM.
Agreed that NaN and Inf values could be useful in certain cases. Unfortunately it may be very hard to add proper support for these values without significant changes in data encoding layer. This is unfortunate trade-off between encoding efficiency and supporting corner case values.
Let's start with documenting this behavior - see 0bccb58e802298348ac179dd055ff5288b76557c .
Let's keep this issue open in the hope a solution could be found eventually for proper handling of corner-case values at data encoding layer.
It looks like the solution for proper storing of Inf values has been discovered and implemented in the commit 26115891dba851ff8c98a222ef528b56a116ab20 . Now Inf values shouldn't lead to precision loss for nearby samples in the same time series.
This is unfortunate trade-off between encoding efficiency and supporting corner case values.
It's more fair to say that some users may not currently use Inf or NaN. But for those who use it, I don't think it's considered a corner case (see examples I gave). Prometheus limits values to numbers only, so these values can be significant for conveying more information in a type-safe way that propagates across operators.
It looks like the solution for proper storing of Inf values has been discovered and implemented in the commit 2611589
I couldn't tell from the commit. Will it be possible to query on Inf, like count(foo == Inf)?
I couldn't tell from the commit. Will it be possible to query on Inf, like count(foo == Inf)?
It seems to be working in the same way as in Prometheus. But, to be honest, I'm surprised with this, since, according to math rules, the result of inf == inf is undetermined.
I'm surprised with this, since, according to math rules, the result of inf == inf is undetermined
Are you confusing Inf and NaN? Inf is a valid value to use in equality and mathematical operations. NaN is not because, by definition, it's not a number.
$ python
>>> from math import inf, nan, exp
>>> inf == inf
True
>>> abs(-inf) == inf
True
>>> 1/inf
0.0
>>> exp(inf)
inf
>>> nan == nan
False
>>> exp(nan)
nan
FYI, VictoriaMetrics should properly store inf values starting from v1.41.1. Unfortunately this wasn't mentioned in release notes (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/789 ).
Are you confusing Inf and NaN? Inf is a valid value to use in equality and mathematical operations. NaN is not because, by definition, it's not a number.
yes, thanks for the pointer. I confused infinity with their applicability in limit calculations.
It looks like the solution for proper storing of Inf values has been discovered and implemented in the commit 2611589 . Now Inf values shouldn't lead to precision loss for nearby samples in the same time series.
The fix actually made things worse: NaN rather than truncation #805
Unit testing seems to be inadequate or at the wrong level?
Unit testing seems to be inadequate or at the wrong level?
Specifically, I suspect that the unit tests do not incorporate merge operations.
Specifically, I suspect that the unit tests do not incorporate merge operations.
Yes, unit tests didn't cover the case from #805 during merge operation. It is now covered in the commit bc7d67cee22d54ef389b258a21e0f0977b987afa .
Most helpful comment
Agreed that
NaNandInfvalues could be useful in certain cases. Unfortunately it may be very hard to add proper support for these values without significant changes in data encoding layer. This is unfortunate trade-off between encoding efficiency and supporting corner case values.Let's start with documenting this behavior - see 0bccb58e802298348ac179dd055ff5288b76557c .
Let's keep this issue open in the hope a solution could be found eventually for proper handling of corner-case values at data encoding layer.