Opening a feature request kicks off a discussion.
Requests may be closed if we're not actively planning to work on them.
__Proposal:__ [Description of the feature]
Support 64-bit unsigned integers
__Current behavior:__ [What currently happens]
ERR: {"error":"unable to parse 'test v=18446744073709551615i': unable to parse integer 18446744073709551615: strconv.ParseInt: parsing \"18446744073709551615\": value out of range"}
__Desired behavior:__ [What you would like to happen]
success
__Use case:__ [Why is this important (helps with prioritizing requests)]
Working with large numbers, such as with network traffic counters.
+1
Use case:
Working with large numbers, such as with RDBMS statistics counters.
+1
See also #7804
+1
uint64 support is a requirement for modern SNMP stats (Counter64), which is one of the oldest time-series use-cases I can think of. If rrdtool can handle it... :-)
Hi everybody.
We have solved/bypassed this problem in our SNMP collector client for InfluxDB (https://github.com/toni-moreno/snmpcollector) in two ways:
1) computing and sending only the difference (or rates, if desired) rather than the raw counter value
2) converting all cooked values to float64.
If you need to gather SNMP uint64 counters, you can test it (https://github.com/toni-moreno/snmpcollector/wiki).
Hi, Toni-moreno san.
You did great work, but I think it's just a workaround.
Does the workaround have to be implemented in every agent?
In some cases it is OK to lose small values, but some statistics need small values to be treated precisely.
```go
package main

import (
	"fmt"
)

func main() {
	var stat1 uint64 = 18446744073709551615
	var stat2 uint64 = 18446744073709551500
	fmt.Println(stat1)
	fmt.Println(stat2)
	fmt.Println(stat1 - stat2)
	fmt.Println(float64(stat1))
	fmt.Println(float64(stat2))
	fmt.Println(float64(stat1) - float64(stat2))
}
```
Result:
```
18446744073709551615
18446744073709551500
115
1.8446744073709552e+19
1.8446744073709552e+19
0 <- lost differential value!
```
Hi @orz--, obviously the float conversion is done after computing the difference, as you can see:
https://github.com/toni-moreno/snmpcollector/blob/master/pkg/data/metric/snmpmetric.go#L183
We are currently working with lots of different kinds of devices (storage, networking, proxying, firewalling) and everybody is happy with the agent.
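A minimal Go sketch of the difference-first-then-convert approach described above (the counter values are illustrative, taken from the earlier example):

```go
package main

import "fmt"

func main() {
	// Two successive readings of a uint64 counter near its maximum.
	var prev uint64 = 18446744073709551500
	var curr uint64 = 18446744073709551615

	// Subtract first, in uint64, where the arithmetic is exact.
	delta := curr - prev

	// Only then convert: the small delta fits in float64 with no loss.
	fmt.Println(delta)          // 115
	fmt.Println(float64(delta)) // 115
}
```

Converting only the delta avoids the collapse shown in the result above, where both raw readings round to the same float64 value.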
+1 for this. Losing accuracy when storing as a float64 is an issue for systems where the data is going to be used for accounting and billing. Workarounds like the one @toni-moreno uses in snmpcollector only work for single-collector deployments; as soon as you start horizontally scaling, you're making heavy use of a memory cache and/or reading back from InfluxDB to generate the delta values client-side.
Hi @ragzilla, @orz--, I disagree about what is a solution and what is a workaround.
Snmpcollector is not losing accuracy; it improves the way data is stored so that humans can understand and process it. IMHO snmpcollector does not offer a workaround, it offers a real solution with lots of new features.
After working with SNMP for some time, what I see is that 90% of the data are counters. If you store all of this data in the database, you will need to compute a non_negative_derivative for each series before any other analysis; this is unneeded load on the database, especially if you need to compute aggregations or comparisons between series.
Snmpcollector sends the database meaningful data that users can understand. It also lets you avoid unneeded CQs by doing simple evaluations, as you can see in the following pictures (hrStorageSize_bytes, hrStorageUsed_bytes, hrStorageUsed_bytes), or by transforming data (strings to float, in linux_load1m, linux_load5m, linux_load15m; BITS checking, as in platformVoltageState).

You can also filter down to just the data you really need. In the picture below you can see 3 filters: selecting only file systems (not other kinds of storage) when gathering data from HOST-RESOURCES-MIB, only physical disks when gathering from the UCD-DISKIO-MIB diskIOTable, and only getting data from ports when the port is UP.

You can also inspect in real time, in the runtime view, exactly what the collector is gathering, check that it is what you expect, and fix the config if not.

Once the important data is sent to InfluxDB, we delegate aggregation/filtering/ordering/combining to the database.
Right now we have one snmpcollector instance gathering all needed data from approx. 600 devices (switches, firewalls, routers, proxies, DNS, and some Linux-based appliances) and sending 300K metrics/minute to our production InfluxDB with only 1 GB of heap (we only need to store one value when a COUNTERXX type is selected).
@ragzilla @orz-- as you can see, snmpcollector doesn't compute differences as a workaround for the lack of unsigned 64-bit integer support in InfluxDB; it does so to extract meaningful data and cook it before storing it in the database.
Feel free to test it and post questions about how snmpcollector gathers data on our site: https://github.com/toni-moreno/snmpcollector.
Thank you very much.
I see
https://github.com/influxdata/influxdb/pull/8591
IMHO, this update is good news.
I need to store values from MySQL's Performance Schema and Oracle's data dictionary.
IMHO I would not consider conversion to float64 even a workaround, because you lose the guarantee of reading back the same numbers you put in. There are cases where you can accept the approximation, but that could be unacceptable in other cases. Also, as someone already said, we are just talking about 64-bit integers, not about creating some new and esoteric never-seen-before-in-computer-science feature.
Regarding computing differences, I agree it's just a workaround for a subset of use cases and not a broader solution. In general whenever I see some data collector computing a difference I try to avoid it because that means switching from a stateless to a stateful collector.
Example from the real world: we often put data into InfluxDB coming from distributed monitoring systems (e.g. Icinga2), where monitoring plugins can get executed on any node in a cluster and thus cannot rely on a central entity to keep their state. E.g. many plugins that read SNMP network interface counters save the actual counters on the local filesystem and compute differences, and when they get run on a new node you get weird data because you've lost the previous state.
That's one of the reasons we think it would be useful for us and for many other people to have native 64bit integers in InfluxDB.
This just bit us when moving from the PHP client to the Go one: we had typed our data correctly (memory as uint64), and it didn't work at all because the client was trying to insert strings into fields that were already integers.
Seems like a significant oversight to me.
@lesinigo IMHO, storing raw counter values in a database is not a natural, human way to store information.
Assume we are working with SNMP counters and communication devices.
Can you tell me what it means, exactly, that the ifHCInOctets counter on one interface of a router currently has a value of 18,446,744,073,709,551,000?
With this information alone you know nothing about the state of the interface. You need another value to get an idea of the input rate, so you query again after a while (say 1 minute) and get 18,446,744,073,709,551,614.
After gathering two values and computing one difference, you know that the interface has counted only 614 bytes of data. With a "stateless" agent, as you suggest, you need to store 2 values and do 1 derivative operation to get information about interface throughput.
If you have an agent able to compute differences, it will send the database value=614 as an integer (or 614.00 as a float), and you need to store only 1 value and compute 0 operations to get the real interface input throughput.
With this example, can you see any loss of information in this value?
And suppose the worst case: remember that a 64-bit float has 52 bits of mantissa (https://en.wikipedia.org/wiki/Double-precision_floating-point_format), which gives you at least 15 decimal digits (1,000,000,000,000,000). So you only lose precision when the difference between two gathering periods (1 min) is greater than roughly 1 petabyte of data.
Can you give me an example of a network device able to process more than one petabyte of data in 1 minute?
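A minimal Go sketch of where float64 precision actually runs out: 2^53 is the first point at which adjacent integers become indistinguishable (the petabyte figure above is a conservative decimal bound):

```go
package main

import "fmt"

func main() {
	// float64 has a 52-bit mantissa, so integers up to 2^53 are exact.
	const limit uint64 = 1 << 53 // 9007199254740992

	// Below the limit, adjacent integers stay distinct...
	fmt.Println(float64(limit-1) == float64(limit)) // false

	// ...at the limit, the next integer collapses onto it.
	fmt.Println(float64(limit) == float64(limit+1)) // true
}
```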
As I said before, computing differences gives you meaningful data without precision loss (at least in the communication-devices context), and saves you from running continuous derivative queries (80% of SNMP metrics are presented as counters).
Now assume we are working in another context (not communication devices), with differences greater than float64 precision. In that case it could perhaps be necessary to store raw unsigned 64-bit integers, and I agree that any SNMP agent should support 64-bit counters; in that context we can wait for InfluxDB to support them, because snmpcollector already does (https://github.com/toni-moreno/snmpcollector/wiki/Component:-SNMP-Metrics, with the Counter64 data-source type).
Looks like there's a PR up for this over at #8835
@phemmer We are working on this now. There will be a series of PRs. Actually, we added uint64 capability to the underlying storage some time ago. Now we are getting back to building out the wire protocol, query engine, response paths, tests.
What's the current status of uint64 support?
I notice this ticket is still open, and the line protocol reference still doesn't mention unsigned integers.
However the Telegraf data formats documentation says:
```toml
## When true, Telegraf will output unsigned integers as unsigned values,
## i.e.: `42u`. You will need a version of InfluxDB supporting unsigned
## integer values. Enabling this option will result in field type errors if
## existing data has been written.
# influx_uint_support = false
```
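With that option enabled, the failing write from the original report would presumably be sent with the unsigned `u` suffix instead of `i` (a sketch; the measurement and field names follow the example at the top of the thread):

```
test v=18446744073709551615u
```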
Does this mean that Telegraf can output uint64 values, but InfluxDB can't ingest them yet?
@candlerb InfluxDB supports uint64 but it's gated behind a build flag for now as it's not generally available. Telegraf support is somewhat mixed, however with the config flag it'll import/export uint line protocol.
Since I'm also planning to use InfluxDB to store 64-bit SNMP counter it would be helpful to know if there are any updates about this issue.
I also noticed that there is already a pull request (#8923) for the unsigned 64-bit integer support. Are there any plans that future InfluxDB versions will support unsigned 64-bit integer by default?
@ix-dev The comment above (https://github.com/influxdata/influxdb/issues/7801#issuecomment-392261484) is still accurate.
As ix-dev says, this is a MUST in network infrastructure monitoring. Any news on when it will be implemented?
@lucasbritos It's supported right now, see my comment in this issue you're reading right now.
@ragzilla sorry if I'm missing something, but is there any documentation on how to activate it, or which version it's available in?
It's been available since 1.4 (#8835).
You need to build the binary from source using -tags uint or -tags uint64 on your go build. You can also build packages by passing the build tag to the build script but I don't have the syntax offhand (though it looks to be --build-tags uint64 as an argument to build.py)
@ragzilla as #8835 points out, this has been frozen since 1.4 until the whole stack supports uint64.
I'm using TICK, so changing the InfluxDB build (for which I didn't find documentation) is just one part of the issue.
Should we expect any news at all? This is from two years ago.
I reopened https://github.com/influxdata/telegraf/issues/2237 as https://github.com/influxdata/telegraf/issues/5476 on Telegraf side.
Telegraf supports uint64 output as long as 'influx_uint_support = true' is set on the influxdb output. See influxdata/telegraf#3948. I'm not sure about Kapacitor or Chronograf.
Thank you for your support and patience, I didn't see it at first in this post. I finally managed to store uint64 in InfluxDB with Telegraf.
Just in case anyone is as lost as I was:
Telegraf: option influx_uint_support = true
InfluxDB: I forked the influxdb project and changed the branch-1.7 Dockerfile, adding "-tags uint64" to the go install run.
Docker hub repository
https://hub.docker.com/r/lucasbritos/influxdb
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.