InfluxDB: Question: Upper limit for tags

Created on 27 Jul 2015 · 11 comments · Source: influxdata/influxdb

I have a small question. Is there an upper limit to how many tags we can add in InfluxDB?

All 11 comments

From the docs

As a rule of thumb, keep tag cardinality below 100,000. The limit will vary depending on the resources available to InfluxDB, but it is best to keep tag cardinality as low as possible. If you have a value in your data with high cardinality, it should probably be a field, not a tag.

The 100k number is very rough. If you stay below that you should be fine. With tag cardinality in the millions, schema and query design become more important, as it becomes easier to create poor-performance situations.
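As a rough sketch of what "cardinality" means in practice (the numbers below are hypothetical, not InfluxDB internals): the worst-case series cardinality of a measurement is the product of the distinct values of each tag key, which is why a single high-cardinality tag dominates everything else:

```python
# Hypothetical illustration: worst-case series cardinality is the product
# of the number of distinct values for each tag key in a measurement.
tag_value_counts = {
    "host": 100,        # 100 servers
    "region": 5,        # 5 regions
    "user_id": 50_000,  # high-cardinality value -- better stored as a field
}

series = 1
for key, count in tag_value_counts.items():
    series *= count

print(f"worst-case series cardinality: {series:,}")  # 25,000,000
# Dropping user_id to a field leaves only 100 * 5 = 500 series.
```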

We will document this more extensively as performance testing matures.

An extension to the question: is there a limit to the length of a tag?

Not really. There used to be a 64k limit but that was removed with the TSM engine. I suspect 4GB would be the limit now, but anything above a few thousand KB seems like a bad idea, just for throughput concerns.

Remember, the full uncompressed tag set lives in memory as the index. No better way to chew up RAM than with 10KB tag names and values.
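A back-of-envelope sketch of that RAM concern (the per-series sizes below are assumptions for illustration, not the actual TSM index layout):

```python
# Very rough lower bound on in-memory index size: raw tag key/value bytes
# per series, ignoring per-entry overhead (assumed figures, not TSM internals).
def index_estimate_bytes(num_series, tags_per_series, avg_key_len, avg_value_len):
    return num_series * tags_per_series * (avg_key_len + avg_value_len)

# 1M series, 3 tags each, short keys/values (~8 and ~16 bytes)
print(index_estimate_bytes(1_000_000, 3, 8, 16) / 1e6, "MB")           # ~72 MB

# same series count, but 10 KB tag names and values
print(index_estimate_bytes(1_000_000, 3, 10_000, 10_000) / 1e9, "GB")  # ~60 GB
```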

Thanks @beckettsean

Sorry to add to this issue, but by tag cardinality does that include both tag keys and tag values or just one or the other?

@elvarb usually we mean a single tag key with many tag values as the cardinality of a tag. As mentioned above, this is very much just a rule of thumb and is increasingly less relevant.

At the moment we're working on https://github.com/influxdata/influxdb/issues/7151 which will remove these types of restrictions.

@desa thanks for the info. I have been testing InfluxDB for tracking logs, using tags for the log metadata. It's extremely promising, and I'm glad to hear that this potential problem will be removed in the future.

Eek, we just ran into this. Given that retention policies can't go shorter than an hour, we are eating all available RAM (16 GB) within the hour with (fairly) unique tag sets of source/destination IP/port plus volume counters (Cisco NetFlow logging). Aggregating doesn't help because the initial data alone is killing us, let alone our desire to archive aggregated data. We've got a stress-test Python script that simulates our load, in case it's interesting.

-i
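To make the NetFlow case above concrete, here is a small sketch (with made-up flow rates) of why per-flow source/destination IP/port tags translate almost directly into new series:

```python
# Hypothetical NetFlow figures: with source/destination IP and port as tags,
# nearly every flow introduces an unseen tag set, i.e. a brand-new series.
flows_per_second = 5_000
retention_window_s = 3600      # shortest retention policy: one hour
unique_fraction = 0.8          # share of flows with a not-yet-seen 4-tuple

new_series_per_hour = int(flows_per_second * retention_window_s * unique_fraction)
print(f"~{new_series_per_hour:,} new series per hour")  # ~14,400,000
# Moving the 4-tuple (or at least the ports) into fields keeps the series
# count bounded by the number of exporters/interfaces instead of flow volume.
```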

Data newb here. This has been a useful issue to read, but can someone explain cardinality in terms of InfluxDB tags?

