Proposal: Add a SHOW CARDINALITY query to the database. This should be something that can be scoped by database, measurement, tag key, or tag key/value pair. Each one of these queries could be delivered separately.
These are just some ideas to get the conversation started:
-- number of series in a database
SHOW SERIES CARDINALITY
-- number of series for every measurement in a database
SHOW SERIES CARDINALITY GROUP BY measurement
-- number of series in a measurement
SHOW SERIES CARDINALITY FROM "<measurement>"
-- number of series with a given tag key/value pair
SHOW SERIES CARDINALITY WHERE "<key>" = '<value>'
-- number of series from a measurement with a given tag key/value pair
SHOW SERIES CARDINALITY FROM "<measurement>" WHERE "<key>" = '<value>'
-- number of measurements in a database
SHOW MEASUREMENT CARDINALITY
-- number of tag keys in a database
SHOW TAG KEY CARDINALITY
-- number of tag keys in a measurement
SHOW TAG KEY CARDINALITY FROM "<measurement>"
-- number of fields in a database
SHOW FIELD KEY CARDINALITY
-- number of fields in a measurement
SHOW FIELD KEY CARDINALITY FROM "<measurement>"
-- number of unique tag values
SHOW TAG VALUE CARDINALITY WITH KEY = '<tag key>'
-- number of unique tag values in a measurement
SHOW TAG VALUE CARDINALITY FROM "<measurement>" WITH KEY = '<tag key>'
The top two queries are probably the most important and easiest to wire up to begin with.
Use case: As a troubleshooting method this will help users determine which measurements, databases, or tags are causing problems. For users building out data exploration UIs, they'll be able to show cardinality numbers before measurements or tag keys are expanded (kind of like faceted search).
For users monitoring their databases, they would be able to use this functionality to trigger alerts if for some reason the cardinality of a measurement, database or tag key gets too high.
I like the idea and the syntax works for me. I would perhaps extend it to include SHOW DATABASE CARDINALITY as some users have hundreds of databases.
There's been a long-standing request for returning DISTINCT tag values (#3880), so if SHOW TAG VALUE CARDINALITY is implemented, perhaps we could also implement a SHOW DISTINCT TAG VALUES query as well?
@beckettsean I think SHOW DISTINCT TAG VALUES already exists as SHOW TAG VALUES WITH KEY = '<key>'
From reading that issue, what it looks like is they actually want COUNT(DISTINCT(...)) which would be the same as what SHOW TAG VALUE CARDINALITY would return.
@pauldix you are correct, it's already possible to get the DISTINCT via the show query, and the SHOW CARDINALITY would give the COUNT(DISTINCT()) equivalent. Ignore me!
+1, definitely required and a potentially mighty feature.
We are saving lots of sensor information and want to display counts of the tags in grafana, such as:
+1 - this would be an extremely powerful feature.
Maybe InfluxDB should implement schema discovery via standard SQL, as some RDBMS do.
MySQL implemented this years ago using the read-only information_schema views, in addition to the usual "SHOW ..." commands.
Interesting info, section "INFORMATION_SCHEMA as Alternative to SHOW Statements" : https://dev.mysql.com/doc/refman/5.7/en/information-schema.html
+1
This would be a very welcome addition! My current solution to monitor the number of series in a few important measurements involves a script calling "SHOW SERIES FROM XXX" and doing an external count, which unfortunately seems to randomly kill my Influx daemon at this point. (counts are over 150k) Would love to see this to help keep an eye on things at a more granular level.
Could we also have these functionality group by time?
For example finding unique tags with 1m,5m time window
+1
@beckettsean where are we with this feature? I see #3760 also asking for similar feature, and was told this issue is tracking it. I also see #8636 trying to add the same queries that @pauldix mentioned above. But none of them work in latest v1.3.7 [2017-10-26].
Not seeing it working as well.
this feature is in v.1.4. It is not part of v1.3.x
If it is v1.4 where is the docs for it?
found it:
https://docs.influxdata.com/influxdb/v1.2/troubleshooting/frequently-asked-questions/#how-can-i-query-for-series-cardinality
@maxp-edcast I think this is what you're after, your link referred to v1.2 and this was implemented in 1.4. https://docs.influxdata.com/influxdb/v1.5/query_language/spec/#show-cardinality
@gaddman -- you are correct!! Thanks for pointing out the update in the query language.
The 1.2 reference is the "old" way. It was more of an estimate and had some challenges in terms of accuracy -- particularly with the enterprise edition.
Most helpful comment
+1, definitely required and a potentially mighty feature.
We are saving lots of sensor information and want to display counts of the tags in grafana, such as:
...
but all of these are tags unfortunately.