__System info:__ InfluxDB version 1.5.4, Amazon linux 2018.03, i3.2xlarge instance type, DB size ~1.1TB
Upgraded InfluxDB from 1.4.3 to 1.5.4 (used `influx_inspect buildtsi` to rebuild the indexes). After that, `show tag values` queries became extremely slow for tags with high cardinality. After experimenting with different config parameters, it turned out that `auth-enabled = true` is what makes them slow.
__Steps to reproduce:__
With `auth-enabled = true`:

```
[root@ip-10-0-0-1 ~]# time influx -username 'foo' -password 'bar' -database baz -execute "show tag values from cpu with key = host;" > output

real	0m22.601s
user	0m0.186s
sys	0m0.136s
[root@ip-10-0-0-1 ~]# wc -l output
30369 output
```

With `auth-enabled = false`:

```
[root@ip-10-0-0-1 ~]# time influx -database baz -execute "show tag values from cpu with key = host;" > output

real	0m0.462s
user	0m0.194s
sys	0m0.112s
[root@ip-10-0-0-1 ~]# wc -l output
30369 output
```
__Expected behavior:__ I would expect that authentication settings do not affect query performance.
__Actual behavior:__ The query runs roughly 50x slower with auth enabled (22.6s vs 0.46s for the same result set).
__Additional info:__ Full config (`egrep -v "^\s*#|^$" /etc/influxdb/influxdb.conf`):

```toml
[meta]
  dir = "/var/lib/influxdb/meta"

[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"
  index-version = "tsi1"
  cache-max-memory-size = "512m"
  cache-snapshot-write-cold-duration = "10s"
  compact-full-write-cold-duration = "24h"
  max-index-log-file-size = "512k"
  max-series-per-database = 0
  max-values-per-tag = 1000000

[coordinator]
  write-timeout = "5s"
  query-timeout = "30s"

[retention]

[shard-precreation]

[monitor]

[http]
  auth-enabled = true
  pprof-enabled = true

[ifql]
  enabled = false

[logging]

[subscriber]

[[graphite]]

[[collectd]]

[[opentsdb]]

[[udp]]

[continuous_queries]
```
@tarvip thanks for the report. I think I know what's going on here.
My suspicion is that because auth is enabled, the `openAuthorizer` will not be in use.

There are lots of places in the index where we short-circuit routines based on the presence of an `openAuthorizer` or a `nil` one: anything that calls into `query.AuthorizerIsOpen`.

Rather than determining whether the authorizer is open or `nil`, we probably need to check whether the authorization allows reading the database in question; in that case `query.AuthorizerIsOpen` would return `true`. The signature would probably be changed to something like:

```go
Authorized(a Authorizer, database string)
```

One could then call into `a.AuthorizeDatabase`.
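For illustration, a minimal sketch of what such a helper could look like. The `Authorizer` interface, `openAuthorizer`, and `userAuthorizer` types here are simplified stand-ins for the real `query` package types, not InfluxDB's actual API:

```go
package main

import "fmt"

// Authorizer is a simplified stand-in for InfluxDB's query.Authorizer.
type Authorizer interface {
	// AuthorizeDatabase reports whether the caller may read the database.
	AuthorizeDatabase(database string) bool
}

// openAuthorizer allows everything, mirroring the "auth disabled" case.
type openAuthorizer struct{}

func (openAuthorizer) AuthorizeDatabase(string) bool { return true }

// userAuthorizer allows only the databases in its grant set.
type userAuthorizer struct{ grants map[string]bool }

func (u userAuthorizer) AuthorizeDatabase(db string) bool { return u.grants[db] }

// Authorized replaces the "is the authorizer open or nil?" check: index
// code can take the fast path whenever the authorizer permits reading the
// whole database, not only when auth is disabled entirely.
func Authorized(a Authorizer, database string) bool {
	if a == nil {
		return true // no auth configured
	}
	return a.AuthorizeDatabase(database)
}

func main() {
	fmt.Println(Authorized(nil, "baz"))
	fmt.Println(Authorized(openAuthorizer{}, "baz"))
	fmt.Println(Authorized(userAuthorizer{grants: map[string]bool{"baz": true}}, "baz"))
	fmt.Println(Authorized(userAuthorizer{grants: map[string]bool{}}, "baz"))
}
```

With this shape, a user granted read on the whole database would hit the same fast path as an open authorizer.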
Having a hard time reproducing this.

Setup:

```
$ inch -b 100000 -c 3 -t 100000 -f 10 -m 1 -p 1000 -v
```

```
> select count(*) from m0
name: m0
time count_v0  count_v01 count_v02 count_v03 count_v04 count_v05 count_v06 count_v07 count_v08 count_v09
---- --------  --------- --------- --------- --------- --------- --------- --------- --------- ---------
0    101311000 101299998 101299998 101299998 101299998 101299998 101299998 101299998 101299998 101299998

> show series exact cardinality
name: m0
count
-----
110002
```
```
$ time influx -database stress -execute 'show tag values from m0 with key = tag0' -username admin -password admin | wc -l
  100003

real	0m1.391s
user	0m0.669s
sys	0m0.751s
```
```
$ time influx -database stress -execute 'show tag values from m0 with key = tag0' -username admin -password admin | wc -l
  100003

real	0m2.247s
user	0m0.686s
sys	0m0.768s
```
@jacobmarble please try with:

```
$ inch -b 100000 -c 3 -t "10000,100,10" -f 10 -m 1 -p 10
```

For me it generated ~118 GB of data (generating the data took less than an hour).

Auth disabled:

```
$ time influx -database stress -execute 'show tag values from m0 with key = tag0' -username admin -password admin | wc -l
   10003

real	0m0.130s
user	0m0.117s
sys	0m0.034s
```

Auth enabled:

```
$ time influx -database stress -execute 'show tag values from m0 with key = tag0' -username admin -password admin | wc -l
   10003

real	0m13.676s
user	0m0.099s
sys	0m0.032s
```
Tested with InfluxDB version 1.6.1 on an AWS i3.2xlarge instance (NVMe SSD storage), OS: Amazon Linux 2 AMI.
Fixed via influxdata/influxdb#10200 and influxdata/plutonium#2760
This issue is still present. It is better now, but not completely fixed: everything works well with an authenticated admin user, but queries are still extremely slow for authenticated non-admin users.

Could you please reopen this issue? There is still a huge difference between queries executed as the admin user and as a non-admin user when working with high-cardinality measurements. This renders Grafana dashboards almost unusable when a non-admin user is configured in the data source.
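The timings below are consistent with the index taking a fast path only for admin-level authorizers (a single check), while a non-admin authorizer is consulted once per series. A self-contained sketch of that shape, purely illustrative; the types and method names here model the reported behaviour and are not InfluxDB's actual API:

```go
package main

import "fmt"

// Authorizer is an illustrative stand-in, not InfluxDB's real interface.
type Authorizer interface {
	AuthorizeDatabase(db string) bool               // coarse, database-level
	AuthorizeSeriesRead(db string, key string) bool // fine, per-series
}

type admin struct{}

func (admin) AuthorizeDatabase(string) bool           { return true }
func (admin) AuthorizeSeriesRead(string, string) bool { return true }

// grafanaUser models the reported behaviour: its database-level check does
// not short-circuit, so every series is checked individually.
type grafanaUser struct{}

func (grafanaUser) AuthorizeDatabase(string) bool           { return false }
func (grafanaUser) AuthorizeSeriesRead(string, string) bool { return true }

// countReadable mimics SHOW TAG VALUES walking a measurement's series.
// The per-series branch is what turns a ~2.2M-series measurement into a
// 20-second query for non-admin users.
func countReadable(a Authorizer, db string, keys []string) (n, checks int) {
	if a.AuthorizeDatabase(db) {
		return len(keys), 1 // fast path: one check covers everything
	}
	for _, k := range keys {
		checks++
		if a.AuthorizeSeriesRead(db, k) {
			n++
		}
	}
	return n, checks
}

func main() {
	keys := make([]string, 2_000_000)
	for i := range keys {
		keys[i] = fmt.Sprintf("interface,dev_hostname=sw-%07d", i)
	}
	n, c := countReadable(admin{}, "network_db", keys)
	fmt.Println("admin:", n, "series,", c, "auth checks")
	n, c = countReadable(grafanaUser{}, "network_db", keys)
	fmt.Println("non-admin:", n, "series,", c, "auth checks")
}
```

Both users see the same result set; the difference is one authorization check versus two million of them.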
__Test Results:__

```
$ time influx -username grafana -password $GRAFANA_PW -database network_db -execute "SHOW TAG values from interface with key=dev_hostname"

real	0m21.201s
user	0m0.096s
sys	0m0.056s

$ time influx -username admin -password $ADMIN_PW -database network_db -execute "SHOW TAG values from interface with key=dev_hostname"

real	0m1.533s
user	0m0.120s
sys	0m0.048s

$ influx -username admin -password $ADMIN_PW -database network_db -execute "SHOW SERIES CARDINALITY FROM interface"
name: interface
---------------
count
2183704
```
__Environment info:__
__Config:__

```toml
reporting-disabled = true
hostname = "my-host"

[meta]
  dir = "/mnt/local/storage/influx/meta"
  logging-enabled = true

[data]
  dir = "/mnt/local/storage/influx/data"
  wal-dir = "/mnt/local/storage/influx/wal"
  trace-logging-enabled = false
  query-log-enabled = true
  index-version = "tsi1"
  max-series-per-database = 0
  max-values-per-tag = 0

[coordinator]
  query-timeout = "120s"
  log-queries-after = "30s"

[retention]
  check-interval = "24h"

[shard-precreation]

[monitor]
  store-interval = "1h"

[http]
  bind-address = "localhost:8086"
  auth-enabled = true
  log-enabled = true
  https-enabled = false

[subscriber]

[[graphite]]

[[collectd]]

[[opentsdb]]

[[udp]]

[continuous_queries]
  run-interval = "5m"
```
@dgnorton didn't we look at this recently? I had a feeling the approach you had in mind wasn't effective when consumed by Enterprise, though?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
keep open