System info: [Include InfluxDB version, operating system name, and other relevant details]
InfluxDB 0.12.2 on Ubuntu 14.04, fresh install
Steps to reproduce:
DISTINCTExpected behavior: [What you expected to happen]
Return from DISTINCT(<string>) would be the number of distinct strings.
Actual behavior: [What actually happened]
Return was null.
Additional info: [Include gist of relevant config, logs, etc.]
> show field keys from a
name: a
-------
fieldKey
string
value
> select * from a
name: a
-------
time string value
1461695896546713026 12 12
1461856121749763139 11 11
1461856127357529714 10 10
1461856128790092843 10 10
1461856129686020379 10 10
1461856131452838575 11 11
1461856136245585327 13 13
> select distinct(value) from a
name: a
-------
time distinct
0 12
0 11
0 10
0 13
> select distinct(string) from a
> select count(string) from a
name: a
-------
time count
0 7
> format json
> select distinct(string) from a
{"results":[{}]}
@jsternberg ^^
@jsternberg I created the data with the following:
> insert a value=12,string="12"
> insert a value=11,string="11"
> insert a value=10,string="10"
> insert a value=10,string="10"
> insert a value=10,string="10"
> insert a value=11,string="11"
> insert a value=13,string="13"
@beckettsean I'm unable to reproduce this with 0.12.2. I'm using one of the pending Docker images to ensure I have a clean slate.
> create database mydb
> use mydb
Using database mydb
> insert a value=12,string="12"
> insert a value=11,string="11"
> insert a value=10,string="10"
> insert a value=10,string="10"
> insert a value=10,string="10"
> insert a value=11,string="11"
> insert a value=13,string="13"
> select * from a
name: a
-------
time string value
1461857260244815974 12 12
1461857267066315206 11 11
1461857271181150496 10 10
1461857275369303495 10 10
1461857279493054542 10 10
1461857287331499229 11 11
1461857292976276325 13 13
> select distinct(value) from a
name: a
-------
time distinct
0 12
0 11
0 10
0 13
> select distinct(string) from a
name: a
-------
time distinct
0 12
0 11
0 10
0 13
> select count(string) from a
name: a
-------
time count
0 7
> format json
> select distinct(string) from a
{"results":[{"series":[{"name":"a","columns":["time","distinct"],"values":[[0,"12"],[0,"11"],[0,"10"],[0,"13"]]}]}]}
Can you confirm you're using the correct version?
On checking, my system is actually 0.12.1. Is there reason to believe this is fixed on 0.12.2?
# influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.12.1
InfluxDB shell 0.12.1
>
Still happens for me on 0.12.2:
# influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.12.2
InfluxDB shell 0.12.2
> use mydb
Using database mydb
> select distinct(value) from a
name: a
-------
time distinct
0 12
0 11
0 10
0 13
> select distinct(string) from a
>
Replicated on new data:
> insert b value=1,string="a"
> insert b value=2,string="b"
> insert b value=2,string="b"
> insert b value=1,string="a"
> insert b value=3,string="c"
> select * from b
name: b
-------
time string value
1461859961560359056 a 1
1461859965408355832 b 2
1461859966311601860 b 2
1461859967615202434 a 1
1461859972269952068 c 3
> select distinct(value) from b
name: b
-------
time distinct
0 1
0 2
0 3
> select distinct(string) from b
>
Not a syntax issue as far as I can tell:
> select distinct("string") from b
> select distinct('string') from b
ERR: error parsing query: expected field argument in distinct()
>
I spun up a brand new instance, installed 0.12.2, and DISTINCT operates as expected:
# influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.12.2
InfluxDB shell 0.12.2
> create database mydb
> user mydb
ERR: error parsing query: found USER, expected SELECT, DELETE, SHOW, CREATE, DROP, GRANT, REVOKE, ALTER, SET, KILL at line 1, char 1
Warning: It is possible this error is due to not setting a database.
Please set a database with the command "use <database>".
> use mydb
Using database mydb
> insert a number=1,string="a"
> insert a number=1,string="a"
> insert a number=2,string="b"
> insert a number=2,string="b"
> insert a number=3,string="c"
> select * from a
name: a
-------
time number string
1461860834329907105 1 a
1461860836969349116 1 a
1461860841620218718 2 b
1461860842520594123 2 b
1461860848010369577 3 c
> select distinct(number) from a
name: a
-------
time distinct
0 1
0 2
0 3
> select distinct(string) from a
name: a
-------
time distinct
0 a
0 b
0 c
I'm not sure what it is about my other installations that prevents this, but it appears that it cannot be reproduced on a brand new box.
I looked at the underlying data set and I think it's caused by a side-effect of a bad decision in the underlying query engine. If a shard returns no iterators for a query, it returns a fake float iterator. When casting is done, floats are given priority and all non-float iterators are discarded, making it appear like there's no data.
When selecting as a raw field, this doesn't happen since the work is done through auxiliary iterators which don't have this problem. I'll start working on a fix.
To be clear, this should happen with any aggregation function for anything lower than a float that has differing output types (it should not affect count()).