Consider the following statements:
CREATE TABLE t0(c0 INT);
CREATE VIEW v0(c0) AS SELECT COUNT_ROWS() FROM t0 GROUP BY 0+0;
SELECT * FROM v0; -- expected: {}, actual: {0}
Unexpectedly, the query fetches a row, even though t0 does not contain any rows. Using COUNT(*) on v0 works as expected:
SELECT COUNT(*) FROM v0; -- {0}
I found this issue based on commit 3bb183403b4bc75c25fbcfc69d7d35de76d2b984.
Hmm, this doesn't seem like a bug with views - it seems like a general bug with count and aggregate. Changing the title to reflect.
CockroachDB:
jordan=# CREATE TABLE t0(c0 INT);
CREATE TABLE
jordan=# select count(*) from t0 group by 0+0;
count
-------
(0 rows)
Postgres:
[email protected]:53487/defaultdb> CREATE TABLE t0(c0 INT);
CREATE TABLE
[email protected]:53487/defaultdb> select count(*) from t0 group by 0+0;
count
---------
0
(1 row)
I was looking at this as well -- it seems like our count operator always returns 1 row, regardless of the grouping columns matching anything?
(also, thanks as usual for reporting, @mrigger!)
We tried handling this behavior when implementing COUNT for the vectorized engine: #39007
It seems like the 0+0 might actually be part of the problem here.
In CockroachDB:
# correct
root@localhost:26257/defaultdb> select count(*) from t0 group by c0;
count
---------
(0 rows)
# correct
root@localhost:26257/defaultdb> select count(*) from t0 group by 0;
ERROR: GROUP BY position 0 is not in select list
SQLSTATE: 42P10
# incorrect
root@localhost:26257/defaultdb> select count(*) from t0 group by 0+0;
count
---------
0
(1 row)
# maybe useful
root@localhost:26257/defaultdb> explain select count(*) from t0 group by 0+0;
tree | field | description
------------+-------------+---------------
| distributed | true
| vectorized | false
group | |
│ | aggregate 0 | count_rows()
└── scan | |
| table | t0@primary
| spans | ALL
Also, it seems like the behavior is the same with both vectorized=auto and experimental_on
(And same with vectorize=off, just for completeness)
I think it's a bug when grouping by the set of empty columns. That's equivalent to saying there's 1 group containing all rows in the table (or 0 groups if there are 0 rows in the table).
This was supposed to be fixed by plumbing the isScalar flag through execution. There is a remaining case where the old len(groupCols) == 0 is used: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/rowexec/aggregator.go#L312
For posterity, slack discussion with some more info: https://cockroachlabs.slack.com/archives/C04U1BTF8/p1582755388057200
Unfortunately, this is also a problem in vectorized. It may be a more general problem there, I don't see any isScalar flag passed to NewHashAggregator.
Ah, got it, it's a copy of the same condition: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/colexec/execplan.go#L643