Cockroach: COUNT(*) on empty table in GROUP BY is different from Postgres

Created on 26 Feb 2020  Â·  10Comments  Â·  Source: cockroachdb/cockroach

Consider the following statements:

CREATE TABLE t0(c0 INT);
CREATE VIEW v0(c0) AS SELECT COUNT_ROWS() FROM t0 GROUP BY 0+0;
SELECT * FROM v0; -- expected: {}, actual: {0}

Unexpectedly, the query fetches a row, even though t0 does not contain any rows. Using COUNT(*) on v0 works as expected:

SELECT COUNT(*) FROM v0; -- {0}

I found this issue based on commit 3bb183403b4bc75c25fbcfc69d7d35de76d2b984.

C-bug blocks-randomized-testing

All 10 comments

Hmm, this doesn't seem like a bug with views - it seems like a general bug with count and aggregate. Changing the title to reflect.

CockroachDB:

jordan=# CREATE TABLE t0(c0 INT);
CREATE TABLE
jordan=# select count(*) from t0 group by 0+0;
 count
-------
(0 rows)

Postgres:

[email protected]:53487/defaultdb> CREATE TABLE t0(c0 INT);
CREATE TABLE

[email protected]:53487/defaultdb> select count(*) from t0 group by 0+0;
  count
---------
      0
(1 row)

I was looking at this as well -- it seems like our count operator always returns 1 row, regardless of the grouping columns matching anything?

(also, thanks as usual for reporting, @mrigger!)

We tried handling this behavior when implementing COUNT for the vectorized engine: #39007

It seems like the 0+0 might actually be part of the problem here.

In CockroachDB:

# correct
root@localhost:26257/defaultdb> select count(*) from t0 group by c0;
  count
---------
(0 rows)

# correct
root@localhost:26257/defaultdb> select count(*) from t0 group by 0;
ERROR: GROUP BY position 0 is not in select list
SQLSTATE: 42P10

# incorrect
root@localhost:26257/defaultdb> select count(*) from t0 group by 0+0;
  count
---------
      0
(1 row)

# maybe useful
root@localhost:26257/defaultdb> explain select count(*) from t0 group by 0+0;
    tree    |    field    | description
------------+-------------+---------------
            | distributed | true
            | vectorized  | false
  group     |             |
   │        | aggregate 0 | count_rows()
   └── scan |             |
            | table       | t0@primary
            | spans       | ALL

Also, it seems like the behavior is the same with both vectorized=auto and experimental_on

(And same with vectorize=off, just for completeness)

I think it's a bug when grouping by the set of empty columns. That's equivalent to saying there's 1 group containing all rows in the table (or 0 groups if there are 0 rows in the table).

This was supposed to be fixed by plumbing the isScalar flag through execution. There is a remaining case where the old len(groupCols) == 0 is used: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/rowexec/aggregator.go#L312

For posterity, slack discussion with some more info: https://cockroachlabs.slack.com/archives/C04U1BTF8/p1582755388057200

Unfortunately, this is also a problem in vectorized. It may be a more general problem there, I don't see any isScalar flag passed to NewHashAggregator.

Was this page helpful?
0 / 5 - 0 ratings