Cockroach: sql: COUNT(*) returns odd results

Created on 23 Sep 2017  路  12Comments  路  Source: cockroachdb/cockroach

Version CockroachDB CCL v1.1-beta.20170907-120-g6f7e3b0

I backed up some data from reg and restored it on sapphire and did a quick sanity check count(*) and was surprised to see 4m rows. I tried to narrow the date range where the new rows appeared and instead found pretty much any WHERE clause got rid of them, even a NULL OR NOT NULL:

> SELECT count(*) FROM updates WHERE timestamp IS NULL OR timestamp IS NOT NULL;
+----------+
| count(*) |
+----------+
|  1436341 |
+----------+
(1 row)

Time: 2.381654597s

> SELECT count(*) FROM updates;
+----------+
| count(*) |
+----------+
|  4308600 |
+----------+
(1 row)

Time: 8.987759834s

root@localhost:26257/registration>

Most helpful comment

18524 is flawed because it doesn't account for tables with multiple column families. I'll submit a fix today, along with tests for COUNT(*) on tables with more than 1 column family, which apparently did not exist before.

All 12 comments

I tried SET DISTSQL = off; but it doesn't change anything.

@vivekmenezes for routing

Sounds like it could be related to #18524, but the version number you state doesn't seem to include that PR. Could you please double check to see whether the version you were testing with includes that PR?

@dt can you retry your test with the latest changes on master including #18702

Thanks!

root@localhost:26257/> set database = registration;
SET

Time: 888.017碌s

root@localhost:26257/registration> SELECT version();
+---------------------------------------------------------------------------------------------------+
|                                             version()                                             |
+---------------------------------------------------------------------------------------------------+
| CockroachDB CCL v1.1-alpha.20170817-1058-gfbcd8ca (linux amd64, built 2017/09/24 15:09:04, go1.9) |
+---------------------------------------------------------------------------------------------------+
(1 row)

Time: 578.011碌s

root@localhost:26257/registration> SELECT count(*) FROM updates;
+----------+
| count(*) |
+----------+
|  4308600 |
+----------+
(1 row)

Time: 13.667848111s

root@localhost:26257/registration> SELECT count(*) FROM updates WHERE timestamp IS NULL OR timestamp IS NOT NULL;
+----------+
| count(*) |
+----------+
|  1436341 |
+----------+
(1 row)

Time: 5.45897963s

root@localhost:26257/registration>

@arjunravinarayan can you follow up with @dt on this issue. Thanks!

FWIW this isn't a distsql issue as set distsql=off doesn't change the output as @dt mentioned above. ISTM this is unrelated to #18702.

Ok, an EXPLAIN helps to reveal what's going on here. COUNT is returning different results for a scan over the timestamp index vs the primary index.

root@:26257/registration> explain SELECT count(*) FROM registration.updates where timestamp is null or timestamp is not null;
+-------+--------+-------+-------------------------------+
| Level |  Type  | Field |          Description          |
+-------+--------+-------+-------------------------------+
|     0 | group  |       |                               |
|     1 | render |       |                               |
|     2 | scan   |       |                               |
|     2 |        | table | updates@updates_timestamp_idx |
|     2 |        | spans | ALL                           |
+-------+--------+-------+-------------------------------+
(5 rows)

root@:26257/registration> explain SELECT count(*) FROM registration.updates;
+-------+--------+-------+-----------------+
| Level |  Type  | Field |   Description   |
+-------+--------+-------+-----------------+
|     0 | group  |       |                 |
|     1 | render |       |                 |
|     2 | scan   |       |                 |
|     2 |        | table | updates@primary |
|     2 |        | spans | ALL             |
+-------+--------+-------+-----------------+
(5 rows)

On register itself, the index and the primary index don't match either.

root@localhost:26257/registration> SELECT count(*) FROM updates;
+----------+
| count(*) |
+----------+
|  1458683 |
+----------+
(1 row)
root@localhost:26257/registration> SELECT count(*) FROM updates@updates_timestamp_idx;
+----------+
| count(*) |
+----------+
|  1458828 |
+----------+
(1 row)

So it seems there are 2 problems here.

The original issue that @dt reported is caused by #18524. I'm not sure the cause yet but I'll be reverting it if I can't figure out a fix soon.

There's also the issue that the two indexes don't match in the source data. cc @vivekmenezes for possible cross-reference with #18533.

18524 is flawed because it doesn't account for tables with multiple column families. I'll submit a fix today, along with tests for COUNT(*) on tables with more than 1 column family, which apparently did not exist before.

Going to fork this issue for the remaining investigation of the second issue - why the primary index and secondary index do not match in size.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

xudongzheng picture xudongzheng  路  3Comments

magaldima picture magaldima  路  3Comments

bdarnell picture bdarnell  路  4Comments

petermattis picture petermattis  路  4Comments

couchand picture couchand  路  3Comments