Cockroach: sql: support inverted index queries with != 1 paths

Created on 25 Jan 2018  Â·  13Comments  Â·  Source: cockroachdb/cockroach

Was left out of the initial PR, just wanted to make a tracking issue.

root@:26257/d> select * from j where b @> '{"a":1}' and b @> '{"b":2}';
+---+
| b |
+---+
+---+
(0 rows)
root@:26257/d> select * from j where b @> '{"a":1, "b":2}';
pq: constraints = , table ID = 52, index ID = 2: trying to use multiple keys in index lookup

also true for objects with 0 paths

root@:26257/d> select * from j where b @> '{}';
pq: constraints = , table ID = 52, index ID = 2: can't look up empty JSON
A-sql-json A-sql-optimizer C-bug S-3-productivity

All 13 comments

I think at the very least we could improve the user-facing error message here:

pq: constraints = , table ID = 53, index ID = 2: can't look up empty JSON

Right now as is, select * from j where b @> '{"a":1}' and b @> '{"b":2}' should actually work as a filter.

root@:26257/db> explain (verbose) SELECT * from d where b @> '{"a":"b"}' and  b @> '{"b":"c"}';          
+------------+-------+------------+--------+-----------------------------------------+-----------------+--------------------------+
|    Tree    | Level |    Type    | Field  |               Description               |     Columns     |         Ordering         |
+------------+-------+------------+--------+-----------------------------------------+-----------------+--------------------------+
| index-join |     0 | index-join |        |                                         | (a, b)          | b=CONST; a!=NULL; key(a) |
|  ├── scan  |     1 | scan       |        |                                         | (a, b[omitted]) | b=CONST; a!=NULL; key(a) |
|  │         |     1 |            | table  | d@foo_inv                               |                 |                          |
|  │         |     1 |            | spans  | /!NULL/"a"/"b"-/!NULL/"a"/"b"/PrefixEnd |                 |                          |
|  └── scan  |     1 | scan       |        |                                         | (a, b)          |                          |
|            |     1 |            | table  | d@primary                               |                 |                          |
|            |     1 |            | filter | b @> '{"b":"c"}'                        |                 |                          |
+------------+-------+------------+--------+-----------------------------------------+-----------------+--------------------------+
(7 rows)

This isn't very inefficient because it'll filter everything retrieved from the first inverted index lookup to match the other parts of the AND. The question is should we do inverted index lookups on every part of the query? I can also normalize b @> '{"a":1, "b":2}' into b @> '{"a":1}' and b @> '{"b":2}' so we can make use of this.

This could be made more efficient by #23520.

Thanks for info, I'll add it in once it's implemented!

On Thu, Mar 8, 2018 at 5:52 PM, Jordan Lewis notifications@github.com
wrote:

This could be made more efficient by #23520
https://github.com/cockroachdb/cockroach/issues/23520.

—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
https://github.com/cockroachdb/cockroach/issues/21786#issuecomment-371652688,
or mute the thread
https://github.com/notifications/unsubscribe-auth/Ab6W55AdkXnG-aVrLVOKNMa3LFGwvsBYks5tcbZBgaJpZM4RtB_B
.

2c on prioritization (I see it's under the later bucket): I was just doing normal testing for a demo and ran into this pretty quickly! Good to know it's already being tracked cc @awoods187

@nstewart what did you run into exactly? These queries should be supported, just slow.

This works fine (returns true):

select '{"color": "red", "model": "Kona"}'::jsonb @> '{"color": "red", "model": "Kona"}'

This fails:

> select * from vehicles where ext @> '{"color": "red", "model": "Kona"}'; 
pq: logicalSpans = [[/'{"color": "red", "model": "Kona"}' - /'{"color": "red", "model": "Kona"}']], table ID = 51, index ID = 2: trying to use multiple keys in index lookup

Okay, I didn't realize. This should really get done by 2.0 if possible - at the very least the error message needs to be improved so that it's obvious what's going wrong and how to fix it.

@jordanlewis what is expected for 2.0 in this issue? Normalizing b @> '{"a":1, "b":2} into b @> '{"a":1} AND b @> '{"b":2}? (I thought we already did that?)

@RaduBerinde yes, exactly what you said. I also thought we already did that, but we don't. That's what I added in my patch. If there's a simpler way please chime in on that PR!

This is now fixed on release-2.0 only. It needs to be forward-ported to master.

Ping to @justinj, just want to make sure this doesn't fall under the radar for the optimizer team as it represents a regression. Actually, we probably want to get this done before the next alpha if possible. I'm happy to help on this if you point me where to look.

Not sure what this issue really refers to any more—we can handle inverted index queries like this properly now, but doing it fast will require hardening out the zigzag join support. I've opened #35690 to track getting zigzags on by default, but given the scope of this issue I think it can be closed.

Was this page helpful?
0 / 5 - 0 ratings