Cockroach: sql: support expression-based index columns

Created on 3 Oct 2016  路  13Comments  路  Source: cockroachdb/cockroach

Computed indexes are indexes on the result of an expression (as opposed to the direct value of a column). For example: CREATE INDEX a ON customer( LOWER(firstname) || ' ' || LOWER(lastname))

Suggested by @petermattis

A-sql-encoding A-sql-semantics C-enhancement C-performance X-anchored-telemetry

Most helpful comment

I'm trying out Craft CMS, a popular PHP CMS built with the Yii framework, and am hitting this.

CREATE INDEX "elements_sites_uri_siteId_idx" ON "elements_sites" (lower("uri"), "siteId")

All 13 comments

Would help with #7186 too.

This is one of the top unimplemented errors encountered by users

Also requested by @rkruze in #30690:

CREATE UNIQUE INDEX IF NOT EXISTS table_unique_idx
ON test_table
USING btree
((data->>'code'), (data->>'name'), (data->>'email_address')) where data->>'is_active'= 'true';

petition

We are hitting this also, some advance on this would be appreciated.

I'm trying out Craft CMS, a popular PHP CMS built with the Yii framework, and am hitting this.

CREATE INDEX "elements_sites_uri_siteId_idx" ON "elements_sites" (lower("uri"), "siteId")

Moving to Optimizer backlog, cc @RaduBerinde @mgartner

This is tracked by #24455.

I'm sorry you can't close an issue that's X-anchored-telemetry until the feature is complete and references to the issues in the code have been removed.

Alternatively you can send a PR that changes the references in the code from #9682 to #24455 then close this. (But even then it feels icky given the refernces in previous versions)

Copying over comment some recent info from #24455 below.


I did a bit of research of what work would be necessary to support expression-based indexes:

  • opt/cat interface and sql descriptor changes - need a way to refer to expressions as index "columns"
  • represent scans of expression-based indexes in opt (need "fake" columnids for constraints)
  • opt operator statistics
  • exploration rules for scans - need to recognize instances of the index expressions in filters
  • mutation paths - need to plumb values for the expressions when indexes are updated
  • schema change / backfiller support

Some aspects would be simplified if we treated the expressions as virtual columns (computed but not stored) throughout the stack, but other aspects would become more complicated (it would be the first non-scannable table column - and the code around different kinds of table columns is already very complex).

Overall the amount of work is on the same order of magnitude as partial indexes.

@RaduBerinde any updates on this?

We need to apply an expression-based unique index to ensure unicity over a JSON field like, in PostgreSQL you could do:

CREATE UNIQUE INDEX test_data_foo_uq ON test ((data->>'foo'))

+info: https://dba.stackexchange.com/questions/161313/creating-a-unique-constraint-from-a-json-object/161345#161345

But in CRDB, until this is solved, we are planning to replicate the columns in the parent tablet by using a computed field:

CREATE TABLE test (
    id INT, 
    data JSONB, 
    foo STRING AS ((data->>'foo')) STORED
    );

Then apply the UQ over it:

CREATE UNIQUE INDEX test__foo__uq ON test(foo);

@lopezator We've begun implementing expression-based indexes, but the feature won't be released until the 21.1 release at the earliest (Spring 2021). Until then, I think the workaround you mentioned is your best option.

Was this page helpful?
0 / 5 - 0 ratings