Currently, when a backfill is needed for a column being added or dropped, the column backfiller rewrites the existing primary index in-place to add or remove the column value. This has some disadvantages compared to the implementation of index backfills (for non-interleaved indexes):
ClearRanges (which is what the index GC job currently does). A column backfill can only be cleaned up by running the "reverse" column backfill.The proposal is to change the implementation of column backfills to build a new (primary) index every time, instead of mutating the existing one. There are 2 proposed steps in this change (which are somewhat independent):
Rolling back a column schema change, with this proposed approach, would essentially mean switching back to the pre-existing index. If we rolled back dropping a column, we still have all the column data as of the timestamp when the schema change was started.
This approach would also allow us to do ALTER TYPE in a single index backfill operation (see #46893).
Open questions:
Big 馃憤 .
Rolling back a column schema change, with this proposed approach, would essentially mean switching back to the pre-existing index. If we rolled back dropping a column, we still have all the column data as of the timestamp when the schema change was started.
Out of curiosity, when do we have to roll back dropping a column?
Out of curiosity, when do we have to roll back dropping a column?
The schema change can be cancelled, or another schema change started in the same transaction can fail (e.g., a unique index can't be built), in which case we roll everything back.
This would also fix https://github.com/cockroachdb/cockroach/issues/35738.
Out of curiosity, when do we have to roll back dropping a column?
The schema change can be cancelled, or another schema change started in the same transaction can fail (e.g., a unique index can't be built), in which case we roll everything back.
Got it. #46541 seems not good. Trying to roll back a partially dropped column seems fraught. I wonder if we can arrange to perform the drop column schema changes last. And then not allow them to be rolled back.
I wonder if we can arrange to perform the drop column schema changes last. And then not allow them to be rolled back.
We've talked about something like this in relatively vague terms. It seems like so long as we leave the dropped column public until all other mutations which can fail complete, and then make sure we remove all constraints before dropping a column, then make the drop column mutation something which cannot be canceled then we'll be out of this pickle.
Today we assume that the transaction which performs the drop sees the column in WRITE_ONLY for subsequent statements. My thinking is we should have that be true for that transaction but not actually commit the table descriptor in write only. This also gives us an opportunity to drop the constraints prior to making the dropped column non-public. That too is hazardous to roll back. My sense is as soon as we start dropping properties we need to make it impossible to roll back. In short:
1) Commit Time
I agree with all this. One thing to clarify: the new index that gets built in step (2) will exclude the columns being dropped, right? So if we have to roll back after validation at the end of step 2, we keep using the original index as though nothing happened. Otherwise, step (4) (which involves multiple table descriptor versions) consists of stepping through the states for the dropped column (but with no backfill needed), and the very last step is swapping to the new index.
This all sounds fantastic.
Will we collapse multiple related schema changes into a single step? If we add two columns to a table in a transaction, would we write a single new primary key index, or two. This relates to Lucy's point about dropping a column at the same time we're rewriting the primary index to add a column.
Will we collapse multiple related schema changes into a single step? If we add two columns to a table in a transaction, would we write a single new primary key index, or two.
I think we'll always be able to execute multiple schema changes started in the same transaction with just one rewrite of the primary index. That rewrite can encompass all column add/drop operations as well as changes to the primary key.
@ajwerner @rohany @RichardJCai and I discussed this in person.
First a few notes, then the main problem:
The main problem with this proposal ("the generalized index backfiller") is with interleaved tables: If an index has interleaved children, it's not possible to write an updated index with a new index ID without also rewriting all its interleaved descendants. Depending on whether we decide this would be prohibitively expensive, we have two options:
(NB: The interleaved ancestors of the table would not need to be rewritten. The index backfiller can already backfill secondary indexes interleaved into another table.)
Issues with option 1:
ALTER COLUMN TYPE will require 2 column backfills (as in the original RFC draft).Issues with option 2:
There are a few special performance concerns when backfilling an interleaved index compared to a non-interleaved index (applies to both options, but more so to (2)). We'll always be writing SSTs that overlap with existing keys in the span even if the new keys themselves are sorted, and there's some impact on traffic on the parent tables due to latch acquisition (not sure how significant this is). All this is already true for interleaved secondary indexes and interleaved ALTER PRIMARY KEY.
I think the next step is to figure out whether we can tolerate rewriting all the descendant tables for option (2).
Most helpful comment
@ajwerner @rohany @RichardJCai and I discussed this in person.
First a few notes, then the main problem:
The main problem with this proposal ("the generalized index backfiller") is with interleaved tables: If an index has interleaved children, it's not possible to write an updated index with a new index ID without also rewriting all its interleaved descendants. Depending on whether we decide this would be prohibitively expensive, we have two options:
(NB: The interleaved ancestors of the table would not need to be rewritten. The index backfiller can already backfill secondary indexes interleaved into another table.)
Issues with option 1:
ALTER COLUMN TYPEwill require 2 column backfills (as in the original RFC draft).Issues with option 2:
There are a few special performance concerns when backfilling an interleaved index compared to a non-interleaved index (applies to both options, but more so to (2)). We'll always be writing SSTs that overlap with existing keys in the span even if the new keys themselves are sorted, and there's some impact on traffic on the parent tables due to latch acquisition (not sure how significant this is). All this is already true for interleaved secondary indexes and interleaved
ALTER PRIMARY KEY.I think the next step is to figure out whether we can tolerate rewriting all the descendant tables for option (2).