It is not currently possible to change a table's primary key, e.g. via
ALTER TABLE foo DROP CONSTRAINT old_pkey;
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols);
This becomes more important in 1.2 because the upcoming partitioning feature (#18683) will only allow partitioning by prefixes of the primary key. For example, suppose you want to partition this table by country:
CREATE TABLE users (
id INT PRIMARY KEY,
name STRING,
country STRING,
...
)
The table needs to have a composite primary key that lists country first:
CREATE TABLE users (
id INT,
name STRING,
country STRING,
...,
PRIMARY KEY (country, id)
) PARTITION BY LIST (country)...
This essentially requires that the schema be designed with partitioning in mind from the get-go until we can alter primary keys.
/cc @awoods187 for triage
@benesch, do you mean we should support SQL syntax like:
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols)
Or just change the primary key of table users from id to (id, country) when we try to add a partition for the table but the partitioning column is not in the primary key?
When we change the primary key of the specified table we have to reorganize its data, so this operation's overhead is quite expensive: each key changes from /TableID/IndexID/ColumnID/id to /TableID/IndexID/ColumnID/(id, country), each value changes in the same way, and the original key-value pairs are cleared.
@a6802739 I mean we should support arbitrary primary key changes like
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols)
Since changing a table's primary key is such an expensive operation, we definitely don't want to do it implicitly if the partitioning scheme specifies columns that are not in the primary key. Much rather error in that case, and let the user alter the pkey only if they so choose.
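To make the cost of the key rewrite described above concrete, here is a rough Python sketch. The key layout, table/index IDs, and the dict-as-KV-store are all simplified illustrations, not CockroachDB's actual key encoding.

```python
# Hypothetical sketch: rewriting every row from a primary key on (id) to one
# on (country, id). IDs and the key format are invented for illustration.

TABLE_ID = 51
OLD_INDEX_ID = 1  # old primary index, keyed on (id)
NEW_INDEX_ID = 2  # new primary index, keyed on (country, id)

def old_key(row):
    return f"/{TABLE_ID}/{OLD_INDEX_ID}/{row['id']}"

def new_key(row):
    return f"/{TABLE_ID}/{NEW_INDEX_ID}/{row['country']}/{row['id']}"

def rewrite(kv_store, rows):
    """Write every row under its new key and clear the original key-value."""
    for row in rows:
        kv_store[new_key(row)] = row      # write under the new primary key
        kv_store.pop(old_key(row), None)  # delete the old key-value pair
    return kv_store

rows = [{"id": 1, "country": "US"}, {"id": 2, "country": "DE"}]
store = {old_key(r): r for r in rows}
store = rewrite(store, rows)
```

The point of the sketch is that every row in the table is touched once, which is why the operation is O(table size) and shouldn't happen implicitly.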
Note that traditionally changing the pkey on a table with an existing pkey has required two steps:
ALTER TABLE foo DROP CONSTRAINT foo_pkey;
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols);
That's going to be needlessly expensive in Cockroach. Dropping the primary key requires creating a hidden rowid column to use as the implicit pkey (this already happens when tables without a primary key are created from scratch) and rewriting all the data in terms of rowid, just to rewrite it all again in terms of (new, pkey, cols) when the new primary key is added. We should do one or more of the following:
Option one:
ALTER TABLE foo ALTER PRIMARY KEY TO (new, pkey, cols)
Option two:
ALTER TABLE foo DROP CONSTRAINT foo_pkey, ADD PRIMARY KEY (new, pkey, cols);
Option three: coalesce the operations when they appear in ALTER TABLEs within the same txn with no intervening writes:
BEGIN;
ALTER TABLE foo DROP CONSTRAINT foo_pkey;
ALTER TABLE foo ADD PRIMARY KEY (new, pkey, cols);
COMMIT;
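The coalescing in option three can be sketched as a peephole pass over the schema-change operations buffered in a transaction. This is an illustration only; the operation names and the list-of-tuples representation are invented, not CockroachDB internals.

```python
# Hypothetical sketch of option three: collapse a DROP of the pkey that is
# immediately followed by an ADD PRIMARY KEY (with no intervening writes)
# into a single pkey alteration, skipping the rowid rewrite entirely.

def coalesce(ops):
    """ops is a list of (kind, payload) tuples buffered in one txn."""
    out = []
    for op in ops:
        if out and out[-1][0] == "DROP_PK" and op[0] == "ADD_PK":
            # Adjacent drop/add with nothing in between: merge them.
            out[-1] = ("ALTER_PK", op[1])
        else:
            out.append(op)
    return out

txn = [("DROP_PK", None), ("ADD_PK", ("new", "pkey", "cols"))]
merged = coalesce(txn)  # single ALTER_PK operation

# An intervening write blocks the merge, so both operations survive:
txn2 = [("DROP_PK", None), ("WRITE", "row"), ("ADD_PK", ("b",))]
unmerged = coalesce(txn2)
```

The "no intervening writes" condition matters precisely because of the edge case discussed below: a write between the drop and the add can observe (and exploit) the pkey-less state.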
My preference would be to eventually support all three, but start by supporting just option one. Options two and three are better for compatibility, but changing a table's primary key probably isn't something you want to lean on your ORM for.
Option three has some pathological edge cases FWIW:
CREATE TABLE foo (a INT PRIMARY KEY, b STRING);
INSERT INTO foo (a, b) VALUES (1, 'a');
BEGIN;
ALTER TABLE foo DROP CONSTRAINT foo_pkey;
INSERT INTO foo (a, b) VALUES (1, 'b');
ALTER TABLE foo ADD PRIMARY KEY (b);
COMMIT;
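What makes this pathological: once foo_pkey is dropped mid-transaction, a second row with a = 1 can legally appear, so ADD PRIMARY KEY must validate uniqueness over data that changed inside the same txn. A minimal sketch of that validation (illustrative only, not CockroachDB's implementation):

```python
# Sketch of the uniqueness check that ADD PRIMARY KEY has to perform.

def add_primary_key(rows, col):
    """Verify col is unique across rows before promoting it to the pkey."""
    seen = set()
    for row in rows:
        key = row[col]
        if key in seen:
            raise ValueError(f"duplicate value {key!r} in column {col}")
        seen.add(key)
    return col

# State after the two inserts above: both rows share a = 1, which was
# only possible because the pkey on a had been dropped.
rows = [{"a": 1, "b": "a"}, {"a": 1, "b": "b"}]

add_primary_key(rows, "b")        # succeeds: 'a' and 'b' are distinct
try:
    add_primary_key(rows, "a")    # fails: a = 1 appears twice
except ValueError:
    failed_on_a = True
```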
@jordanlewis @vivekmenezes @danhhz @knz @cockroachdb/sql-async does this sound reasonable?
Yes it's reasonable, but @a6802739 please do not underestimate the complexity of the task! As much as possible, please split the work into small steps that can be easily reviewed. It might be beneficial for you to study the problem and write down your implementation plan before you start (and perhaps share it with us).
I meant to write "reasonable" in my previous comment.
@knz, I will first try to understand what I should do for this task, and then write down how to implement it. I may run into some confusion along the way; I hope I can get some help then.
Thank you very much.
@a6802739 Thank you, as always, for your help. I've discussed this project with the team, and because other high-priority projects with upcoming deadlines rely on this feature, we'll need to develop this feature internally.
assigned to @danhhz to make sure this finds a good home
Has there been any thought on how this would be implemented? Changing the primary key is difficult because it is stored in indexes. One thought is that we make a copy of all of the secondary indexes and backfill both them and the new primary index and afterwards delete the old secondary indexes and primary index. Seems like that would conceptually work, though it would be slow.
@mjibson, @danhhz, @vivekmenezes Thoughts?
It depends on whether there's a requirement for this to happen online.
If it needs to be online, which is what @benesch's first suggestion implies (ALTER TABLE ALTER PRIMARY KEY), then I agree with your suggestion. The phases in that case would be the same as adding an index, more or less:
With n secondary indexes, the write path would then end up writing 2 * (n + 1) rows per update, guaranteeing that all writes from then on are reflected in the new tables; a backfill would then copy over any rows written before the write path began writing to the new copies.
If it doesn't have to be online, then perhaps that first step of "dropping the primary key" could effectively just put the table into read-only mode until a new primary key is added, at which point we could perform the backfills above without having to do the table descriptor double write path dance.
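The double-write-plus-backfill idea can be sketched as follows. Everything here is an illustrative toy model (dicts standing in for index data, invented names), not CockroachDB code.

```python
# Toy model of the online approach: while new index copies are being built,
# the write path writes each update to both the old and the new indexes,
# so with n secondary indexes an update touches 2 * (n + 1) entries.

class Index:
    def __init__(self, key_cols):
        self.key_cols = key_cols
        self.data = {}

    def key(self, row):
        return tuple(row[c] for c in self.key_cols)

    def put(self, row):
        self.data[self.key(row)] = row

old_primary = Index(("id",))
old_secondary = Index(("name",))            # n = 1 secondary index
new_primary = Index(("country", "id"))      # new pkey prefixed by country
new_secondary = Index(("name", "country"))  # secondary rewritten for new pkey

old = [old_primary, old_secondary]
new = [new_primary, new_secondary]

def apply_write(row):
    # Double write: 2 * (n + 1) index entries per update.
    for idx in old + new:
        idx.put(row)

def backfill():
    # Copy rows that predate the double write into the new copies;
    # setdefault avoids clobbering newer double-written entries.
    for row in old_primary.data.values():
        for idx in new:
            idx.data.setdefault(idx.key(row), row)

# A row written before the schema change starts exists only in old indexes:
old_primary.put({"id": 1, "name": "ann", "country": "US"})
old_secondary.put({"id": 1, "name": "ann", "country": "US"})
# Schema change begins; subsequent writes hit all four indexes:
apply_write({"id": 2, "name": "bo", "country": "DE"})
backfill()
```

After the backfill completes, both the pre-existing row and the double-written row are present in the new indexes, at which point the descriptor can switch over and the old indexes can be dropped.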
It depends on whether there's a requirement for this to happen online.
I'd be sad if changing a table's primary key took the table offline. That said, it would still be a big improvement over our current "just dump it to CSV and reimport it!" recommendation.
I've been assuming that any schema change needs to be asynchronous. Let's not lower this bar yet.
We talked about this yesterday briefly, but just one data point from a customer who wanted to change column types generally (not just primary keys): they were okay with this locking up a table while the change is going on. I don't fully understand the term backfilling, but are we also using this term to describe the faster process we've been talking about regarding doing something with SSTables (vague because I only get 20% of it)? If not, I wonder if someone would want a fast but offline primary key swap versus a slow but online primary key swap.
okay with this locking up a table while the change is going on
I'm not sure I believe this. Well, I believe the customer has stated that locking the table is ok, but we've also heard many times that async schema changes massively reduce operational burden.
I think you're right - it's definitely 100x better to have async schema changes. That being said, if one takes way way longer than the other, perhaps it changes our decision? Either way, this is obviously an engineering decision, but just wanted to provide that single relevant data point here.
I think this is a low impact high effort issue. This change is valuable to a customer who is
I think number 4 in particular is rare.
I'll also bring up that when we do this we need to first refactor the codebase to accommodate this change (it should be clear that the code can be modifying two tables). It's very easy to make this change and end up with what we had with the sql executor, but this time on the SQL DML code paths.
I'm just saying that we had better think about the weight of this change on the stability of the system, and about its priority for our customers for 2.1.
I agree that it's high effort, but I think you're understating the impact. We have at least two features where the primary key is crucial (partitioning and interleaving), and even without that people sometimes want to change their primary keys (existing databases all have this feature). It's unrealistic to expect people to be able to choose the correct PK when first creating the table.
okay with this locking up a table while the change is going on
I'm not sure I believe this. Well, I believe the customer has stated that locking the table is ok, but
we've also heard many times that async schema changes massively reduce operational burden.
If taking the table offline is acceptable, then an export/edit/import process would work. I think if we're going to do this we should do the work to make it async/online.
We talked about this yesterday briefly, but just one data point from a customer who wanted to change column types generally (not just primary keys)
Changing types of non-PK columns is a separate matter. It's much easier, and I think it's higher priority than changing PKs.
I suppose we can implement this within the same table ID space by allowing a primary index to have a different index ID than the default primary index ID. Then the process might be
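Setting the details of the process aside, the core of the index-ID idea can be sketched with a toy table descriptor: build the new primary index under a fresh index ID within the same table, then flip which index ID the descriptor treats as primary. All field names here are invented for illustration; CockroachDB's actual descriptors differ.

```python
# Toy descriptor: same table ID throughout, but the primary index is just a
# pointer to one of the table's index IDs, so it can be repointed.

table = {
    "id": 53,
    "primary_index_id": 1,
    "indexes": {1: ("id",)},  # index ID -> key columns
}

def swap_primary(table, new_index_id, key_cols):
    """Register the new (already backfilled) index and repoint the primary.

    The old primary index becomes unreferenced and can be deleted
    asynchronously afterwards.
    """
    table["indexes"][new_index_id] = key_cols
    old_id = table["primary_index_id"]
    table["primary_index_id"] = new_index_id
    del table["indexes"][old_id]
    return table

swap_primary(table, 2, ("country", "id"))
```

Because the table ID never changes, existing references to the table (FKs, interleaves, leases) don't need rewriting, which is the complexity reduction mentioned below.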
I'd add that I'd originally thought that this change was going to make the sql codebase a lot more complex because I thought we had to add the notion of writing to two tables. With my comment above I feel confident that we can implement this using the same table ID which is a significant reduction in complexity of this change IMHO.
One unknown is how this might affect our interleaved tables implementation, because that's an index with two or more table ids in it and cannot be modeled as a covering index. Some more thought is needed there.
This item only shows examples where the old primary key is removed first. As outlined in #40771 there's also a scenario possible where a table was defined without a primary key in the first place, and as such nothing was deleted.
@jordanlewis I'm perfectly fine with closing the issue I mentioned above. Should this issue also be given the label for postgres compatibility, or would that be counter-effective?
In CockroachDB a table always has a primary key behind the scenes, so this issue does cover your case as well. Thanks for noticing the missing compat label - I've added it.
Can this issue be closed once https://www.cockroachlabs.com/docs/v20.1/alter-primary-key.html is stable?
Yes! In fact, I'm going to go ahead and close this since it will be addressed in 20.1.