Cockroach: sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait})

Created on 9 May 2016 · 50 comments · Source: cockroachdb/cockroach

Postgres (and MySQL as well) provide a FOR UPDATE clause which instructs a SELECT operation to acquire row-level locks for the data it is reading.

We currently don't support this syntax, and supporting it with the same semantics is more or less impossible anyway with our underlying concurrency model (we have no locks, and while we could put "intents" in place to simulate an intent to write, it wouldn't quite be the same).

We should discuss whether we want to at least accept the syntax so that frameworks which make use of it can proceed (even if they don't get the intended semantics, though it's not clear what the implications of that are).

Apparently the Quartz scheduler makes use of SELECT FOR UPDATE. It is in this context that this came up.

See also http://www.postgresql.org/docs/9.1/static/explicit-locking.html for the explicit locking modes exposed by Postgres.

A-kv-transactions A-sql-pgcompat A-sql-semantics C-enhancement X-anchored-telemetry

Most helpful comment

SELECT FOR UPDATE SKIP LOCKED is useful whenever you want to build a queue of tasks that multiple workers try to consume, so that once a worker has locked some rows, other concurrent workers will skip those and continue with the unprocessed ones. It's very useful in practice, which is why MySQL 8.0 is also adding support for it.

SELECT FOR UPDATE NOWAIT is also useful in latency-critical situations when you don't want to be blocked until the lock is released or the lock timeout period expires.

All 50 comments

I think it is a better idea to only accept the syntax without supporting the functionality.

While this would need a lot of highlighting in the docs, many current RDBMSes support this explicit locking mechanism (except MSSQL), so accepting it would relieve a lot of the pain of modifying apps.

In an open source library like Quartz, accepting this without actually supporting it means possible errors at commit time (due to OCC) which it doesn't handle.
But in a custom application which someone looks to migrate to Cockroach, errors upon commit would / should have been handled at a higher (framework?) level, so it should be fine IMO.

Note that FOR UPDATE is not standard SQL, so a portable application shouldn't rely on it without having an alternative, although it's reasonable to assume that FOR UPDATE exists in postgres mode so it would still be good to support it if we can.

We shouldn't support the syntax without providing the semantics that it implies. Fortunately, the locking modifiers are mostly semantically relevant at lower isolation levels; at SERIALIZABLE we'd be safe to accept the syntax as no-ops. I'd have to think more about how SNAPSHOT and FOR UPDATE would interact.

Even though it's not required for SERIALIZABLE semantics, we _might_ want to write a dummy intent in SELECT FOR UPDATE. This would allow a transaction to assert its timestamp and priority during an initial read-only phase so it can abort other conflicting transactions. I'm not sure whether that's helpful, though, or if it is consistent with expectations from this command in other databases.

This looks like the code in quartz scheduler that uses FOR UPDATE: https://github.com/quartz-scheduler/quartz/blob/fb14850e1133dd3d884f2b99748b12286df8afc6/quartz-core/src/main/java/org/quartz/impl/jdbcjobstore/StdRowLockSemaphore.java

It's using a SELECT FOR UPDATE on a dummy row in the database to act as a lock. What is this lock protecting? Does it just cover state that is in the database and accessed as a part of the transaction (if so, why is it needed? Why isn't ordinary transaction isolation sufficient?), or is it being used as an advisory lock to protect state outside the database (e.g. to ensure that only one node starts a scheduled job)? The latter isn't something that we (or any OCC database) can ensure with SELECT FOR UPDATE. If you want to use the database as a lock for external resources you'll need to actually commit writes to the database and not just rely on write locks during an open transaction.

On a related note, I got a question about SELECT FOR UPDATE SKIP LOCKED in a recent tech talk. This is a bit of a shady command because it means that Postgres will simply skip over rows which are already locked by someone else, but is apparently useful in practice when avoiding conflicts is more important than getting all the rows that matched the predicate (i.e. when updating large data sets incrementally and it's fine to skip over a few).

https://blogs.oracle.com/rammenon/entry/select_for_update_nowait_skip has some nice diagrams.

There's also SELECT FOR UPDATE NOWAIT, which on Postgres means that instead of waiting for a lock, the command immediately comes back with an error. This could also be useful in Cockroach: when in an auto-retrying transaction, one could imagine wanting to run a command only when it doesn't run into a conflict, which in our SQL layer would translate to not auto-retrying statements which contain a NOWAIT.

SELECT FOR UPDATE SKIP LOCKED is useful whenever you want to build a queue of tasks that multiple workers try to consume, so that once a worker has locked some rows, other concurrent workers will skip those and continue with the unprocessed ones. It's very useful in practice, which is why MySQL 8.0 is also adding support for it.

SELECT FOR UPDATE NOWAIT is also useful in latency-critical situations when you don't want to be blocked until the lock is released or the lock timeout period expires.
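[Editor's note] The task-queue pattern described in this comment can be sketched in Python with in-memory locks; TaskQueue and its method names are illustrative, not from any database driver. acquire(blocking=False) plays the role of SKIP LOCKED: instead of waiting for a locked row, a worker moves on to the next unprocessed one.

```python
import threading

class TaskQueue:
    def __init__(self, tasks):
        self.tasks = tasks                                # task id -> payload
        self.locks = {t: threading.Lock() for t in tasks} # one "row lock" per task
        self.done = set()

    def claim(self):
        """SELECT ... FOR UPDATE SKIP LOCKED: first unlocked, unprocessed task."""
        for task_id in self.tasks:
            if task_id in self.done:
                continue
            # non-blocking acquire models SKIP LOCKED: don't wait, skip the row
            if self.locks[task_id].acquire(blocking=False):
                return task_id
        return None

    def finish(self, task_id):
        self.done.add(task_id)
        self.locks[task_id].release()

queue = TaskQueue({1: "a", 2: "b", 3: "c"})
first = queue.claim()    # this claim locks task 1
second = queue.claim()   # a concurrent claim skips the locked task 1, gets task 2
```

A NOWAIT variant would be the same non-blocking acquire, but returning an error to the caller instead of trying the next row.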

Any ETA on supporting this SQL syntax? We have a project using openjpa and it generates this SQL using the postgresql driver.

TPCC also uses SELECT FOR UPDATE.

@jordanlewis I'm guessing that is an optimization, not a requirement given our default serializable isolation.

You're right.

SELECT FOR UPDATE also appears to be used by Liquibase and consequently also prevents us from using Cockroach with Keycloak. Is there any ETA on supporting this?

I'd be willing to try to contribute if the consensus is that the syntax should be supported without the intended semantics, as was suggested in the issue description.

I don't think we want to support FOR UPDATE as a no-op. For SNAPSHOT we need the real semantics (so SELECT FOR UPDATE should write an intent). We could technically make it a no-op in SERIALIZABLE mode but I think making it write an intent there too is useful to allow applications to control their lock ordering and avoid deadlocks.

That makes sense @bdarnell. Do you guys have plans to work on this anytime soon? I wouldn't mind trying to contribute, although I'd probably need a few pointers to get started because actually implementing the semantics sounds rather involved.

We don't currently have any plans to work on this, although we should probably get it on the roadmap (I'm marking it as 1.3 for now because the 1.2 release cycle is already pretty full, although this would be a good candidate for 1.2 if we find some slack in the schedule. cc @nstewart )

Implementing this would touch a lot of different parts of the code. No part is individually too tricky, but there are a lot of them (git grep -i reversescan will give you an idea of the scope). The bulk of the changes would consist of implementing new ScanForUpdate and ReverseScanForUpdate calls in the KV API. These would work similarly to regular scans, but A) would be flagged as read/write commands instead of read-only and B) after performing the scan, they'd use MVCCPut to write back the values that were just read (that's not the most efficient way to do things, but I think it's the right way to start since it will have the right semantics without complicating the backwards-compatibility story). Then the SQL layer would use these instead of plain Scan/ReverseScan when FOR UPDATE has been requested.
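[Editor's note] A toy model of the proposed ScanForUpdate semantics, on a plain Python dict standing in for the KV store; the names (kv_store, intents, txn_id) are illustrative, not CockroachDB APIs. The key idea from the comment: scan like a normal Scan, then write back each value just read (the MVCCPut step) so the transaction leaves write intents on everything it scanned.

```python
kv_store = {"a": 1, "b": 2, "c": 3, "d": 4}
intents = {}  # key -> txn holding a write intent on it

def scan_for_update(start, end, txn_id):
    """Scan [start, end), then re-write each value read to lay a write intent."""
    results = []
    for key in sorted(kv_store):
        if start <= key < end:
            results.append((key, kv_store[key]))
            # "MVCCPut the value that was just read": the value is unchanged,
            # but the write marks the key as owned by this transaction
            intents[key] = txn_id
    return results

rows = scan_for_update("a", "c", txn_id="txn-1")
```

A conflicting transaction would now encounter these intents exactly as if the rows had been updated, which is what gives FOR UPDATE its lock-ordering effect.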

The bulk of the changes would consist of implementing new ScanForUpdate and ReverseScanForUpdate calls in the KV API.

Adding new API calls is a bit tedious. Could we add an update field to Scan and ReverseScan and change {Scan,ReverseScan}.flags() to indicate that these are write operations if that field is set? We have precedent for doing this with DeleteRangeRequest, though I recall there were arguments at the time over how that wasn't clean.

I thought about that but this seems like too big of a distinction to make based on a flag in the request instead of a whole new command. Imagine debugging command-queue problems when a Scan could be either a read or a write. I think it's better to do it cleanly this time.

PostgreSQL (https://www.postgresql.org/docs/current/static/sql-select.html) supports four different levels for the locking clause:

UPDATE
NO KEY UPDATE
SHARE
KEY SHARE

Do we want to support all four or only UPDATE and SHARE?

FOR UPDATE is the only one of those that I've heard of, so I think we can probably ignore the others (FOR SHARE would just be a no-op for us, right?). I'm not sure whether FOR NO KEY UPDATE and FOR KEY SHARE are even implementable in our model.

I agree that FOR NO KEY UPDATE and FOR KEY SHARE don't seem to be possible in the CockroachDB model, since we'd need to support locks at a sub-row granularity.

Should I at least support the syntax for FOR SHARE and make it a no-op? Or only FOR UPDATE? I guess the question is, in our model, if someone selects a set of rows with FOR SHARE (or just a regular SELECT as part of a transaction), and then someone else tries to select the same rows with FOR UPDATE, will the second transaction fail/block?

At least for serializable transactions, FOR SHARE would be a no-op. For snapshot transactions, I'm not sure whether we need to do anything or not. It might be safer to leave FOR SHARE unimplemented until we find someone who wants to use it.

Sounds good.

In addition to the 4 different "locking strengths", Postgres also supports locking on specific tables as well as different options for waiting on locks. The full syntax is:

FOR lock_strength [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ]

where lock_strength is one of { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE }. This locking clause can also be repeated multiple times to use different options with different tables.

Do we want to support multiple tables with different options? From the above comments it sounds like we probably want to support NOWAIT and SKIP LOCKED, but I'm not sure about specific tables and/or different options for different tables. By default, if FOR UPDATE is used without specified tables, Postgres will lock all rows returned by the SELECT statement.

Another question is how precisely the locked rows should match the query results. Postgres generally locks only the rows returned by the query, but there are a few edge cases where extra rows may be locked (e.g., with LIMIT... OFFSET). In CockroachDB, it seems like it may be difficult to guarantee that only rows matching the predicate will be locked.

Do we want to support multiple tables with different options?

I would start with the basic all-or-nothing version. We may want to do the per-table version eventually (if there is demand), but my guess is that the feature is rarely used enough that it's not worth the complexity at this time.

NOWAIT and SKIP LOCKED do seem useful but I wouldn't try to do them at the same time as FOR UPDATE; they seem like separate changes.

It's fine with me if the exact set of rows locked differs from postgres in some edge cases. This feature is less crucial for us because we have stronger transaction isolation anyway; I think the only times it will really matter will be the fairly simple cases.

All sounds good to me. But per Spencer's request, I've also created an RFC to summarize the discussions in this issue and discuss an alternative approach involving locking of ranges: PR #19577.

As described in PR #19577, we've decided to postpone implementing this feature. I've unassigned myself from this issue.

@HeikoOnnebrink ran into this limitation while trying to use https://www.keycloak.org/.
It's unclear whether KeyCloak would work if this were addressed, though. Something to look into.

Is there a status on this issue?
I have an application that relies on the select for update feature. Would love to run it on cockroachdb instead of postgresql.

Hey @hoeghh - we haven't yet prioritized this (i.e.: not on the docket for 2.2), but the feedback is helpful. I'll ping the PM for this area and see if we can prioritize it.

See the comments from the related thread:

https://github.com/cockroachdb/cockroach/pull/19577#issuecomment-456797219

and

https://github.com/cockroachdb/cockroach/pull/19577#issuecomment-456885632

We're not planning to prioritize this in 2.2 or anytime soon after that. We would prefer instead to work with you to adapt your app and avoid locking altogether.

I've been trying to use Cockroachdb as a NATS Streaming storage (using postgres driver) but just realized it uses 'SELECT FOR UPDATE'. Please, prioritize it.
Thank you very much in advance!

@fabriciopf note the discussion above. This is unlikely to change; we would prefer to work with you to update your app to not require locking instead.

I think we should improve our locking errors, from syntax errors to a cleaner unsupported message with a hint pointing to here.

I think I have done just that recently?

[email protected]:21358/defaultdb> select * from t for update;
invalid syntax: statement ignored: syntax error: unimplemented: unimplemented at or near "update"
DETAIL: source SQL:
select * from t for update
                    ^
HINT: See: https://github.com/cockroachdb/cockroach/issues/6583

@HeikoOnnebrink ran into this limitation while trying to use https://www.keycloak.org/.
It's unclear whether KeyCloak would work if this were addressed, though. Something to look into.

I tried it before with Keycloak 4.x. CockroachDB wouldn't work for a number of reasons, one of them being the incompatibility described in this issue.

Any plan to prioritize / implement the workaround? That'd be the game changer!

@tbg @incubus8 there is some information about how to make keycloak work with CRDB: https://lists.jboss.org/pipermail/keycloak-user/2018-January/012673.html

I haven't tried it yet, but I'm interested in using CRDB for Keycloak. If you have success, please ping back.

@tbg @incubus8 there is some information about how to make keycloak work with CRDB: https://lists.jboss.org/pipermail/keycloak-user/2018-January/012673.html

I haven't tried it yet, but I'm interested in using CRDB for Keycloak. If you have success, please ping back.

That's just a lot of hacks! Thanks for sharing tho

work with you to adapt your app

So using logical locking / unlocking and a client-side wait loop with timeout - hmmm.

@bdarnell or @tbg, with an eye toward compatibility, can we discuss making FOR UPDATE a no-op in the 19.2 release (pending further work by core)? As mentioned above:

We could technically make it a no-op in SERIALIZABLE mode but I think making it write an intent there too is useful to allow applications to control their lock ordering and avoid deadlocks.

Since CockroachDB now only operates in SERIALIZABLE mode, I don't think clients can actually observe whether or not FOR UPDATE performed a lock. Therefore, it should be safe to make it a no-op until we get around to actually implementing it - and once it's implemented, again, clients should be shielded from observing any difference.

This is a very common blocker for moving applications from PostgreSQL, and it would be nice to ease that pain.

cc @andy-kimball as well, who mentioned this was a likely candidate for medium term future work.

This reads strangely to me.

For most other MVCC databases "Optimistic concurrency checking" aka "Optimistic Locking" ONLY works at READ_COMMITTED. That is, at READ_COMMITTED applications can operate without explicit row locking (applications can operate for the most part without using select for update ... and we do this to get high concurrency for OLTP workloads).

Once we go to REPEATABLE_READ or SERIALIZABLE then "optimistic concurrency checking" does not work. An update statement, when executed, doesn't "see" any committed changes from other transactions but instead sees everything as at its own transaction start time. This means that at REPEATABLE_READ or SERIALIZABLE isolation levels applications can no longer use optimistic locking and instead MUST use pessimistic locking (applications must start explicitly using select for update to avoid lost updates etc).

For example, the JPA specification expects READ_COMMITTED isolation level. Oracle's default isolation level is READ_COMMITTED.

Might need to check if optimistic concurrency checking works for CockroachDB ... sounds like it doesn't.

@jordanlewis Yeah, I've been thinking recently that making FOR UPDATE a no-op is a reasonable way to go. It'll break things that expect to use FOR UPDATE as a purely advisory lock but I think that's an acceptable risk that we can document (and we already implemented some postgres advisory lock function as no-ops, which is harder to justify than a no-op SELECT FOR UPDATE).

@rbygrave CockroachDB uses a mix of optimistic and pessimistic concurrency techniques; see our transaction model docs for more. SQL isolation levels are orthogonal to optimistic or pessimistic concurrency control. SELECT FOR UPDATE is a workaround for the limitations of lower isolation levels; in SERIALIZABLE SELECT FOR UPDATE is not necessary because even read locks provide the necessary isolation guarantees. (serializable read locks are cheaper than the write locks acquired by SELECT FOR UPDATE, although they do come with somewhat higher risk of transaction restarts due to serialization failure)

FYI: My testing shows:

AT SERIALIZABLE

Cockroach

  • TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_WRITE_TOO_OLD): ...
  • Losing transaction fails at COMMIT

Postgres

  • PSQLException: ERROR: restart transaction: ...
  • Losing transaction fails at UPDATE

Oracle

  • ORA-08177: can't serialize access for this transaction
  • Losing transaction fails at UPDATE

AT READ_COMMITTED

Cockroach

  • Not applicable

Postgres & Oracle

  • OptimisticLockException detected
  • Losing transaction fails at UPDATE

Note that OptimisticLockException (as per Postgres/Oracle at READ_COMMITTED) can be caught and handled by application code, allowing the transaction to continue. This is typically for cases of known, expected competition. I am not sure yet how that would work with CockroachDB, as concurrency failure seems to always be detected at COMMIT (is that right? That would preclude OCC as I know it). Detection at COMMIT time suggests it can't be 'caught' and handled by application code allowing the transaction to continue - the transaction must instead be retried.

SQL isolation levels are orthogonal to optimistic or pessimistic concurrency control.

Well, I don't see that with Postgres or Oracle. That said, I haven't seen "Optimistic locking" as I know it work with Cockroach yet. Is there an example of what CockroachDB thinks optimistic locking is?

SERIALIZABLE SELECT FOR UPDATE is not necessary because even read locks provide the necessary isolation guarantees.

Yeah, I don't agree with that yet because to me it isn't so much about "read consistency" but more about achieving exclusive ownership [for the transaction duration]. Maybe someone could explain how application code is supposed to achieve the functionality of blocking ownership of a single row, with the transaction losers either failing (NO WAIT) or waiting their turn with a timeout. How does application code achieve that? Is there an example?

My suspicion is that the option is looping and failing.

@rbygrave:

Detection at COMMIT time seems suggests it can't be 'caught' and handling by application code allowing the transaction to continue - the transaction must instead be retried.

Hmm, but isn't this also the case with detecting problems at statement-time? In Postgres, as far as I understand, if you encounter an error in your transaction, that transaction is unusable until you roll it back. Or is there a special case for OptimisticLockException that I haven't heard about?

Is there an example of what CockroachDB thinks optimistic locking is?

Optimistic concurrency control is an alternative to locking (aka pessimistic concurrency control). In pessimistic models, you acquire locks, and once you have your locks you will be allowed to complete (unless you acquire locks in different orders and end up failing to acquire a lock due to deadlock). In optimistic models, you proceed without locks and conflicts may be detected at a later time, causing the transaction to restart.

Yeah, I don't agree with that yet because to me it isn't so much about "read consistency" but more about achieving exclusive ownership [for the transaction duration].

"Ownership" of a row is not a concept CockroachDB supports. We guarantee transactional consistency (serializability) of all changes and may block or restart transactions to achieve that.

Maybe someone could explain how application code is supposed to achieve the functionality of blocking ownership of a single row, with the transaction losers either failing (NO WAIT) or waiting their turn with a timeout. How does application code achieve that? Is there an example?

We don't have the NO WAIT modifier, but in general if you UPDATE a row, transactions will be queued up in the way that I think you're expecting. We just don't let you grab that write lock earlier in the transaction with a SELECT FOR UPDATE. (and there are times where your transaction may be aborted even after grabbing a lock so you always have to be prepared for restarts).

Hmm, but isn't this also the case with detecting problems at statement-time?

No it isn't.

Optimistic concurrency control is an alternative to locking

Yes, I'm aware. There are different ways to effectively implement OCC which is where I believe the difference in thoughts is coming in.

In optimistic models, you proceed without locks and conflicts may be detected at a later time, causing the transaction to restart.

Yes that is one way to do OCC but it isn't the only way.

Any tool/library that uses a version column for OCC expects to perform the OCC check at statement execution time. This includes JPA (@Version), Entity Framework (rowversion), and Rails ActiveRecord (lock_version). The client-server tools that preceded these, like PowerBuilder, Delphi, and Oracle Forms, also did this. I would think that there are hundreds of tools and libraries that implement OCC using a "version column" (this approach has been around for 30 years).

I realise Cockroach is looking to do OCC differently to what those tools expect (in the high contention case - which is what I tested). I believe that if Cockroach still had READ_COMMITTED then the tools/libraries using a version column for OCC would still work exactly the same in that high contention case.

That would mean the OptimisticLockException is something the application can catch and handle allowing the transaction to continue. Right now, CockroachDB will fail the transaction. Hmmm.

We don't have the NO WAIT modifier

Yes. My thinking is that where application code needs the exclusive blocking nature of select for update, it is going to have to simulate this with an application-side wait loop.

Interesting stuff ...

Hey @rbygrave, I've closed this issue but I don't mean to shut down the conversation - we're still listening on this issue. I've opened a new issue to track improvements to our behavior that you can watch and comment on if you're interested.

Right. I see that it's closed as a no-op - ok.

I will add a bit here to address some of the points above that I skimped on previously. Apologies if this is already obvious and over sharing.

Hmm, but isn't this also the case with detecting problems at statement-time?

So no it isn't and that is because it uses OCC with a version column. For people unfamiliar with how that works, the application executes an update (or delete) that includes the version column in the where clause along with the primary key. Example:

update customer set name = ?,  version = ?   -- version bind value here is the "Next" version
where id = ? and version = ?           -- version bind value here is the "Expected" / "Current" version

If the version has changed, the update will modify 0 rows. Modifying 0 rows is the indication that the data has changed, and an OptimisticLockException is thrown by the application (typically by the underlying persistence library).

So OptimisticLockException in this way is thrown not by the database itself but instead by the persistence library / application when it detects the 0 rows modified. Hence it can be caught and handled by application code and the transaction is still good to continue using if required.
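[Editor's note] The version-column OCC pattern described above can be sketched against an in-memory table; the names (customers, update_customer) are illustrative. The key point is that the library, not the database, raises OptimisticLockException when the conditional update matches 0 rows, so the application can catch it and carry on.

```python
class OptimisticLockException(Exception):
    pass

customers = {1: {"name": "old", "version": 7}}  # stands in for the customer table

def update_customer(cid, new_name, expected_version):
    row = customers.get(cid)
    # WHERE id = ? AND version = ?  -- matches 0 rows if the version moved on
    if row is None or row["version"] != expected_version:
        raise OptimisticLockException(f"customer {cid}: 0 rows updated")
    row["name"] = new_name
    row["version"] = expected_version + 1       # SET version = "next" version

update_customer(1, "new", expected_version=7)   # succeeds, version moves to 8
try:
    update_customer(1, "newer", expected_version=7)  # stale expected version
    conflict = False
except OptimisticLockException:
    conflict = True                             # caught; transaction may continue
```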

http://www.bailis.org/blog/understanding-weak-isolation-is-a-serious-problem/

What this blog post misses wrt READ_COMMITTED and is pointed out in the very first response is that in practice the majority of people using read committed isolation level also use OCC with a version column. This prevents lost update anomaly. That is, applications at READ_COMMITTED don't suffer from lost update anomaly because they use OCC with a version column.

In the tests I ran at READ_COMMITTED for Postgres and Oracle (above) the OptimisticLockException there is thrown by the persistence library. My test was indeed testing for the lost update anomaly at high contention with the 2 different isolation levels. For my test I could just change the isolation levels and run and confirm the behaviour is indeed what I was expecting at the 2 different isolation levels.

In the high contention case for OCC via version column we can note that the losing update must see the changes of the winning update for it to perform the occ check via version column. The losing update does see the change at READ_COMMITTED but not SERIALIZABLE. This is why when the same test is run at SERIALIZABLE the OCC via version column does not pick up the contention and Postgres, Oracle and Cockroach all throw their "failed to serialize transaction" type error.

Cheers, Rob.

Thanks for sharing! I see what you mean now - you're talking strictly about transactions that use weaker isolation levels than SERIALIZABLE. I agree that in those cases, and your example where the application developer understands these anomalies and carefully builds in protection against them, FOR UPDATE can be a real boon.

you're talking strictly about transactions that use weaker isolation levels than SERIALIZABLE

Ah, well not strictly no in that OCC via version column still applies and works at SERIALIZABLE in the _low contention cases_. That is, applications still need to detect that the thing/row being changed hasn't been changed by some other user/system in the meantime during "user think time".

It is just that in the _high contention case_, where we note that OCC via version column at SERIALIZABLE doesn't strictly detect the concurrency conflict, we instead get the different "failed to serialize transaction" exception.

It is likely that this only matters if the application has conditional logic along the lines of catching OptimisticLockException and continue processing maybe skipping some work it would otherwise do if OptimisticLockException wasn't thrown.

The conditional logic based on catching OptimisticLockException that used to work all the time at READ_COMMITTED will now occasionally fail at SERIALIZABLE when under high contention.

builds in protection against them, FOR UPDATE can be a real boon.

Well maybe I am misunderstanding your comment there but at READ_COMMITTED I'd say we need and use FOR UPDATE (pessimistic locking) for other reasons.

For example, blocking exclusive ownership when processing database migrations where 1 instance obtains the lock (and processes the migrations) other instances must WAIT/Block (until the migrations are complete).

For transactions that have conditional logic at SERIALIZABLE we could change to use FOR UPDATE (catch and handle the failure to obtain the lock and continue processing).

The conditional logic based on catching OptimisticLockException that used to work all the time at READ_COMMITTED will now occasionally fail at SERIALIZABLE when under high contention.

Ah, I think this is what we've been missing. You're talking about a sequence like this, right?

begin;
select version, otherfield from tbl where id=?;
update tbl set version=version+1, otherfield=? where id=? and version=?;
-- If rowCount == 0, goto select
commit;

So you're retrying within a single transaction and relying on the fact that reads are not repeatable. This will only work in READ COMMITTED (and isn't even guaranteed to work there - the standard allows unrepeatable reads but doesn't require them) and will fail in either REPEATABLE READ or SERIALIZABLE.

Instead of just looping back to the select statement when you get rowCount==0, you need to roll back the transaction and retry everything in a new transaction. Or you can avoid transactions completely, and do the select and the update completely independently of each other (the version check makes this safe without any transactions). Either of these approaches will work regardless of the isolation level.
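[Editor's note] The recommended retry shape can be sketched in Python; the names (db, try_txn) are illustrative, and the "concurrent writer" is simulated inline to force one conflict. When the version check matches 0 rows, the whole transaction rolls back and the caller retries fresh, re-reading the current version, rather than looping back to the SELECT inside one transaction.

```python
db = {"version": 0, "otherfield": None}  # stands in for the row being updated
attempts = []                            # records the version seen per attempt

def try_txn(value):
    """One transaction: read the version, then a conditional update on it."""
    seen = db["version"]                 # BEGIN; SELECT version FROM tbl ...
    attempts.append(seen)
    if len(attempts) == 1:               # simulate a concurrent writer winning
        db["version"] += 1               # the race on the first attempt only
    if db["version"] != seen:            # UPDATE ... WHERE version=? -> 0 rows
        return False                     # ROLLBACK; caller starts a new txn
    db["version"] += 1
    db["otherfield"] = value             # COMMIT
    return True

while not try_txn("hello"):
    pass                                 # a brand-new transaction each time
```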

Well maybe I am misunderstanding your comment there but at READ_COMMITTED I'd say we need and use FOR UPDATE (pessimistic locking) for other reasons.

Yes, at READ COMMITTED, FOR UPDATE is required. But CockroachDB does not support READ COMMITTED and has no plans to.

For example, blocking exclusive ownership when processing database migrations where 1 instance obtains the lock (and processes the migrations) other instances must WAIT/Block (until the migrations are complete).

This is the most common use case for SELECT FOR UPDATE that we see, and it doesn't work in CockroachDB. These migration tools will have to be modified to do locking in a different way (such as by actually writing to the database instead of just holding a lock in a transient transaction).
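[Editor's note] The workaround suggested here for migration tools - committing a lock row instead of holding SELECT FOR UPDATE open - can be sketched as a lease: a committed row naming the holder and an expiry, deleted (or allowed to expire) when the work is done. The table and function names are illustrative; a real implementation would run each step in its own short transaction.

```python
import time

lock_table = {}  # lock_name -> (holder, expires_at); stands in for a real table

def acquire(name, holder, ttl=30.0, now=None):
    """Try to take the lease; returns False if an unexpired lease is held."""
    now = time.time() if now is None else now
    current = lock_table.get(name)
    if current is not None and current[1] > now:
        return False                           # held elsewhere: caller retries
    lock_table[name] = (holder, now + ttl)     # committed write, visible to all
    return True

def release(name, holder):
    if lock_table.get(name, (None,))[0] == holder:
        del lock_table[name]

got_a = acquire("schema_migration", "node-a", now=100.0)
got_b = acquire("schema_migration", "node-b", now=101.0)   # must wait and retry
release("schema_migration", "node-a")
got_b_retry = acquire("schema_migration", "node-b", now=102.0)
```

Because the lock state is durably committed rather than held inside an open transaction, it survives on an OCC database, at the cost of the waiters polling rather than blocking.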

For transactions that have conditional logic at SERIALIZABLE we could change to use FOR UPDATE (catch and handle the failure to obtain the lock and continue processing).

Adding FOR UPDATE to the select in this version column pattern would avoid the need to restart the transaction, but it also means it's not optimistic any more - you'd be holding a write lock across the whole operation.

sequence like this, right?

Not really'ish. The select of the data being updated can happen anytime before, in a prior transaction (and normally would), and this isn't about retry per se. It is more like:

begin;
-- do work A
update tbl set version=version+1, otherfield=? where id=? and version=?;
-- do work B if update succeeded (no OptimisticLockException was thrown and caught) 
commit;

Where work A and work B are other updates, deletes etc. I should just put up the test in a git repo - I'll look to do that at some point but want to sort out some docker stuff first.

Note that if there is no "do work A" then we don't really care if the transaction fails (we don't care if the exception thrown is recoverable or not recoverable). It is only when we have other updates/deletes in the transaction that we care that the exception is recoverable (transaction can continue).

It may help if I say that if instead we have a single update like:

begin;
update tbl set version=version+1, otherfield=? where id=? and version=?;
commit;

Then for me it is a matter of treating OptimisticLockException (thrown when some other transaction updates the same row and changes the version value) and TransactionRetryWithProtoRefreshError: .. (thrown in the high contention case) as the same failure. Both fail the transaction, and both require the user/app to refresh their data, reapply the changes, and resubmit the update (if appropriate).

With Postgres or Oracle at READ_COMMITTED we always get the OptimisticLockException. With CockroachDB we will sometimes now get the TransactionRetryWithProtoRefreshError ... but this doesn't matter if the application treats both exceptions in the same way - "Sorry, concurrent update, please refresh your data and reapply your changes".

So for myself, I'm looking to map TransactionRetryWithProtoRefreshError to be a special type of OptimisticLockException [maybe UnrecoverableOptimisticLockException].

you need to roll back the transaction and retry everything in a new transaction

Yes for the "loop and retry" scenario.

This is the most common use case for SELECT FOR UPDATE that we see

Yes. It is the "Other clients need to wait (are blocking)" part that is going to be the not so pretty aspect to deal with here. I should be doing this shortly so I'll know for sure soon enough.

Adding FOR UPDATE to the select

Yes agreed.

Apologies for consuming a lot of your time. This has been good for me, I'm getting pretty comfortable with some of the things I need to do.

As a more random thought. Instead of SELECT FOR UPDATE ... Cockroach could provide a completely different command like: LOCK ON [lockName] [NOWAIT] [TIMEOUT duration] ... for the cases people want blocking (where lockName is an arbitrary string).

Cheers, Rob.
