Loopback-next: Atomic "find or create" Repository API

Created on 1 Nov 2018 · 9Comments · Source: strongloop/loopback-next

While implementing hasOne relation in https://github.com/strongloop/loopback-next/pull/1879, we discovered a requirement for an atomic "find or create" operation that's needed to avoid race conditions in hasOne's implementation of create operation.

Cross-posting https://github.com/strongloop/loopback-next/pull/1879#discussion_r227363386

It is crucial to leverage an atomic implementation of findOrCreate provided by our connectors. The current proposal is prone to race conditions, where to instances of the target "hasOne" model can be created.

Consider the following scenario: the LB4 server is handling two incoming HTTP requests to create a hasOne target and the code is executed in the following way by Node.js runtime:

Request 1 arrives, DefaultHasOneRepository#create is invoked
targetRepository.find is called. It's an async function so other stuff happens while the query is in progress.
Request 2 arrives, DefaultHasOneRepository#create is invoked.
targetRepository.find is called. It's an async function so other stuff happens while the query is in progress.
targetRepository.find for Request 1 returns, no model was found. targetRepository.create in called.
targetRepository.find for Request 2 returns, no model was found. targetRepository.create in called.
Both create requests eventually finish. We have two models that are a target of hasOne relation. 💥 💣 💥

Acceptance criteria

[ ] Add a new Repository-like interface describing atomic operations.

export interface FindOrCreateResult<T extends Entity> {
 entity: T;
 found: boolean;
}

export interface AtomicCrudRepository<T extends Entity, ID>
   extends EntityCrudRepository<T, ID> {
    findOrCreate(
     filter: Filter<T>,
     entity: DataObject<T>,
     options?: Options,
   ): Promise<FindOrCreateResult>
}

[ ] Add a new repository class implementing this new interface while extending DefaultCrudRepository. Either the constructor or the findOrCreate method should verify that the connector provides atomic implementation of findOrCreate and throw an error with a machine-readable code if it does not.
[ ] Modify the table mapping known error codes to HTTP status codes to map the new error code to 501 Not Implemented.

Nice to have but not required:

[ ] A new repository class implementing AtomicCrudRepository on top of CrudRepositoryImpl.

Out of scope

Support for other atomic operations like patchOrCreate, replaceOrCreate, etc.

Repository feature stale

Source

bajtos

Most helpful comment

Few thoughts:

(1)
In LoopBack 3.x, when a connector did not implement atomic findOrCreate, the DataAccessObject provided a non-atomic fallback implementation callingfindandcreate`. While this may work well when building prototypes, it's creates a high risk of data corruption when deployed to production.

For LB4 and beyond, I am proposing to provide strong guarantees about robustness of this operation. If the connector and/or the database does not provide a solution that's atomic, then the repository method should throw a "Not Implemented" error (instead of falling-back to a non-atomic version).

If there is enough user demand, then we can provide a different repository class that implements operations like findOrCreate in an unsafe (non-atomic) way.

(2)
In order to allow repository consumers to better express whether they require atomic operations or not, and also in line with _Relation refactoring: Split Repository interface into different interfaces #1356_, I am proposing to introduce a new repository interface and an new implementation class.

For example:

interface AtomicCrudRepository<T extends Entity, ID> 
  extends EntityCrudRepository<T, ID> {

  // declare atomic operations like findOrCreate
}

class DefaultAtomicCrudRepository<T extends Entity, ID> 
  extends EntityCrudRepository<T, ID> 
  implements AtomicCrudRepository<T, ID> {

  // implement atomic operations like findOrCreate
}

/cc @strongloop/loopback-maintainers thoughts?

bajtos on 1 Nov 2018

👍5

All 9 comments

Few thoughts:

If there is enough user demand, then we can provide a different repository class that implements operations like findOrCreate in an unsafe (non-atomic) way.

For example:

interface AtomicCrudRepository<T extends Entity, ID> 
  extends EntityCrudRepository<T, ID> {

  // declare atomic operations like findOrCreate
}

class DefaultAtomicCrudRepository<T extends Entity, ID> 
  extends EntityCrudRepository<T, ID> 
  implements AtomicCrudRepository<T, ID> {

  // implement atomic operations like findOrCreate
}

/cc @strongloop/loopback-maintainers thoughts?

bajtos on 1 Nov 2018

👍5

I think the issue is more complex than just implementing the atomic findOrCreate. It depends on how hasOne is enforced:

By LoopBack using atomic findOrCreate from the connector
By the database with a unique constraint of the FK, for example, a user has a profile and the profile model has a unique index of userId. This can be potentially automated by automigrate.
For relational databases that don't have atomic findOrCreate, can we use transactional find then create with strong isolation level?

My understanding is that we only have mongodb and memory connector support atomic findOrCreate. If that's the default behavior, hasOne create will fail for most databases.

Maybe we have to introduce different options for hasOne.create method to honor users' policies, such as:

byAtomicFindCreate
byDatabase
...

raymondfeng on 6 Nov 2018

👍1

based on Raymond's comment, Cloudant/Couchdb2 have a similar situation as sql databases.

jannyHou on 6 Nov 2018

I think the issue is more complex than just implementing the atomic findOrCreate. It depends on how hasOne is enforced.

Good point!

This is loosely related to my ideas related to validation, see https://github.com/strongloop/loopback-next/issues/1872#issuecomment-435025909

By LoopBack using atomic findOrCreate from the connector

:+1:

By the database with a unique constraint of the FK, for example, a user has a profile and the profile model has a unique index of userId. This can be potentially automated by automigrate.

Makes sense. To make this option easy to use, I would like our implementation to have the following traits:

The database-specific error reporting violating of unique constraint is converted into a well-known LoopBack error code, so that the errors returned by REST APIs are consistent and independent on the underlying database.
The application/framework should check the database schema to verify the presence of the correct unique constraint. We should not blindly rely on the user to correctly configure the constraints. I see this as very important for any users that have multiple environments, where it's too easy to forget to add unique constraint to test/staging/production database. This would be of a lesser concern if we had a good database migration framework in place (see https://github.com/strongloop/loopback-next/issues/487).

For relational databases that don't have atomic findOrCreate, can we use transactional find then create with strong isolation level?

Makes sense to me. In the current design, it's up to the connector how it implements findOrCreate. Opening a new transaction and doing two-step find + create is a perfectly valid solution IMO. A possible caveat I see: how to handle the case when findOrCreate is already executed within an (outer) transaction. For example, MySQL does not seem to support nested transactions.

My understanding is that we only have mongodb and memory connector support atomic findOrCreate. If that's the default behavior, hasOne create will fail for most databases.
based on Raymond's comment, Cloudant/Couchdb2 have a similar situation as sql databases.

Ouch! When I proposed to use findOrCreate for hasMany, I has incorrectly assumed that most of our connectors already support this operation in an atomic way. Mea culpa.

Maybe we have to introduce different options for hasOne.create method to honor users' policies, such as:

byAtomicFindCreate

byDatabase

I don't think it's a good idea to ask the users to decide this matter. In my experience, many LB users don't fully understand ramifications of different ways of enforcing has-one constraint and as a result, we can end up with many LB4 applications prone to race conditions.

What I would like to see instead, is a single Model/Repository-level API that allows:

the HasMany relation to express the has-one constraint
the connectors to provide a database-specific implementation enforcing the constraint in a reliable (atomic) way

An interesting approach for document databases like Cloudant/Couchdb, based on this StackOverflow discussion:

When a User has one Credentials instance, we can encode the user's id in credentials' id. Even better, Credentials id can be set to the same value as the User id it belongs to. Now

In that setup, when we attempt a PUT request of an existing document id (findOrCreate, the document already exists), it will be rejected with a HTTP 409 error - unless you supply the correct revision (_rev property) of the existing document.

It makes me wonder - would it make sense to merge the primary key with the foreign key into the same property in the target model of HasOne relation and use this as the solution to enforce uniqueness across all databases?

@model()
class User {
  @hasOne(() => Credentials)
  credentials: Credentials;
}

@model
class Credentials {
  @belongsTo(() => User)
  @property({id: true})
  userId: string;
}

Or even

@model()
class User {
  @hasOne(() => Credentials, {keyTo: 'id'})
  credentials: Credentials;
}

@model
class Credentials {
  @belongsTo(() => User)
  @property({id: true})
  id: string;
}

Thoughts?

bajtos on 8 Nov 2018

@b-admike and me have discussed this matter yesterday. We concluded there are two ways forward:

Look into transaction-based implementation of findOrCreate
Require the target of hasOne relation to use the foreign key as the primary key. I believe the unique constraint of primary keys is enforced by all databases (SQL and NoSQL), thus not atomic operation like findOrCreate is needed.

bajtos on 9 Nov 2018

@b-admike, per our discussion last week, IIUC since this feature is only applicable for a small subset of connectors (or databases), this task is not a blocker for the hasOne task.
If this should not be part of the Nov milestone or top of backlog, please update accordingly. Thanks.

dhmlau on 26 Nov 2018

👍1

Yes @dhmlau per https://github.com/strongloop/loopback-next/issues/1956#issuecomment-437360566, we've decided to take approach 2) with #2005, and decided to close #1974 as rejected. I've removed this issue from the milestone.

b-admike on 27 Nov 2018

This issue has been marked stale because it has not seen activity within six months. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository. This issue will be closed within 30 days of being stale.

stale[bot] on 7 Feb 2020

This issue has been closed due to continued inactivity. Thank you for your understanding. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository.

stale[bot] on 8 Mar 2020

Was this page helpful?

0 / 5 - 0 ratings