prisma 🚀 - API for long running transactions with dependencies between write-operations

👍 This seems like a very useful (if not essential) feature to me and has been the only real pain point for me so far as a new Prisma user.

As you've identified, there are situations where you need to perform several mutations in sequence, making use of the response data as you go, but with the confidence that it'll be a single transaction that rolls back in the case of a failure. In my opinion, neither nested mutations or transactions for independent mutations seem quite sufficient for these use-cases.

If it helps to have example use-cases for discussion, there's a little more detail about my specific requirement here: https://www.prisma.io/forum/t/transactions-when-the-business-logic-requires-more-than-just-a-nested-insert/5904

mattwilson1024 on 4 Feb 2019

👍16

@matthewhuang-camelot

Yes, raw sql queries are possible see: https://www.prisma.io/docs/prisma-graphql-api/reference/raw-database-access-qwe4/

pantharshit00 on 18 Feb 2019

@pantharshit00 Thanks! Just saw that too :D

matthewhuang-camelot on 18 Feb 2019

❤1

Hi guys, any updates on this? :(

savioserra on 11 May 2019

@savioserra

We are working on a new client which will have this.

Please check here: https://github.com/prisma/rfcs/blob/new-ts-client-rfc/text/0000-new-ts-client.md

pantharshit00 on 13 May 2019

👍1

@pantharshit00 Nice :) Gonna take a look. For now, I'm using stored procedures and Prisma's executeRaw mutation as a workaround.

savioserra on 13 May 2019

Hello People, any update on this yet? 😕

waleCloud on 26 Aug 2019

👍2

hi any update on this?

arvinlorenz on 9 Sep 2019

👍3

Will this be part of Prisma 2?

iamclaytonray on 16 Jan 2020

Looking into ORM's for Node, Prism looks REALLY good (website, documentation, activity on Github, etc). But lacking/missing (manual) ~~migrations~~ transactions is a HUGE no-go for me. I looked at Prisma 2's change-log and article. I only see something about nested ~~migrations~~ transactions, which is not the same (it can't replace manual transactions (grouping)).

edit: Oops, I said migrations in place of transaction :man_facepalming:

Ilyes512 on 1 Feb 2020

Quoting "Julien Goux" from this public Slack conversation:

I want to do business logic between my write operations within the same transaction.
Business logic could be anything from data validation to third party calls.
This is a really common case of server-side code
Typically I often open a transaction for my whole request lifecycle, with PostgreSQL checkpoints if I need more granular rules (edited)
I must also point that a transaction isn't all about write operations, it's essential being able to work with a set of data frozen at a certain time.
When you do :

const userExists = await db.users.find(newUser.id)
if (!userExists) {
  await db.users.insert(newUser)
}

outside of a transaction, this is unsafe because you can't reliably know if the users are the same set from when you did the select when you're inserting.
maybe this user was created between the 15ms that separate these two statements

nikolasburk on 4 May 2020

👍3

I want to respond to the use case from the comment above (with points made by @sorenbs), how to handle situations like this:

const userExists = await db.users.find(newUser.id)
if (!userExists) {
  await db.users.insert(newUser)
}

If the user was created in the 15 ms that separates the two statements, then the insert statement will fail as the id is unique. So there is no need for a transaction in this case.

Further, if the id isn’t unique, then using a transaction wouldn’t solve the problem. Transactions do not block the entire database, so unrelated queries are allowed to proceed concurrently. In this case, both of the inserts would be allowed to proceed because they are unrelated, even if a transaction was used.

nikolasburk on 4 May 2020

@Ilyes512,

could you describe your use case in detail?

sorenbs on 19 May 2020

Hello, can we expect this to land any time soon? Beta-9 for example? Thanks for info a your work :pray:

mrceperka on 1 Jun 2020

Further, if the id isn’t unique, then using a transaction wouldn’t solve the problem. Transactions do not block the entire database, so unrelated queries are allowed to proceed concurrently

Doesn't repeatable read / serializable isolation levels help in such cases?

maxmalov on 12 Jun 2020

Put it this way: without the ability to intermix database activity and application logic within the transaction scope, you risk the outcome becoming incoherent (ie: nonsense) data.

The point of the transaction scope is to allow the database to help ensure that what you think is happening actually ends up being correctly represented in the database.

It's really important that Prisma be able to do this.

It's well and good to be able to have a hierarchy of related data be added as part of the same transaction (as is supported now) via the 'create()'. And, being able to batch up a number of simple requests as a set, within the same transaction (eg: prisma.transction([write1, write2]). But, this last option probably prepares data to be written using data that could be changing underfoot of the code (not good, unless you can ensure there's only 1 process doing that at a time).

If you can not be sure that data you read at point in time A has not changed by the time you commit the changes, there is a very good probability that eventually, the data you have will be garbage because it is no longer consistent like it should be. Instead, there will be garbage data. Data that's there, but, how it relates to other data is not able to be established.

Having garbage in the database is something we generally want to avoid.

I suppose, it could be said, that this is kind of where the database having a procedural language available sprang from. eg: stored procedures with transactional scope. By allowing us to intermix database activity and application logic in the client application, we avoid having to resort to using stored procedures to achieve the desired result.

pbelbin on 17 Jun 2020

👍7

@pbelbin I personally find that having long running transactions encourages the behaviour you just described as bad. People fetch the data at the beginning, do some long operation with and external system and just save the whole object (overriding the data of other concurrent requests). The prisma.transaction is clearly needed because we sometimes need sequential DB operations (one common is to modify Many-to-Many relations where you delete existing relations and recreate new ones), but otherwise one should always use small safe UPDATE. I think having a native way to deal with increments for example is much more important than any long running transaction.

Sytten on 17 Jun 2020

👍1

@Sytten , I guess it depends on what you think a 'long-running-transaction' is. I agree that having a transaction that is open for hours at a time is not desirable. The longer the transaction is open, the higher the chances are that the transaction will have to be abandoned anyway, because someone else changed something. My concept of a 'long-running-transaction' is much shorter than that. Even if it's just a few seconds, we still need a way to guarantee that data read at point in time A is still valid and consistent by the time the end of the transaction happens. We need the ability to read, do some logic, write/update/delete, (repeat as needed) then commit.

pbelbin on 18 Jun 2020

👍6

I see that the Prisma team are adding .transaction([ {prisma action]) capability, which is good, but I think I still need more.

eg: Lets say you have a database where auto-generated IDs are being used, and you need to add a primary record, and, depending on what type of record it is (or, even better, whether the generated ID is odd or even), a bunch of related (by that ID) records, and it needs to be 'all or nothing'.

Because we don't know what the ID is until the database tells us, we are unable to prepare an array of actions using the ID.

If the primary table has a combination of columns that are unique, we can use a 'select' to dig up the ID, but if there is no real uniqueness, and we are unable to add something (eg: a UUID to create uniqueness where there is nothing else), then are we up the proverbial creek?

Not really anything to do with the .transaction([]), but, using nested creates for the related records is a possibility, as long as the nested inserts are able to be done successfully within the limitation that all nested actions are done in alphabetical order of the related model name. If that happens not to be possible, then this is not an option either.

At this stage, perhaps the only solution is to use a stored procedure? Most good database systems provide stored procedures, but SQLLite does not.

Suggestions?

pbelbin on 23 Jun 2020

👍2

Hello!
I really like Prisma and I really respect your work, but Prisma seems incomplete without the support of long-running transactions. Transactions are perhaps one of those important things, the absence of which doesn't allow many developers to use Prisma in serious projects!
I really hope that in the nearest you will get rid of this flaw.

vadim-shilov on 23 Jun 2020

👍3 ❤1

I'm kind of starting to wonder whether there is an underlying issue with how Prisma currently works that makes it impossible for long running transactions to be supported.

Could that be it?

pbelbin on 3 Jul 2020

👍1

Thats kind of the point I was trying to make for a while now. Its a client server architecture. Until the direct binding are supported i don't see how this could be done.

Sytten on 3 Jul 2020

So according to the roadmap https://www.notion.so/Prisma-public-roadmap-50766227b779464ab98899accb98295f this will not be implemented?

Does prisma offer any solution for following use case

read data
alter data (js logic or something)
write new data

Where I can be guaranteed that data in database has not changed between steps 1 and 3? Here making 2 separate queries without transaction and having some logic in the middle has potential risks of incorrect / outdated data being written to db?

A use case I can think of for this is something like game auction where item might become unavailable during say step 2, but will be awarded in step 3 anyway, hence we end up overwriting item owner while still deducting gold from original buyer.

IljaDaderko on 3 Jul 2020

@IljaDaderko Even with long running transaction you don't have any guarantee that the data will not be changed by another concurrent process. By default in postgres you get read committed isolation. What you want is a row level lock and that is another feature request. I support the decision to not support long running transactions because dev usually use it badly thinking that it will act as a lock.

Sytten on 3 Jul 2020

@Sytten ok, makes sense. I was under assumption long running transactions come with that lock.

IljaDaderko on 3 Jul 2020

Hey folks, it's nice that you think that you can get away without long running transactions in some instances, but, there are other instances where what Prisma currently provides does not provide the same capability and utility as long running transactions. And this is really hurting adoption of Prisma.

Consider this:

If there were no transactions in Prisma at all, and every request to the database was done in it's own auto-committed transaction scope, and you're adding a group of related records, there is no control over when other clients will see those records, and 'which' records exactly, since timing could mean you'd see only some of them.

This can be very bad, and it's one of the things using database provided transactions guards against.

ie: other clients that expect to see a complete set of related records will have to face the possibility that they will see incomplete data.

Now, granted, the new .transaction([]) goes some distance towards allowing the 'all or nothing' to be achieved, but, when it comes to needing to perform complex logic to determine what actually needs to go into the array, that's a different thing.

Case in point:

If you are converting an existing system that has lots of existing long running transactions in the source code, and this conversion is not being done 'all at once', and you have no ability to change the existing code, there is a very high probability that the system has also been designed such that if the related records are not present for any reason, bad things will result. Eg:
user interfaces reporting errors to users, or, batch processes creating incorrect accounting transactions worth real $$. So, you must provide equivalent behavior within the new code. That means: you have to have a way of doing all or nothing.

How to achieve it?

In some cases, you might have to turn existing long running transactions into a variation of 'saga' pattern instance, and hold off making any changes in the database until all the external interactions are done. In others, you might be able to do it 'all at once'.

Where using a persisted 'saga' is able to be avoided, and, using optimistic locking:

you have data that exists in a table, and the ID is being assigned by the database as a SERIAL (ie: the database issues the ID as records are added), and there is nothing that is unique on the rows being added, there are currently scenarios where the Prisma nested creates will not allow you to do everything that's needed. Why? Because all nested creates are being done in alphabetical order of the related Prisma model name. If that ordering doesn't happen to let you do what you need to do, you're in for a nasty surprise. Primarily this is because you will want to use 'connect' for related IDs of records that have not yet been added because they are on model names that sort later than the nested record you're currently trying to insert or change. When the 'connect' does a query to look up the ID, it fails to find the as yet not created record, and your insert fails as a result.
where there is something that is unique, you can probably get around it, from the standpoint of creating the set of Prisma actions that describe the needed changes, and wrapping them all up within a .transaction([]) request, but this does nothing to ensure that the set of data you read from the database while you were figuring out what to describe as a 'change' is still how it was when you were reading.

In both cases above, you have to manually do what the database does for us with optimistic locking transactions: detect when any of the records of interest within the transaction have changed or has been locked, and trigger a rollback when that happens. eg: log a unique fingerprint for those 'source' records (ie: something that will let you know that the records you read are still present and unchanged), and create additional queries to be added to those of the .transaction([]) usage, so that it verifies those records are still unchanged (or triggers a roll-back) in addition to the requests that implement the changes. This is very important, because, otherwise, you could be writing something based on reasons that were true, but are no longer true. ie: out of date data.

Today, Prisma does not provide any help doing this. But, I can imagine that perhaps this would not be very hard for Prisma to 'capture' this information, and 'auto generate' the set of needed re-validation queries, and performing them before the custom actions in the .transaction([]) usage. This would go some way towards alleviating this issue.

Of course, we can avoid needing to do this by using pessimistic locking for the duration from before the reads start until after the last write is done, but, the performance issues due to this approach essentially make this a non-starter in most cases. I am guessing that this also would be something Prisma could not manage to do today, in any case, for the same reasons Prisma seems unable to offer long running transactions. There seems to be no way to have the Client request of the Engine to create a long running session, and let the Client have multiple interactions with the Engine within that same connection with the SQL database.

Also:

If you look at the logging output from Prisma, it's clear that there are prepared statements happening in the Prisma Engine to achieve what the 'connect' functionality does.

When you have a .create that has nested child entities, Prisma is storing the parent ID, and using the metadata to understand how to magically plug that in for us, in the related child records.

Today, when the 'connect' is done, we have to pass in the unique values that let Prisma do a query to dig up the unique ID.

If Prisma Client also had a way to tell the Engine which previously obtained ID value should just be plugged in instead of having to query for it, this could go a long way towards providing a way to solve this issue, as it would allow this sort of thing to be kept within the transactional boundary that the Prisma Engine has with the database. It might also improve performance, because those select statements would not need to be done as much to repeatedly 'connect' to the same related record. Perhaps it could be by considering the graph like a file system, with each level able to be described like a file path, with '../../../id' meaning the ID that was obtained 3 levels above the current level. And, '/id' would mean the ID that was obtained at the very top of the hierarchy. eg: connectExisting: "../../id" I'm guessing it would not be impossible for the Engine to validate that the type of ID being referred to is the same as that needed for the connect.

The things I've described above are all improvements that would be usable within the idea that the Engine has the transaction, not the Client, and allow, within the scope of that transaction to provide functionality that's in some cases impossible to be done today from the Client side.

pbelbin on 4 Jul 2020

👍7

It also should be noted that MySQL's default for transaction isolation is REPEATABLE READ, and that this is not the case for PostgreSQL.

I think that Prisma should probably offer the developer the ability to configure this.

pbelbin on 5 Jul 2020

I don’t quite understand why it’s already in the documentation （ Long-running transactions ）but it doesn’t actually have this feature?

viweei on 5 Jul 2020

@viweei Maybe because it's described in the "Future transaction support in Prisma Client" section? 🤔

MichalLytek on 5 Jul 2020

@IljaDaderko Even with long running transaction you don't have any guarantee that the data will not be changed by another concurrent process. By default in postgres you get read committed isolation. What you want is a row level lock and that is another feature request. I support the decision to not support long running transactions because dev usually use it badly thinking that it will act as a lock.

I don't see this as a reason to not provide long running transactions. I see this as a problem that Prisma needs to address, so that the developer will have better understanding (documentation) and control (Prisma API improvement) so that the developer can choose what the isolation level they want is. In some cases, it's very important that we be able to have the degree of control, rather than having to design whole systems to work around Prisma.

pbelbin on 6 Jul 2020

👍14

Just chiming in here. The transaction([]) api does seem to be very limited and will not allow for easy code organization. Has an alternative of adding an optional last parameter to all prisma client methods been considered/discussed? E.g.:
const trx = prisma.createTransaction(); prisma.post.create({}, trx)
This pattern can be replicated if you are developing a re-usable module that performs a series of steps. Callers can make it part of a trx or not.

kokokenada on 9 Jul 2020

👀1

Just chiming in here. The transaction([]) api does seem to be very limited and will not allow for easy code organization. Has an alternative of adding an optional last parameter to all prisma client methods been considered/discussed? E.g.:
const trx = prisma.createTransaction();
prisma.post.create({}, trx)
This pattern can be replicated if you are developing a re-usable module that performs a series of steps. Callers can make it part of a trx or not.

@kokokenada ,

I like this idea. It has a very familiar feel to it.

Presumedly there would be a trx.commit() and trx.rollback() somewhere in there too, perhaps.

I may be mistaken, but I thought I read somewhere about there being a pool of database connections that the Engine is using. Surely it is possible to have the Engine grab a connection, begin the transaction, stash it in an 'underway transactions' collection, and re-use it when subsequent requests come along?

The 'other' idea that has been put forward is one of using a callback function which is enclosed within a transaction scope, and when callback function throws an exception (exiting the callback) it would trigger a rollback, and returning normally would signal a commit.

Having both options would be really nice, so there would be some flexibility in how developers write their code, versus being forced into one way of doing it.

pbelbin on 10 Jul 2020

Yeah - that's the idea. I used it thoroughly in a previous project that was based on Prisma 1.x. I ran into also sorts of issues, and ended up doing an integration between knex and prisma 1.x. I shared the idea here: https://github.com/Axon-VetCare-Technologies/knex-prisma. Prisma 2 takes care of most of the issues - but transactions still needs to be addressed.

kokokenada on 10 Jul 2020

@janpio , who (if not you) on the Prisma side of this can provide some insight as to what the situation is with Prisma and long running transaction support?

Clearly there is a demand for long running transactions to be supported in a more 'traditional' way than via what the current .transaction([]) provides.

It seems to me that there really is not a 'technical' issue with being able to support long running transactions. I say this because I gather the Engine has a pool of database connections, so there shouldn't be an issue with there being a set of connections that are involved in an ongoing transaction being referenced across multiple requests.

For some people, not having this capability is a deal-breaker, and they will have to look elsewhere.

That would be unfortunate though, as Prisma has many good things to offer otherwise.

It would be much better if Prisma supports long running transactions.

If there is reluctance regarding long running transaction support, it would be helpful to understand what the thinking behind this is.

pbelbin on 15 Jul 2020

👍8 👀1

Hey everyone, thank you all so much for the activity in this issue! The feedback, ideas and concerns you are sharing are absolutely valid and we want to make sure they're being addressed properly.

I want to share a few points from our side that might help better understand our thinking in this area!

⚠️ The following points stem from an internal writeup from @sorenbs with slight adjustments from myself ⚠️

Architectural constraints

Long-running transactions require holding a stateful connection open between two components for an extended period of time. This is not how modern scalable systems are built, and imposes constraints on performance and scalability of modern system design.

This is exemplified in the challenges developers who wish to build a high-scale application on AWS lambda that connects to a relational database such as Postgres or MySQL have. These developers find that they must introduce another component, a database proxy, to break apart the stateful connection between application and database, loosing the ability to run long-running transactions or introducing complex performance tuning as is the case with RDS Proxy described in the Avoiding Pinning section.

Prisma is designed for a future of stateless, highly scalable services connecting to stateful data stores. To optimise for that future, we want to be careful to not be architecturally constrained by design decisions of the past.

Misleading guarantees

Chapter 7 of the Designing Data-Intensive Applications book is laying out some very common misconceptions developers have about transactions and how the ambiguitiy of the individual ACID properties contributes to confusions and misunderstandings about the guarantees a transaction can actually provide.

However, in practice, one database’s implementation of ACID does not equal another’s implementation. For example, as we shall see, there is a lot of ambiguity around the meaning of isolation. The high-level idea is sound, but the devil is in the details. Today, when a system claims to be “ACID compliant,” it’s unclear what guarantees you can actually expect. ACID has unfortunately become mostly a marketing term.

A lot of the use cases we are seeing where developers would like to use transactions might be solved by better alternatives (see below).

Better alternatives

We think that for the vast majority of use cases, there are better ways to achieve a goal than a long running transaction. Here are a few tools that could be used as alternatives.

Atomic operators

Sometimes a multi-step interaction can be expressed as an atomic operation. For example, if you want to read a value, increment by one and then write it back, it would be better to instead use the atomic operator increment to perform both steps in a single transactional query. Prisma does not support this yet, but this issue is related and already on the roadmap.

Application level OCC (optimistic concurrency control)

If a value being written to the database was calculated from a value previously read from the database, you can make the write conditional on the previously read data to not have changed. Prisma does not support this yet, but this specs issue is related.

What do you think of this approach as an alternative? Would this cover your use cases?

Handling violations out of band

Banks are often used as an example of applications that require strong transactional guarantees. This is a misconception. For banks, reconciling transactions is their entire business, so they handle this in their application domain rather than outsourcing it to the database. As a concrete example, many bank customers are able to withdraw more money from their account than their overdraft allows. This is possible because the atm machine does not hold an open transaction on a central database while dispensing cash. If you were to clone your credit card and enlist your friends to withdraw $100 from 10.000 ATMs across the country you would end up with a lot of cash, a huge overdraft and an angry call from your bank.

Serialising operations

An often overlooked but sometimes very effective strategy is to intentionally reduce concurrency to 1. This can be achieved by scheduling all operations on a queue to be processed by a single worker. By eliminating the MVCC overhead in the database, it is possible to scale this single worker approach to tens of thousands of transactions per second, and not having to worry about concurrency can greatly simplify the application logic.

As I mentioned, this is a first perspective on our internal thoughts. I'd love to hear your feedback about this and whether you think the alternative approaches listed above help mitigate your concerns!

nikolasburk on 16 Jul 2020

👍8 👎1

Hey @nikolasburk,

Thank you for the information.

I understand and agree, that there are many scenarios where it's OK to work around not having long running transactions.

However, there are scenarios where it is required, and, to ease adoption, would at least, be hugely beneficial.

As I have said, there are some scenarios where it's just mandatory, and that's going to drive people away from Prisma. Surely you don't want that.

Where there are long running transactions in use today, the expected outcome is that regardless of how much or little time the long running transaction takes, either all the changes will be visible after the commit is done, or none will be visible to other clients.

Today, 'all or nothing' is simply not possible with Prisma in all cases, especially when you consider a system that's being converted to use Prisma, and the existing database design is 'frozen'. It may not be safe or permitted to change the design to favor using optimistic locking, which could then open the way to breaking up the long running transaction into discrete chunks (which will cause incomplete/partial data to be observable by other clients until all chunks are completed), and, if something does prevent completion, the client code is now holding the bag regarding how to revert back the data that had successfully been changed up to the point of failure. If those reversions involve restoring data that has been deleted, that's a huge can of worms that frankly, the database is probably the best positioned to deal with, without some database and application level changes (which are 'out of bounds' for apps that are being migrated, in most cases). And, if the client dies at some 'incomplete' stage anywhere along the way, now there's another scenario to have to account for.

That we do not have any control over the isolation level that's in force within the transaction is another issue that should be dealt with so that there is control and clarity around this. The default isolation level between MySQL and PostgreSQL are different, and there's no apparent way to ensure that there is consistency, let alone the option to set what it should be.

While not directly related to long running transactions in all cases, but, something which may make avoiding long running transactions possible in some cases:

The fact that currently, adding multiple related nested items is always handled in alphabetical order of the related model names, creates a situation where we can not get the records inserted in the order they need to be, within a single request.

I have also suggested adding being able to reference generated IDs (using a relative or absolute path type descriptor) that Prisma Engine must be holding while the nested request is being processed. This single improvement would really help in some instances, but not others.

While it would be nice if all Prisma development efforts are 'green field' and we are able to design the system around concepts that have emerged as supporting huge scale database activity, for many applications, especially those that are being converted towards use of Prisma, that level of scale is not always required, and it's not always possible to re-engineer everything in the existing application to work well with Prisma's preferred approaches. Prisma needs to support these 'less performant' approaches so that at least we can make the conversions happen, and then consider the benefits (and costs) of moving the applications towards some of the more performant approaches, when the time is right, and, we have the ability to affect the db design, and the other client code that might be impacted as a result.

Without supporting this migration path, what I'm hearing is: 'don't use Prisma for existing system migrations'. I don't believe that's the intention, but, that will be the outcome, without the flexibility that having long running transaction support provides.

Hopefully this provides another angle requests for support of long running transactions can be viewed from.

pbelbin on 16 Jul 2020

👍19

Just to give you my two cents worth. I agree with @pbelbin 's comments. From my perspective, the problem is not so much supporting long running transactions but giving people the ability to organize their code into composable units.

For example, image I have a module that handles sending messages. As well as talking to the SMS and mail service, it also updates a message and message-event table and exposes a nice set of public functions for other parts of the system to use. Imagine another module for creating an order, which updates another set of tables. And now I have some feature that creates orders and sends related messages, and I want to leverage transactions to cleanly manage failures and rely on transactional all or nothing behaviour.

With the proposed Prisma API, you would have to contort your code to fit the Prisma transaction model in order to be composable (e.g. always return a Prisma promise for every method you want to be composable). No other options. (Or am I missing something?)

A more flexible API is required. E.g. create a transaction, commit or roll back a transaction (with nesting support) and any read or write to the database can optionally take a transaction object. Another example, with a different approach, is Knex.

kokokenada on 20 Jul 2020

👍7

If the user was created in the 15 ms that separates the two statements, then the insert statement will fail as the id is unique. So there is no need for a transaction in this case.

IF, however, when the system sees that there is an existing user record, and based on that, there is an expectation that there will also be a set of related records, and, without those records will fail to behave as expected, now we're talking about the need for a transaction. This is the thing about the 'all or nothing'ness of transactions. IF the records can be created within a single nested request, that's great. IF the records can be created within a .transaction([]), that's great too. But, IF either of the preceding options are not possible, then having a long running transaction within which all the related records can be created and exposed to the rest of the system together is needed. The application may simply be too sensitive to the possibility of encountering scenarios where expected data is actually not present 'yet'.

eg: any query that's using INNER JOIN is going to fail to return a result without all the expected records in place.

pbelbin on 4 Aug 2020

@matthewhuang-camelot

Yes, raw sql queries are possible see: https://www.prisma.io/docs/prisma-graphql-api/reference/raw-database-access-qwe4/

But, right now, you can not use them inside .$transaction([]).

https://github.com/prisma/prisma/issues/3150

pbelbin on 6 Aug 2020

Hey @nikolasburk , when do you think we might see some more action on this front?

The desire that Prisma add support for long running transactions has been indicated fairly strongly over a number of issues related to the topic.

Even though recent comments are relatively few, that doesn't change the fact that many people have indicated it's needed, and, for a long time, it appears, it was advertised as going to be supported with Prisma 2.

Judging by some comments, it appears that this has definitely been a show stopper for a number of people.

I understand there is some 'reluctance' to provide this functionality, but, it's clear that there is a need.

Please advise.

pbelbin on 13 Aug 2020

👍4

Hi,
We heard the request for this solution loud and clear.

Our approach

We would like to get to the bottom of the individual use cases or problems where long-running transaction (LRT) feel needed, so we can identify solutions which may not actually require them, and hence ensure that Prisma can indeed be adopted, but also make it less prone to inadvertently “tricking” people into practices which would push them away from running their applications on modern architectures (such as serverless functions, etc.)
While we want to incentivize the use of best practices, we also don’t want to block anyone who indeed can’t reasonably do without LRT.
We acknowledge that some use cases exist, and will consider ways to solve that problem. The problem is our capacity and timing.

Priorities and timelines

LRT would be indeed complex for us to implement and thus require to make hard choices.
In terms of priorities, we are focusing first on supporting the use cases which can be solved through our API. This, added to the rest of the features and improvements we’re doing on migrate, requires us to push any work regarding LRT to a later time, and we want to be clear about expectations in that regard. This thus requires to rely on workarounds for the time being.

Use cases

So far, we identified the following cases:

transaction([]): it is available in preview, and welcoming feedback or more input as issues to improve it.
Atomic operations: work has been actively started on that topic and we hope to land something in the coming ~1-2 months (cf. https://github.com/prisma/prisma-client-js/issues/775)
Optimistic concurrency control: something which could help alleviate what @IljaDaderko mentioned. We will plan this so it can land in the coming months as well. We will create an issue and take it into account in the upcoming planning.
Managing auto-generated IDs: this is not supported, but would definitely be something we could provide a solution for within our API. This could be great to create an issue for this one so we can also prioritize it.
Lock management: some databases (like Postgres) do support various levels of locking which provide more control over the guarantee given on the data integrity (SELECT ... FOR UPDATE, or FOR SHARE). We should create an issue for this case as this could be an option to evaluate as well.
The need to migrate an existing application which heavily relies on LRT: this is indeed a place where there would be a genuine need for a solution. Here, we could be thinking of making a clear list of possible workarounds allowing that to happen.
One is to set up a separate connection to the DB using a native client to deal with the un-migrated code: while suboptimal, this can be done today. The second one would be for us to support another raw method allowing users to execute multiple SQL statements in one go. We are considering this as part of improvements to the raw methods.

If you see anything not falling in these categories, feel free to open an issue and mention it here.

Note: it is mentioned that nested writes are done in alphabetical order. This is not the case, these are done by the API in an order which allows respecting the constraints of the model. If they happen to be done to an alphabetical, that is a coincidence. However, if you find something wrong with the behavior, feel free to open another issue describing the situation or need in detail.

As a final note, I am open and eager to jump on a call to gather more in-depth context about these use cases. If anyone wants to chat, feel free to book some time together or DM me on our Slack community - (at)Hervé

thebiglabasky on 14 Aug 2020

👀6 👍6 ❤3

I would add modularity of code as a use case. This is not necessarily long running, but moderately complex business logic. I may have different code modules where I isolate business logic and >1 of these need to be part of a transaction. It's really awkward to force all database calls to happen at one place and time. Even a transformation operation with various checks and loops will naturally run into a situation where you get part way through it, detect an error condition and want to role back the whole thing. (I.e. it should behave like every other transaction system does). If you're concerned about developers shooting themselves in the foot by over using transactions, in the docs say that transaction([]) is preferred for performance and tx.user.create() or prisma.user.create({}, tx) or whatever has potential downsides and include the resources listed above. Some applications (e.g. simple logic, high volume) would be best to choose transaction([]). On the other hand, a system with a smaller volume but very complex business logic would certainly be better off with prisma.user.create({}, tx) as you are optimizing for code readability and maintainability opposed to efficient transaction volume.

kokokenada on 28 Aug 2020

👍8

I opened an issue back in mid-july regarding needing to have a way to control the order in which the nested creates are done, which is where I mentioned that the nested creates appear to be done in alphabetical order.

https://github.com/prisma/prisma/issues/2793

pbelbin on 28 Aug 2020

I had a very nice conversation with @thebiglabasky.

It's clear that the providing of real long-running-transaction capability is on the distant horizon, and that as much as possible, there is a hope that workarounds will appease the community until they can bring resources to bear to really address this issue in a way that will deliver the capability that was talked about as being part of Prisma 2.

We have yet to see the full list of workarounds that the Prisma team think might be able to fill the gap.

There certainly are cases where the desired outcome can be achieved without transactions, but, there are others that can not.

If you have an interest in long running transaction support being brought to Prisma 2 sooner rather than later, I suggest that you make contact with them, as invited above, and/or re-register your interest in long running transactions being brought higher up on the priority list by re-asserting your interest in this by adding as many thunbs-up (+1) responses to these recent comments as possible.

pbelbin on 28 Aug 2020

👍28 ❤7

I think Prisma could implement a mechanism similar to Firebase transactions by making use of optimistic concurrency control. Entity Framework Core, for example, has a concept called Concurrency Tokens which is available for different SQL databases, including PostgreSQL. Prisma could add a version column, for example, which it treats as a concurrency token, and which it increments automatically on each update. Updating the row fails with a concurrency exception if the concurrency token changed between reading and writing the row.

prisma.startTransaction((transaction) => {
  // entity has a hidden `version` property which prisma tracks as soon as it is read
  let entity = transaction.get(id);
  entity.foo = entity.foo + 1;
  // During the update operation, Prisma starts the actual database transaction. Prisma then reads
  // the entity `version` again and checks that it did not change. The second read inside the database
  // transaction is probably necessary for the concurrency token to take effect. Finally, Prisma
  // performs the write. If a concurrency exception occurs, then the whole process restarts by calling
  // the startTransaction callback again.
  transaction.update(entity);
});

If, on the other hand, Prisma exposes a means to implement such concurrency mechanisms, then the transaction mechanism could be implemented outside of Prisma, and in a less generic fashion.

drizzd on 2 Sep 2020

👍5

@drizzd AFAIK, you can even avoid the second read by using update ... where version = x (x being the expected version in the database) and checking the number of modified rows that is expected to be 1 and fail otherwise.

However, I'd be careful with repeating the transaction automatically by default. IMHO, this should be some opt-in behavior. Otherwise, you might not notice concurrency issues for a long time and suffer a performance penalty instead.

mklinke on 2 Sep 2020

👍1

I know this comment doesn't help but I literally ran into this use case and I'm completely sad Prisma doesn't support long running transactions.

jamesguan on 25 Nov 2020

Can you share your use case in detail with us @jamesguan?
We're currently looking at solutions, and anything very concrete helps staging it properly.
Thanks!

thebiglabasky on 25 Nov 2020

@thebiglabasky Here's a pain point I ran into: postgresql's row security policies. But I'm not sure if an transaction api is what I need. I wonder if someone has any ideas on how to tackle this:

Here I have a policy that will only let the prisma client select rows where the id from the api request matches createdBy.

ALTER TABLE "Thing" ENABLE ROW LEVEL SECURITY;

CREATE POLICY "ThingPolicy" ON "Thing" TO api_role
    USING ("Thing"."createdBy" = current_setting('api.userId'));

Of course you could do this in the application using where agrs too. It's just that I prefer to enforce this type of logic in the database.

Currently, whenever I receive a request, I have to create a client instance in order to set the userId for that request.

const db = new PrismaClient()

await db.$executeRaw(`SET api.userId = ${userId}`)

const result = await db.thing.findMany()

await db.$disconnect()

Unfortunately this seems to obliterate the performance when there's large spikes in traffic:

┌─────────┬────────┬────────┬────────┬────────┬───────────┬──────────┬────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg       │ Stdev    │ Max    │
├─────────┼────────┼────────┼────────┼────────┼───────────┼──────────┼────────┤
│ Latency │ 176 ms │ 231 ms │ 336 ms │ 362 ms │ 240.55 ms │ 42.52 ms │ 412 ms │
└─────────┴────────┴────────┴────────┴────────┴───────────┴──────────┴────────┘

compared to a single, shared prisma client instance but without row security

┌─────────┬──────┬──────┬───────┬───────┬─────────┬─────────┬───────┐
│ Stat    │ 2.5% │ 50%  │ 97.5% │ 99%   │ Avg     │ Stdev   │ Max   │
├─────────┼──────┼──────┼───────┼───────┼─────────┼─────────┼───────┤
│ Latency │ 1 ms │ 3 ms │ 9 ms  │ 13 ms │ 3.56 ms │ 2.45 ms │ 48 ms │
└─────────┴──────┴──────┴───────┴───────┴─────────┴─────────┴───────┘

I wonder if this could be fixed using an transaction api in order to achieve this sql:

BEGIN;
SET LOCAL api.userId = ${userId}; -- takes effect for only the current transaction
${query};
COMMIT;

Something like @kokokenada's suggestion would be perfect for this.

joaopaulobdac on 2 Dec 2020

Hey @joaopaulobdac, this is a very interesting problem. At first glance, it seems like this might be an issue with Postgres, rather than an issue with Prisma.

Maybe SET api.userId = 10 is slow. Could you also try this benchmark using https://github.com/brianc/node-postgres?

matthewmueller on 2 Dec 2020

@matthewmueller Just tried it with node-postgres, setting api.userId inside a transaction with the query:

import { Client } from 'pg'

const pg = new Client()

async function query(userId) {
    const result = await pg.query(`
BEGIN;
SET LOCAL api.userId = ${userId};
SELECT * FROM "Thing";
COMMIT;
`)

    return result
}

Performance got even a little bit better than a single prisma client instance with no row security:

┌─────────┬──────┬──────┬───────┬──────┬─────────┬─────────┬───────┐
│ Stat    │ 2.5% │ 50%  │ 97.5% │ 99%  │ Avg     │ Stdev   │ Max   │
├─────────┼──────┼──────┼───────┼──────┼─────────┼─────────┼───────┤
│ Latency │ 0 ms │ 1 ms │ 2 ms  │ 3 ms │ 0.77 ms │ 0.95 ms │ 42 ms │
└─────────┴──────┴──────┴───────┴──────┴─────────┴─────────┴───────┘

It seems like it's not a postgres issue.

joaopaulobdac on 2 Dec 2020

It seems that the issue experienced by @joaopaulobdac is that currently the only way to use Prisma to set a temporary configuration is to set it for the session (SET defaults to SET SESSION), which means that the configuration is set for the lifetime of the connection. By then creating a new connection for each request (by starting a new Prisma Client) you are achieving isolation. This is slow because setting up a connection is slow, and there are only a limited number of connections available.

The proper solution is to set the temporary configuration for just the duration of a single transaction using SET LOCAL, enabling you to reuse one connection and save the overhead of establishing connections for each request.

Prisma could support this by allowing $executeRaw in a $transaction call. Since Prisma 2.10, this is actually supported, so you should be able to do something like this:

db.$transaction([
  db.$executeRaw`SET LOCAL api.userId = ${userId};`,
  db.thing.findMany()
])

@matthewmueller - is it actually possible to return values from a prisma.$transaction call?

sorenbs on 8 Dec 2020

@sorenbs I tried that but it doesn't seem to work because the query can't find the set parameter. I get a QueryError unrecognized configuration parameter \"api.userId\". When looking at the prisma client logs, even though I put the $executeRaw first in the array like you showed, it ran the query first. It still runs the query first if I switch their positions in the array. I don't know why.

joaopaulobdac on 8 Dec 2020

That's really interesting - thanks for investigating @joaopaulobdac!

Maybe the ordering of elements in the $transaction call is undefined. Do you know @matthewmueller? It would be good to get that documented either way as that section is pretty sparse.

It could also be that there is a bug in the way the transaction is constructed. Did you try to enable Query Logging to see what SQL query is being generated?

sorenbs on 8 Dec 2020

prisma:info  Starting a postgresql pool with 1 connections.
prisma:info  Started http server on http+unix:///tmp/prisma-8f1b0f9f547f9bac8e0c3694.sock
prisma:query SELECT "public"."Thing"."id", "public"."Thing"."name" FROM "public"."Thing" WHERE 1=1 OFFSET $1
prisma:query SET LOCAL api.userId = user1
PrismaClientUnknownRequestError2: 
Invalid `prisma.thing.findMany()` invocation:


  Error occurred during query execution:
ConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error), code: SqlState("42704"), message: "unrecognized configuration parameter \"api.userId\"", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("guc.c"), line: Some(8965), routine: Some("GetConfigOptionByName") }) }) })
    at PrismaClientFetcher.request (/home/jp/.vscode/projects/temp/row-policies/node_modules/@prisma/client/runtime/index.js:79359:15)
    at processTicksAndRejections (node:internal/process/task_queues:93:5)
    at async Promise.all (index 0)
    at Server.<anonymous> (/home/jp/.vscode/projects/temp/row-policies/src/main.ts:79:25) {
  clientVersion: '2.12.1'
}

Notice how the error is at Promise.all (index 0). I also don't know if the client logs BEGIN and COMMIT so it's not clear if these calls are actually being run in a transaction together

joaopaulobdac on 8 Dec 2020

The fact that the SET query got logged at all makes me think that they are only being run in parallel, not in a transaction

joaopaulobdac on 8 Dec 2020

Thanks!

Would you mind opening a separate issue for this (it's probably a bug) and link it from here. That way we can get it resolved asap and get you unblocked.

sorenbs on 8 Dec 2020

👍1

Prisma: API for long running transactions with dependencies between write-operations

Most helpful comment

All 57 comments

Architectural constraints

Misleading guarantees

Better alternatives

Atomic operators

Application level OCC (optimistic concurrency control)

Handling violations out of band

Serialising operations

Our approach

Priorities and timelines

Use cases

Related issues