Substrate: Runtime worker threads

Created on 16 Jan 2019 · 17Comments · Source: paritytech/substrate

The runtime is currently single threaded, which means all transactions/extrinsics must be executed in series. This is rather suboptimal. Instead, worker "threads" should be allowed in the runtime environment.

Since this is a consensus computation and the need for determinism is absolute, there should be a number of restrictions on worker threads:

They should be pure functions. This implies that as long as there is at least one worker thread, storage may not be altered (though it may be read). Furthermore, there should be no other kinds of data-sharing between threads, even within Sync safeguards like Mutex or Atomic. Separate memory slabs should be used if needed.
Worker threads may only be created from the main thread and have a specific identifier that is deterministic based upon the order in which they were created by the main thread.

If, while there is at least one worker thread, storage is written to, then there are two options that we should do one of:

panic; or
have set_storage buffer the alteration in a (thread-local) hash map, which ultimately takes effect once all worker threads are finished and in the order that worker threads were created (and thus deterministic).

Once this exists, a new issue should be created to introduce extrinsic parallelism whereby the block author prefixes the extrinsic set with an extrinsic to specify a dependency graph stating which extrinsics are mutually independent and can be executed in parallel. The runtime executive can then use this information together with a number of worker threads in order to safely parallelise extrinsic queue execution.

I8-enhancement 🎁 Q9-epic

Source

gavofyork

👀2

Most helpful comment

I'll add sort of fork implementation prototype soonish we can iterate on :)

NikVolf on 16 Mar 2020

👍2

All 17 comments

Can we simply not expose the functions that write storage to the worker, and require that the worker have a separate wasm blob a la WebWorkers? If we did that, attempts to write to storage from a worker would be caught at load-time.

Demi-Marie on 23 Apr 2019

gavofyork assigned kianenigma 11 days ago

Sweet; as soon as I am back (Jan): finish offcahin phragmen and get started with this. Looking forward 👍

kianenigma on 9 Dec 2019

Some more additional notes from the latest company retreat, in which we briefly discussed this:

A serialization approach should be adopted from the academia or other resources, or developed by us, to determine which transactions within a block can be executed with no write conflicts. This will most likely need to be a function of the __write-list__ of all transactions. I think this is a good start: https://link.springer.com/chapter/10.1007/978-3-319-96893-3_32

The memory model gurantee that we can promise in the image above is same as: https://www.cs.utexas.edu/~bornholt/post/memory-models.html. B and C can only make assumptions that they are writing on a state on top of A, but not on which will be executed first. Note that in the image, B and C are both marked in group 2, meaning that they are supposedly non-conflicting.

A validator is not to be trusted with serialization that they claim to be non-conflicting. Everyone should be able to verify this.

kianenigma on 16 Dec 2019

I wrote these notes from our discussion at the coding retreat:

We discussed 2.5 possible memory models for parallel transaction evaluation at Gavin's session on parallelization today.

Gavin's original suggestion:

All transaction execution threads have their own state write cache,
All parallel transaction execution threads read state freely from a (more) global state,
We rejoin the threads together by merely ensuring no conflicts occur among their their state write caches. Any conflicts are an invalid block.

We believe this approach corresponds to one of the weaker eventual consistency memory models used in multi-threaded execution. https://www.cs.utexas.edu/~bornholt/post/memory-models.html

In particular, you could easily make smart contracts whose sequential semantics becomes insecure under this memory model.

Jeff proposed we maintain full sequential consistency:

All transaction execution threads have both their own state write cache along with a state read subcache, but cannot access the global state without updating their read cache.
Any writes then replace entries from the read cache, so maybe you'd implement the read cache with flags on the write cache.
We rejoin finished threads together first merge all the read caches, second merge the write caches testing for disjointness among themselves, and third test for disjointness between the merged read and write caches. Any non-disjointness results in an invalid block, but we trash the read cache for a valid block.

We think this should be equivalent to the more "rust-like" model in which the spawn thread commands reserve portions of the state tree as mutable or immutable. At first blush, I'd think this reservation model runs faster but increases block size unacceptably.

We should support both these memory models in the code, and make smart contract platforms like ink use the full sequential consistency model, but attempt to prove much of the relay chain itself can run in the weaker model.

Set code conflicts with everything of course.
Alistair thinks ICMP works fine in the weaker model, but maybe ICMP writeup should address this.
Individual SPREE modules might or might not work in the weaker model, but we could likely show some simple one for DOT accounting worked in the weaker model.
We might address other relay chain transaction types as part of some future discussion about modularization of the relay chain.

After we have some designs like ICMP designed for the weaker model then we should use good auditors, like maybe Ralf Jung from the RustBelt project.

burdges on 16 Dec 2019

@burdges what does ICMP stand for? I only know of Internet Control Message Protocol, but that makes no sense here.

Demi-Marie on 28 Dec 2019

@burdges what does ICMP stand for? I only know of Internet Control Message Protocol, but that makes no sense here.

inter chain message passing, afaik

kianenigma on 28 Dec 2019

We've renamed it to XCMP for precisely that reason

burdges on 28 Dec 2019

have set_storage buffer the alteration in a (thread-local) hash map, which ultimately takes effect once all worker threads are finished and in the order that worker threads were created (and thus deterministic).

As discussed this changes the memory model; And more importantly, wouldn't it be a huge problem for Kusama? previous blocks, when executed with _sequential consistency_ might lead to different outcomes once executed with a runtime workers.

kianenigma on 1 Mar 2020

How is set_storage related to the memory model?

How should it affect kusama? Even if this would be work out of the box (without changing the runtime, what I don't expect), why should the storage root be different? If the storage root would be different, it would mean that there were multiple transactions modifying the same memory position. This would be a violation anyway.

I have some further questions:
Will you start with block production? Block production will inherently always be slower than import. So, optimizing import isn't probably the first priority?

bkchr on 1 Mar 2020

How is set_storage related to the memory model?

If you buffer you basically go from Sequential (SC) to TSO model. After a write, your thread will have access to an updated version whilst others will only ever see the initial state of the block, regardless of any schedule. The link above is a good read about it.

How should it affect kusama? Even if this would be work out of the box (without changing the runtime, what I don't expect), why should the storage root be different? If the storage root would be different, it would mean that there were multiple transactions modifying the same memory position.

You can get a different root even if there is just a _read-write_ conflict.

In a previous block, key k is initially 0 and k' is 2:
t1 writes 1 to k.
t2 reads from k and writes it to k'.
in that particular block, the block author chose to do t1 -> t2. k' is 1

if you re-execute the same block with a runtime worker, k' is 0 since the writes are buffered and then applied in the order of runtime threads.

This would be a violation anyway.

What do we consider a violation currently in our pure sequential model? I infer from your comment that two writes to the same key within a block is a violation? I don't see why it should be, given that everyone does the same thing deterministically.

Will you start with block production? Block production will inherently always be slower than import. So, optimizing import isn't probably the first priority?

Yeah indeed first block production. Import would come next and would be mostly about the follow up of this issue:

_ Once this exists, a new issue should be created to introduce extrinsic parallelism whereby the block author prefixes the extrinsic set with an extrinsic to specify a dependency graph stating which extrinsics are mutually independent and can be executed in parallel. The runtime executive can then use this information together with a number of worker threads in order to safely parallelise extrinsic queue execution._

Once an author can distribute a schedule, the role of the import would be simply to use it and import in parallel. That's in the future though.

kianenigma on 1 Mar 2020

Another fishy example that came to my mind: If we are to allow multiple transactions from the same origin in the same block, then the fee updates need to be precisely NOT buffered, otherwise one who submits n transactions within one block pays only for max(fee(n)) of those n.

Maybe we should also provide some memory fence to ensure such operations are actually visible to all other threads before we can move on.

kianenigma on 1 Mar 2020

Agreed, we cannot process multiple transactions spending from the same source in separate threads. All memory models discussed above should handle that particular case, but these memory models differ when some read the account and only one spends from the account. In the short term, we might simply forbid block from containing multiple transactions that require signatures by the same account.

burdges on 10 Mar 2020

I'll add sort of fork implementation prototype soonish we can iterate on :)

NikVolf on 16 Mar 2020

👍2

How is set_storage related to the memory model?

If you buffer you basically go from Sequential (SC) to TSO model. After a write, your thread will have access to an updated version whilst others will only ever see the initial state of the block, regardless of any schedule. The link above is a good read about it.

How should it affect kusama? Even if this would be work out of the box (without changing the runtime, what I don't expect), why should the storage root be different? If the storage root would be different, it would mean that there were multiple transactions modifying the same memory position.

You can get a different root even if there is just a _read-write_ conflict.

In a previous block, key k is initially 0 and k' is 2:
t1 writes 1 to k.
t2 reads from k and writes it to k'.
in that particular block, the block author chose to do t1 -> t2. k' is 1

if you re-execute the same block with a runtime worker, k' is 0 since the writes are buffered and
then applied in the order of runtime threads.

I don't understand this example :D

This would be a violation anyway.

What do we consider a violation currently in our pure sequential model? I infer from your comment that two writes to the same key within a block is a violation? I don't see why it should be, given that everyone does the same thing deterministically.

Okay, if we expect pure writes. Like someone for whatever reasons writes to a certain location some value that has no other connection, your view is correct. However, I think that this doesn't exist in a real world scenario. Let us for example just take the current weight of the block, that is a field that is read and written by a transaction. We need to synchronize these reads and writes, as otherwise the value will just be garbage.

Or think of a transaction where people want to put some money in a pot. Two different people want to do this in one block. Both transactions would need to happen in the same thread, as otherwise you would have a violation.

Will you start with block production? Block production will inherently always be slower than import. So, optimizing import isn't probably the first priority?

Yeah indeed first block production. Import would come next and would be mostly about the follow up of this issue:

I thought about this again and think that maybe doing import is easier and not that hard. You probably just need some sort of heuristic to distribute the tx between threads and start applying them. The good, if the state root is different you can just do single threaded import. For block production you probably not have enough time to switch again to the single threaded strategy.

bkchr on 16 Mar 2020

Agreed, we cannot process multiple transactions spending from the same source in separate threads. All memory models discussed above should handle that particular case, but these memory models differ when some read the account and only one spends from the account. In the short term, we might simply forbid block from containing multiple transactions that require signatures by the same account.

Speaking as a potential user of a blockchain, I don’t consider that acceptable. The problem is that I might submit a transaction, and then decide I want to submit yet another transaction. Artificially delaying the second one doesn’t seem like a good idea.

Demi-Marie on 25 Mar 2020

Also, sequential consistency is not enough. What we need is determinism.

The best solution I can come up with is software transactional memory, or STM. If we have a conflict, one of the transactions is rolled back and retried later. Obviously, the order in which transactions appear to have been executed needs to be in the block.

Demi-Marie on 25 Mar 2020

👍1

Updating that due to the workload from the staking side and my sightly limited availability int he next few months, I see it very unlikely that I can work on this.

kianenigma on 5 Jun 2020

Was this page helpful?

0 / 5 - 0 ratings