Substrate: Re-Genesis

Created on 29 Oct 2020 · 10 Comments · Source: paritytech/substrate

This documents some notes and designs for a Re-Genesis process. Re-Genesis is basically the process of exporting the current chain state and creating a new chain that builds on it.

Rationale

The discussions started as an alternative method to Swappable Consensus (#1304). Many consensus engines we have right now (like BABE) make assumptions about the chain state, block numbers, among other things, so swapping consensus directly would require heavy modification of the consensus engines themselves. In addition, custom migration code must be written individually for each possible swap.

Re-Genesis, on the contrary, is much simpler. If implemented with care, it can accomplish the same thing as Swappable Consensus. We do not need to modify existing consensus engines to remove their assumptions, but just need to make switching and restarting a runtime plus consensus engine combination fast.

Re-Genesis can also be used for other purposes that Swappable Consensus is not able to cover:

  • Replace faulty runtime upgrades.
  • As a hard fork process.
  • As a way to "squash" the chain and reduce syncing time.
  • Carry out stop-the-world migrations more smoothly and reliably.

Design

Choosing the Re-Genesis block

A Re-Genesis process divides a blockchain into eras. If a blockchain is in era N prior to Re-Genesis, it is in era N + 1 after Re-Genesis. In each era, block numbers start from 0, so we can refer to blocks as "era N block M".

The first question is how we choose the Re-Genesis block.

We can always choose the head block at a particular height, but that would not be reliable. There can be multiple such blocks at the same time, and if the state rebuilding process is heavy, allowing it to be switched around is an attack vector.

Instead, we define the Re-Genesis block as a finalized block at a particular height (for chains with finalization), or a block at a particular height with siblings of depths at least D (for chains with probabilistic finalization). This means that when switching from era N to era N + 1, upon the Re-Genesis block, the old era N chain will continue to build blocks and states, but those built blocks and states will not be accounted for in the new era. Instead, they're only there to make the possibility of having multiple Re-Genesis blocks low.
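The selection rule above can be sketched roughly as follows. This is only an illustration: `BlockInfo`, `Finality` and `regenesis_block` are hypothetical names, not Substrate types, and the sketch assumes that "depth at least D" means at least D blocks have been built on top of the candidate.

```rust
/// Minimal, hypothetical view of a block; not Substrate's real header type.
#[derive(Debug, Clone, PartialEq)]
pub struct BlockInfo {
    pub hash: u64,
    pub number: u64,
    /// Set by a finality gadget on chains that have one.
    pub finalized: bool,
}

/// How the chain decides that a block can no longer be reverted.
pub enum Finality {
    /// A finality gadget marks blocks as final.
    Deterministic,
    /// Assumption: a block counts as settled once `depth` blocks sit on top of it.
    Probabilistic { depth: u64 },
}

/// Returns the Re-Genesis block at `height`, or `None` if it cannot be
/// identified yet. `best_chain` is ordered from genesis to the current head.
pub fn regenesis_block<'a>(
    best_chain: &'a [BlockInfo],
    height: u64,
    finality: &Finality,
) -> Option<&'a BlockInfo> {
    let candidate = best_chain.iter().find(|b| b.number == height)?;
    let head = best_chain.last()?;
    let settled = match finality {
        Finality::Deterministic => candidate.finalized,
        Finality::Probabilistic { depth } => {
            head.number.saturating_sub(candidate.number) >= *depth
        }
    };
    if settled {
        Some(candidate)
    } else {
        None
    }
}
```

Until the function returns `Some`, the old era simply keeps building blocks, which matches the point that those extra blocks exist only to make a competing Re-Genesis block unlikely.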

Stopping the old era chain

Having the old era N chain continue to build blocks and states is definitely not ideal. So we can work on additional support for the runtime to stop the old era chain. The chain stopping process consists of two steps:

  • First, the chain state is frozen. No balance can be transferred. No new proposals can be submitted. The validator set is frozen. No reward will be issued. The existing validator set continues to build blocks.
  • Once the block at the Re-Genesis height is finalized, the runtime issues a setCode command with empty code to permanently shut down the old era chain.
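The two steps can be read as a small state machine. The sketch below is an assumption-laden illustration, not runtime code: `OldEra`, `ChainPhase`, `Call` and the `freeze_height` parameter are hypothetical (the text does not say when the freeze begins), and the final transition stands in for the setCode call with empty code.

```rust
/// Phases of the old era chain during shutdown (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ChainPhase {
    Active,
    Frozen,
    Stopped,
}

/// State-changing calls that the freeze is meant to reject.
pub enum Call {
    Transfer,
    SubmitProposal,
    ChangeValidators,
}

pub struct OldEra {
    pub phase: ChainPhase,
    /// Assumption: the freeze starts at a height chosen in advance.
    pub freeze_height: u64,
    pub regenesis_height: u64,
}

impl OldEra {
    pub fn new(freeze_height: u64, regenesis_height: u64) -> Self {
        OldEra { phase: ChainPhase::Active, freeze_height, regenesis_height }
    }

    /// Step 1: freeze the state once the freeze height is reached.
    pub fn on_block(&mut self, number: u64) {
        if self.phase == ChainPhase::Active && number >= self.freeze_height {
            self.phase = ChainPhase::Frozen;
        }
    }

    /// Step 2: stop for good once the Re-Genesis height is finalized.
    /// In a real runtime this is where setCode with empty code would go.
    pub fn on_finalized(&mut self, number: u64) {
        if self.phase != ChainPhase::Stopped && number >= self.regenesis_height {
            self.phase = ChainPhase::Stopped;
        }
    }

    /// While frozen or stopped, every state-changing call is rejected.
    pub fn allows(&self, _call: &Call) -> bool {
        self.phase == ChainPhase::Active
    }

    /// The existing validator set keeps building blocks until the stop.
    pub fn produces_blocks(&self) -> bool {
        self.phase != ChainPhase::Stopped
    }
}
```

Keeping block production alive in the `Frozen` phase is deliberate: the old chain still has to finalize the Re-Genesis block before it can be shut down.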

Starting the new era chain

Substrate users define their own migration script. The migration must define the initial parameters of the new consensus engine. For the rest of the state, Substrate users can cherry-pick what they want and discard the rest: either take the full state over, or take just the balances and other essentials.

After migration, this new state is then set as the genesis block state for era N + 1, and a new chain continues to function beyond this point.
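A migration of this kind might look like the sketch below. It models state as a plain key-value map and selects entries by key prefix; the `State` alias, the `migrate` function and the example keys are all hypothetical and do not reflect Substrate's actual storage layout or chain spec format.

```rust
use std::collections::BTreeMap;

/// Simplified chain state: storage key to raw value (an assumption for this sketch).
pub type State = BTreeMap<String, Vec<u8>>;

/// Builds the era N + 1 genesis state from the Re-Genesis block state.
/// Entries whose key starts with one of `keep_prefixes` are carried over;
/// everything else is discarded. `overrides` is applied last and holds
/// the initial parameters of the new consensus engine.
pub fn migrate(old: &State, keep_prefixes: &[&str], overrides: &State) -> State {
    let mut genesis: State = old
        .iter()
        .filter(|(key, _)| keep_prefixes.iter().any(|p| key.starts_with(*p)))
        .map(|(key, value)| (key.clone(), value.clone()))
        .collect();
    for (key, value) in overrides {
        genesis.insert(key.clone(), value.clone());
    }
    genesis
}
```

Passing an empty prefix (`""`) keeps the full state, while a list such as `["balances/"]` carries over only balances; in both cases the overrides win on conflicting keys.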

We note that the difference between a Re-Genesis process and a completely new blockchain is that the genesis state for a Re-Genesis process is not known until the Re-Genesis block is identified.

Discussions

Light client

Light client implementations differ by consensus engine. As a result, whether Swappable Consensus or Re-Genesis is used, they may not work across the boundary. Substrate users may have to ask node users to switch light clients manually upon Re-Genesis.

Missed time

The Re-Genesis process involves a stop-the-world migration. Even if the migration itself is fast, identifying the Re-Genesis block requires time on the old era chain to finalize it. This results in a period during which no blocks carrying actual state changes are built for the blockchain.

UX issues

Re-Genesis introduces a new concept called "era", and unlike with Swappable Consensus, the new era's blocks start their block numbers from 0 again. This can be a UX issue that we should take care of.

Prior usages

The only real-world usage so far (relying on an ad-hoc Re-Genesis process) was Kulupu's era switch at era 0 block 320,000. The process was almost as described above, but everything was done manually (with a new node released after the Re-Genesis block).

Edgeware also considered Re-Genesis for its first runtime upgrade, but decided against it due to UX concerns.

All 10 comments

cc @andresilva

What happens to past era extrinsics / events for the purpose of auditing (tax etc)? Can people still rebuild the previous era with archive nodes?

@Swader They should always be able to do that.

Right now I'm thinking about each era using different networking identifier and storage location for simplicity (that is, if we indeed decided to go towards the Re-Genesis direction), but the UX definitely can be improved.

The thing I'm wondering is, right now a full node can become an archive node without needing any communication from other nodes, just based on its extrinsics which it keeps no matter the pruning mode. A full node of era 1 will not be able to do that, presumably. Would this potentially cause an availability rift if no one were to be running a full node of era 0 any more?

@Swader Yeah indeed. But the chance that not a single person runs era 0 full node is quite slim, IMO.

Agreed, just putting it out there as there is a chance.

I think this functionality is interesting, and I'd like to see it in Substrate. I don't think Polkadot would use this (because of the slight chance of missing past era availability), but I could definitely see Kusama undergo a new era launch every 5 million blocks or so 👍

@Swader That is the same problem we will have when we implement warp syncing since nodes will stop downloading the history from before the snapshot point (or at least that was the case with our implementation in parity-ethereum). Normal node operation would still be to sync through all eras and import everything (potentially to different database locations on-disk but that's an implementation detail), so all the data would have the same availability guarantees it has today. The main driving point of this feature is as a potential implementation for swappable consensus, which we'd want to use in the future in Polkadot (e.g. for migrating from BABE to SASSAFRAS).

Light client

I think the light client would just have to start syncing from the latest era. I think this is OK since on PoS chains the light clients already cannot be trusted from genesis due to weak subjectivity.

Right now I'm thinking about each era using different networking identifier

This might make it harder to allow serving clients on all eras, but didn't check what changes would be needed on networking.

UX issues

I think ideally we'd want to avoid resetting the block numbers and just keep incrementing them across eras. From the client-side this might be doable just by maintaining an offset. For the runtime though I'm not sure that is enough, since we might have state entries referencing block numbers from previous eras. I think we might need to remove the assumption that the genesis block is #0, and instead pick up the block number from the last era.
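The client-side offset idea could be sketched as below. The `Era` record and both helper functions are hypothetical, and the sketch assumes the genesis block of a new era reuses the number of the Re-Genesis block; it does not address runtime state entries that reference old block numbers.

```rust
/// Hypothetical record of one era, as a client might keep it.
pub struct Era {
    pub index: u32,
    /// Global number of this era's genesis block (assumed equal to the
    /// number of the Re-Genesis block it was built from).
    pub first_global_number: u64,
}

/// Converts "era N block M" into a number that keeps incrementing across eras.
/// Does not check that `local` actually lies within the given era.
pub fn to_global(eras: &[Era], era: u32, local: u64) -> Option<u64> {
    eras.iter()
        .find(|e| e.index == era)
        .map(|e| e.first_global_number + local)
}

/// Converts a global number back into (era, local block number).
/// `eras` must be sorted by `first_global_number` in ascending order.
pub fn to_local(eras: &[Era], global: u64) -> Option<(u32, u64)> {
    eras.iter()
        .rev()
        .find(|e| e.first_global_number <= global)
        .map(|e| (e.index, global - e.first_global_number))
}
```

With eras starting at 0 and 320,000 (the height of Kulupu's switch mentioned earlier), `to_global(&eras, 1, 5)` gives `Some(320_005)` and `to_local(&eras, 319_999)` gives `Some((0, 319_999))`.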

Ultimately the networking should be capable of "connecting" to multiple different chains (#3310), in other words to support multiple different chains/eras at the same time, provided each chain/era has a different protocolId.

If however we don't reset the block number to 0, there's no change required on the networking.

Is the name "era" intentionally similar to staking eras? If not I would suggest different naming to avoid confusion.

An eon is a unit that's bigger than era and is composed of eras, so that sounds appropriate.
