Refactoring Parachain Consensus in Cumulus

Proposing a refactor for Cumulus' consensus code.

## Requirements

The main issue with Cumulus’ current architecture is the ownership relationship between the collator and the parachain consensus code. The architecture currently has a `Collator` own a `T: ParachainConsensus`, which it invokes internally with a `build_collation` function. This is backwards: consensus should be at the top, as it may need to be highly specialized per parachain - especially as blockspace offerings such as on-demand parachains and elastic scaling evolve.

Collating takes vastly different inputs as it’s only required for liveness and needs to interface with external blockspace markets. Examples of these inputs:
* Notifications of new relay-chain heads
* New pending transactions
* Time
* New parachain blocks
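
As a rough illustration of why these inputs argue for putting consensus at the top, the sketch below models them as one event type and lets each consensus implementation decide which events trigger authoring. All names here (`CollationTrigger`, `AuthoringPolicy`, `RelayDriven`, `TimeOrLoadDriven`, `count_attempts`) are hypothetical and are not part of Cumulus.

```rust
use std::time::Duration;

/// Hypothetical event type covering the inputs listed above.
#[derive(Debug, Clone, PartialEq)]
enum CollationTrigger {
    NewRelayHead { number: u32 },
    PendingTransactions { count: usize },
    Tick { since_last_block: Duration },
    NewParachainBlock { number: u32 },
}

/// Hypothetical per-parachain policy: consensus decides when to author.
trait AuthoringPolicy {
    fn should_author(&mut self, trigger: &CollationTrigger) -> bool;
}

/// Authors once per new relay-chain head and ignores everything else.
struct RelayDriven;

impl AuthoringPolicy for RelayDriven {
    fn should_author(&mut self, trigger: &CollationTrigger) -> bool {
        matches!(trigger, CollationTrigger::NewRelayHead { .. })
    }
}

/// Authors when enough transactions are pending or the chain has been idle too long.
struct TimeOrLoadDriven {
    min_pending: usize,
    max_idle: Duration,
}

impl AuthoringPolicy for TimeOrLoadDriven {
    fn should_author(&mut self, trigger: &CollationTrigger) -> bool {
        match trigger {
            CollationTrigger::PendingTransactions { count } => *count >= self.min_pending,
            CollationTrigger::Tick { since_last_block } => *since_last_block >= self.max_idle,
            _ => false,
        }
    }
}

/// Counts how many of the given triggers would lead to an authoring attempt.
fn count_attempts<P: AuthoringPolicy>(policy: &mut P, triggers: &[CollationTrigger]) -> usize {
    triggers.iter().filter(|t| policy.should_author(t)).count()
}
```

The same stream of triggers produces different authoring behaviour depending on the policy, which is the kind of specialization a fixed `build_collation` callback makes awkward.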

Things we should aim to support:
* Asynchronous backing: collation not driven by the relay chain (https://github.com/paritytech/cumulus/issues/2267)
* Parallelized collation authoring (when multiple execution cores at a time are possible)
* Ordering on-demand execution cores or other types of blockspace purchases
* Pre-validation functions (https://github.com/paritytech/polkadot-sdk/issues/968)
* Quorum-based collation (Tendermint-style)
* Slot-based collation (Aura/Sassafras-style)
* Scraping the relay chain for upcoming claims on execution cores
* Max execution times set by the relay chain (https://github.com/paritytech/polkadot-sdk/issues/72)
* Low-economic-security finality for parachain blocks: collators come to a consensus on an "inner" block, which is then wrapped into a collation and can be re-submitted as many times as needed until it lands on the relay chain, even with different relay parents

## Proposal

The general idea is that each collator should spawn a background _worker_ whose responsibility it is to actually create collations and share them in the network. We also need an import queue, just like Substrate, although we have a requirement for parachains which doesn’t exist in vanilla Substrate: the parachain runtime is responsible for verifying blocks’ seals. This is required in parachains, because the Polkadot validators need to check the seal as well.

The worker should have access to an `import_and_share_collation` function that imports the block locally and shares the collation with the network. Separating _creating_ a collation from sharing it with the network is important, because a collation may require a quorum of collators. In that case, we need to create the collation, then collect a seal through a network protocol, then amend the collation, and only then share it.
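
A minimal sketch of that create, seal, amend, share sequence follows. Only the name `import_and_share_collation` comes from this proposal; its signature, the `Collation` and `CollatorService` types, and the byte-per-vote "seal" are simplifying assumptions made for illustration.

```rust
/// Hypothetical collation: an opaque block plus an optional quorum seal.
#[derive(Debug, Clone, PartialEq)]
struct Collation {
    block: Vec<u8>,
    seal: Option<Vec<u8>>,
}

/// Hypothetical handle given to the worker. The signature is an assumption.
trait CollatorService {
    fn import_and_share_collation(&mut self, collation: Collation);
}

/// Step 1: create the collation without sharing it.
fn create_collation(payload: &[u8]) -> Collation {
    Collation { block: payload.to_vec(), seal: None }
}

/// Step 2: stand-in for a network protocol that gathers one byte per voting
/// collator and yields a seal only once `threshold` votes are present.
fn collect_quorum_seal(votes: &[Option<u8>], threshold: usize) -> Option<Vec<u8>> {
    let signatures: Vec<u8> = votes.iter().flatten().copied().collect();
    if signatures.len() >= threshold {
        Some(signatures)
    } else {
        None
    }
}

/// Steps 3 and 4: amend the collation with the seal, and only then share it.
/// Returns false (and shares nothing) if no quorum was reached.
fn author_with_quorum<S: CollatorService>(
    service: &mut S,
    payload: &[u8],
    votes: &[Option<u8>],
    threshold: usize,
) -> bool {
    let mut collation = create_collation(payload);
    match collect_quorum_seal(votes, threshold) {
        Some(seal) => {
            collation.seal = Some(seal);
            service.import_and_share_collation(collation);
            true
        }
        None => false,
    }
}

/// Test double that records what would have been imported and shared.
struct RecordingService {
    shared: Vec<Collation>,
}

impl CollatorService for RecordingService {
    fn import_and_share_collation(&mut self, collation: Collation) {
        self.shared.push(collation);
    }
}
```

Because sharing is a separate call, an unsealed collation never reaches the network when the quorum step fails.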

Submitting blockspace orders or bids can be done in the same worker or in a separate background task. The general framework doesn't need to support this yet, but we should write this code (https://github.com/paritytech/cumulus/issues/2154).

We should also not remove any current `client/collator` or `aura` code, as downstream users depend on it. Instead, the new code should be built alongside it in a backwards-compatible way, giving users time to adjust their code.

## Aura

The new framework should not be merged until a fully backwards-compatible "aura" implementation is done alongside it, which can be swapped in by downstream users. This would come with a deprecation, not a deletion, of the old code for doing this.

Actually rewriting Aura logic is not ideal, so it’d be better to still wrap the `SlotWorker` as is currently done, even though it was not originally designed for this use-case. To do this, we need to modularize the `simple_slot_worker` a bit more. `fn on_slot` currently does these things:

1. Calculate remaining proposal duration
2. Slot claiming (including skipping if certain criteria are not met)
3. Creating a proposal + storage proof
4. Sealing the block with a digest
5. Importing the block

To be clear, the slot worker already has many of these things as individual functions, but `on_slot` still does much of the work that we'd like to split out. In particular, (3), (4), and (5) should be separated from one another.

These should be split out into separate helper functions, to the extent that they aren’t already. For basic aura, for instance, the worker logic should detect a slot (not based on real time, as is currently done), and then: compute (1) outside, then call into the aura slot worker for (2), (3), and (4), and then handle (5) itself alongside sharing to the network.
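
The split described above could look roughly like the following sketch. It does not use the real `SlotWorker` API: every type and function name is hypothetical, the round-robin claim only imitates Aura's authoring rule, and the proposal and seal contents are placeholders.

```rust
use std::time::Duration;

/// Hypothetical description of a slot detected by the outer worker.
struct SlotInfo {
    slot: u64,
    slot_duration: Duration,
    elapsed: Duration,
}

struct Proposal {
    body: Vec<u8>,
    storage_proof: Vec<u8>,
}

struct SealedBlock {
    proposal: Proposal,
    seal_digest: Vec<u8>,
}

/// (1) Computed outside the slot worker.
fn remaining_proposal_duration(info: &SlotInfo) -> Duration {
    info.slot_duration.saturating_sub(info.elapsed)
}

/// (2) Round-robin slot claim: returns our authority index if the slot is ours.
fn claim_slot(slot: u64, authorities: &[&str], me: &str) -> Option<usize> {
    if authorities.is_empty() {
        return None;
    }
    let index = (slot % authorities.len() as u64) as usize;
    if authorities[index] == me { Some(index) } else { None }
}

/// (3) Placeholder proposal; a real one would build a block and storage proof.
fn propose(slot: u64, _budget: Duration) -> Proposal {
    Proposal { body: slot.to_le_bytes().to_vec(), storage_proof: Vec::new() }
}

/// (4) Placeholder seal; a real one would sign the header and add a digest.
fn seal(proposal: Proposal, author_index: usize) -> SealedBlock {
    SealedBlock { proposal, seal_digest: vec![author_index as u8] }
}

/// Outer worker logic: (1) here, (2) to (4) via helpers, and (5) plus sharing
/// handled by the caller-supplied `import_and_share` callback.
fn on_detected_slot<F: FnMut(SealedBlock)>(
    info: &SlotInfo,
    authorities: &[&str],
    me: &str,
    mut import_and_share: F,
) -> bool {
    let budget = remaining_proposal_duration(info);
    if budget.is_zero() {
        return false;
    }
    let author_index = match claim_slot(info.slot, authorities, me) {
        Some(index) => index,
        None => return false,
    };
    let block = seal(propose(info.slot, budget), author_index);
    import_and_share(block);
    true
}
```

Keeping step (5) behind a callback lets the collator import the block and share it with the network in one place, instead of the slot worker importing on its own.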

As for import queues: because of the “runtime needs to check seals” property, we can get away with simpler import queues that only do basic checks, like making sure a parablock’s slot isn’t in the future. These too should be customizable at the top level of Cumulus. For migrations to new consensus algorithms, the verifier should also be able to check the runtime API or version of the block, to internally delegate verification to different internal verifiers.
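
To make the import-queue idea concrete, here is a small sketch of a basic verifier that only rejects future slots, plus a wrapper that delegates to different inner verifiers. The `consensus_version` field is a stand-in assumption for "the runtime API or version of the block"; a real verifier would presumably query the runtime instead. None of these types are existing Cumulus or Substrate types.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified parachain header.
struct ParaHeader {
    slot: u64,
    consensus_version: u32,
}

#[derive(Debug, PartialEq)]
enum VerifyError {
    SlotInFuture { block_slot: u64, current_slot: u64 },
    UnknownConsensusVersion(u32),
}

/// Hypothetical verifier interface; seal checks are left to the runtime.
trait Verifier {
    fn verify(&self, header: &ParaHeader, current_slot: u64) -> Result<(), VerifyError>;
}

/// Basic check only: the parablock's slot must not be in the future.
struct SlotNotInFuture;

impl Verifier for SlotNotInFuture {
    fn verify(&self, header: &ParaHeader, current_slot: u64) -> Result<(), VerifyError> {
        if header.slot > current_slot {
            Err(VerifyError::SlotInFuture { block_slot: header.slot, current_slot })
        } else {
            Ok(())
        }
    }
}

/// Picks an inner verifier by consensus version, e.g. during a migration.
struct DelegatingVerifier {
    by_version: BTreeMap<u32, Box<dyn Verifier>>,
}

impl DelegatingVerifier {
    fn new() -> Self {
        Self { by_version: BTreeMap::new() }
    }

    fn register(&mut self, version: u32, verifier: Box<dyn Verifier>) {
        self.by_version.insert(version, verifier);
    }
}

impl Verifier for DelegatingVerifier {
    fn verify(&self, header: &ParaHeader, current_slot: u64) -> Result<(), VerifyError> {
        match self.by_version.get(&header.consensus_version) {
            Some(inner) => inner.verify(header, current_slot),
            None => Err(VerifyError::UnknownConsensusVersion(header.consensus_version)),
        }
    }
}
```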
