Skip to content
This repository was archived by the owner on Jan 24, 2024. It is now read-only.

ref impl: find starting point of sync, given L1 and L2 interfaces #130

Merged
merged 12 commits into from
Jan 20, 2022

Conversation

protolambda
Copy link
Collaborator

@protolambda protolambda commented Jan 10, 2022

Part of #119: staging -> main migration

This:

  • Implements a "sync reference" interface, and implementation that implements it based on common RPC source interfaces. To mock chain status easily for sync starting-point testing.
  • Implements algorithm from @karlfloersch (with missing edge cases handled) to find the starting point for sync, statelessly (i.e. no rollup-node storage requirement, nor in-memory chain/tree)

Depends on #129

Review: any team.

Spec: update reference how we find the starting point of sync.

Testing: different starting point edge cases done

@@ -59,7 +59,7 @@ func FindSyncStart(ctx context.Context, reference SyncReference, genesis *Genesi
// Search back: linear walk back from engine head. Should only be as deep as the reorg.
for refL2.Number > 0 {
// remember the canonical L1 block that builds on top of the L1 source block of the L2 parent block.
nextRefL1 = refL1
nextRefL1 = currentL1
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bug, found with new tests, fixed now. When traversing back, remember the actual L1 hash as canonical, not the L1 hash from the engine block

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarifying for further reviewers: the version in this PR is the fixed version.

@maurelian
Copy link
Collaborator

I'm having trouble running this branch. I think it's likely an issue with the CLI config, as it doesn't recognize the flags specified here.

My steps:

  1. Checkout this branch.
  2. From the root dir:
go mod download
go build -o rollupnode ./opnode/cmd
  1. when I run rollupnode run -h, the only output is
(command)

By contrast, on the staging branch I get a proper help menu when I run ./rollupnode run -h.

@protolambda
Copy link
Collaborator Author

@maurelian it's missing the code from PRs that build on top of this to complete the rollup node implementation. Most notably the driver and node PRs. If you need to run the latest changes, then run #132 and merge the commits made on top of these feature branches.

Copy link
Contributor

@norswap norswap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function really needs a comment explaining its behaviour.

I'm also currently a bit puzzled by the fact it will sometimes return an old L2 block, but it might dawn on me why that makes sense when I read the rest of the sync logic!

@@ -59,7 +59,7 @@ func FindSyncStart(ctx context.Context, reference SyncReference, genesis *Genesi
// Search back: linear walk back from engine head. Should only be as deep as the reorg.
for refL2.Number > 0 {
// remember the canonical L1 block that builds on top of the L1 source block of the L2 parent block.
nextRefL1 = refL1
nextRefL1 = currentL1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarifying for further reviewers: the version in this PR is the fixed version.

nextRefL1 = currentL1
refL1, refL2, parentL2, err = reference.RefByL2Hash(ctx, parentL2, genesis)
if err != nil {
// TODO: re-attempt look-up, now that we already traversed previous history?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this TODO mean?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the reorg is deep, it may have to traverse far to find the common prefix. If we exit on the first API failure along the way it has to start over again. Retrying the first few failures would improve that worst-case.

@maurelian
Copy link
Collaborator

I think we might need to rebase this on main in order to get a coverage report on CodeCov.

@protolambda
Copy link
Collaborator Author

protolambda commented Jan 19, 2022

*edit: commented on wrong PR

Base automatically changed from impl-driver-step to main January 19, 2022 17:57
@protolambda
Copy link
Collaborator Author

Rebased to main:

  • build on latest ref impl, fix merge conflicts
  • resolves rollup-node spec updates (main also changed rollup-node spec)

@protolambda protolambda requested a review from norswap January 20, 2022 16:26
@protolambda
Copy link
Collaborator Author

@norswap please re-review or approve if the updates seem good to you.

@protolambda protolambda merged commit 03a75b5 into main Jan 20, 2022
@protolambda protolambda deleted the impl-sync-start branch January 20, 2022 18:42
L2 eth.BlockSource
}

// RefByL1Num fetches the canonical L1 block hash and the parent for the given L1 block height.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if it was intentional to make l1Num in RefByL1Num a uint64 while l2Num in RefByL2Num is a *big.Int

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing it is because we never care for querying the "latest" L1 block?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint64 is better and more simple, but the standard RPC uses *big.Int (also because a nil argument is a special API value for requesting the head).

rollup driver should not result in errors assuming conformity with the specification. Said otherwise, all errors are
implementation concerns and it is up to them to handle them (e.g. by retrying, or by stopping the chain derivation and
requiring manual user intervention).

The following scenarios are assimilated to errors:

- [`engine_forkchoiceUpdatedOPV1`] returning a `status` of `"SYNCING"` instead of `"SUCCESS"` whenever passed a
- [`engine_forkchoiceUpdatedV1`] returning a `status` of `"SYNCING"` instead of `"SUCCESS"` whenever passed a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will no longer be an optimism specific set of engine RPCs but instead the ones that need to be modified will be modified?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is backwards compatible, going to try and upstream it before we commit to an optimism-specific version

A [transaction deposit][transaction deposits] is an L2 transaction that has been submitted on L1, via a call to the
[deposit feed contract].

Refer to the[**deposit feed contract specification**][deposit-feed-spec] for details on how
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Refer to the[**deposit feed contract specification**][deposit-feed-spec] for details on how
Refer to the [**deposit feed contract specification**][deposit-feed-spec] for details on how

@karlfloersch
Copy link
Contributor

Awesome @protolambda - just reviewed & looked through the reorg logic and it LGTM! I remember the algorithm that you described & it seems what you wrote up is pretty much the same as that--the main complexity coming from needing to hold on to both the current block and next block on both L1 and L2. Anyway it looks great to me!!!

The one thing to highlight is when we move to sequencer rollup we will need to start our reorgs from the latest safe L2 block as opposed to the tip block for L2. This is because the reorg logic needs to be able to tell what the last block it has checked against L1 is. This is so we can preserve the property that all L2 blocks are checked by the driver against L1 even if the sequencer propagates unsubmitted blocks to the verifiers.

More details--in sequencer rollup the driver needs to know what L2 block it last checked against L1 if we are to allow the sequencer to propagate unsubmitted blocks. This attack is possible because the sequencer would be able to lie about the L1 block hash in the unsubmitted blocks, and by lying it could convince the driver that it doesn't need to re-execute the L2 block even if it is invalid. Let me know if that is clear, I don't yet know how to clearly explain this property.

Anywho this doesn't apply to deposit rollup and it's suuuper exciting to have a stateless driver!!!!!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants