Skip to content

eth/handler: more reliable announcements #22249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Jan 29, 2021

This is based on to #22241, but could really be applied on master aswell. While working on eth66, where I used one 'relayer' and one 'isolated' node which spoke only to the relayer, I noticed that block import did not happen one by one, but rather in batches whenever it got 'far enough' behind.

After diving deep into it, I concluded that the problem was:

  • When a relayer gets a block via broadcast, but the block is 1 second future from Now, the block
    • was not broadcast
    • It was attempted to import (but blockchain would schedule it for import a little while later)
    • Was sent to the notification-framework
      • Which also marks the hash as known at isolated.
      • However, right before sending the announcement, we check if we have it in chain (we don't want to announce non-canon blocks). This check failed, and we therefore neither broadcast nor announce the block.
    • If we have multiple peers, like relayer does, he gets the same block a split second later, and it gets imported properly.
    • This time, it's not announced to isolated, since he's already marked as having it.

This all leads to isolated missing out on announcements.

There are several ways to handle this, and I tried a few, end eventually settled on performing a small wait (up to 2 seconds) in the block fetcher, if it notices that a given block is indeed future, but the timestamp is <=2s from now. At that point, we're already executing in a goroutine, and sleeping + doing successfull import there prevents the hash from being sent to the done channel prematurely.

Anyway, the whole point of this particular PR is the last commit. With that, my two goerli-on-eth66 nodes work fine with eachother, progressing in lock-step block by block

This PR also fixes what I think is an oversight during queue processing in the fetcher -- but maybe it was intentional and I have just overlooked some aspect.

EDIT: The fix I made, eventually, is just a band-aid on a broken thing. I think a more correct fix would be to have two separate paths: broadcasts done as now, and announcements triggered/sent based on a subscription of imported blocks. That way, we would be guaranteed that regardless of how the block is imported (via fetcher or via procFutureBlocks), the block would be announced, and at the time of the announcement, the block is fully imported)

@holiman
Copy link
Contributor Author

holiman commented Sep 22, 2021

Rebased on master, but should be revisited once request-id's are in

case consensus.ErrFutureBlock:
// Weird future block, don't fail, but neither propagate

// Future block, don't fail, but neither propagate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Future block, don't fail, but neither propagate.
// Future block, don't fail, but also don't propagate.

slightly clearer?

@holiman
Copy link
Contributor Author

holiman commented Sep 2, 2022

I guess we don't need to fix block propagation, only weeks left for the merge

@holiman holiman closed this Sep 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants