
Beacon sync maintenance update and header download fix #3269


Merged
merged 8 commits into master from Beacon-sync-maintenance-update-and-header-download-fix
May 8, 2025

Conversation

mjfh
Contributor

@mjfh mjfh commented May 8, 2025

No description provided.

mjfh added 8 commits May 8, 2025 16:05
why:
  Must unlink closure that refers to the `hc` descriptor. `stop()` is
  also used in the destructor.
why:
  Might crash repeatedly while debugging.
why:
  Previously, `finalised` was local and static. So it was known that
  the chain `antecedent..finalised` was a segment on the canonical
  chain shared with the header chain cache.

  This was exploited in some fringe cases for `FC` parent detection
  when there was some interference with the FCU calls from the `CL`
  via RPC.

  Now, the `finalised` entry is maintained outside. So the assumption
  that `antecedent..finalised` is on the header chain cache does not
  necessarily hold anymore.
why:
  Previously, the `importBlock()` function was synchronous, i.e. non-async.
  This has changed: the function now returns a `Future` which can raise a
  `CancelledError` exception.
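For illustration, a minimal sketch of what this change means for callers, assuming chronos-style async; `ChainRef`, `EthBlock` and `importNext` are placeholder names, not the actual syncer API:

```nim
import chronos

type
  ChainRef = ref object     # stand-in for the real chain descriptor
  EthBlock = object         # stand-in for the real block type

proc importBlock(c: ChainRef; blk: EthBlock) {.async.} =
  discard                   # placeholder for the real, now async, import

proc importNext(c: ChainRef; blk: EthBlock) {.async.} =
  try:
    # `importBlock()` now returns a `Future`, so it must be awaited and the
    # caller has to cope with a `CancelledError` escaping the call.
    await c.importBlock(blk)
  except CancelledError:
    # Re-raise so that cancellation unwinds the syncer cleanly.
    raise
```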
…rors

why:
  PR #3204 introduced the concept of generally keeping the last peer
  even if syncing becomes nearly useless. This was an attempt to exploit
  the last peer even if it is labelled `slow`.

  This generality led to a kind of syncer loop when fetching from a peer
  that repeatedly causes a disconnect exception, unless the peer is
  fully discarded by the p2p lib.

  This has been changed: a `slow` sync peer is kept if it is the last one,
  but is zombified (i.e. discarded and kept from reconnecting for a while)
  otherwise.
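For illustration, a rough sketch of the revised policy; `SyncPeer`, `zombify` and `onPeerTrouble` are invented names, not the actual scheduler API:

```nim
type
  SyncPeer = ref object
    isSlow: bool            # peer has been labelled `slow`

proc zombify(peer: SyncPeer) =
  # Stand-in for discarding the peer and blocking reconnects for a while.
  discard

proc onPeerTrouble(peer: SyncPeer; nPeersLeft: int) =
  if nPeersLeft == 1 and peer.isSlow:
    discard                 # keep the last sync peer if it is merely `slow`
  else:
    peer.zombify()          # otherwise discard and block it for a while
```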
…ally

why:
  This feature was accidentally removed in PR #3221. It has been re-added
  together with an extensive explanatory comment.

  The problem was that under some circumstances, two peers could appear to
  deterministically (i.e. addressed by hash) fetch different header ranges
  while in reality they fetched the same range. This led the syncer to
  wrongly assume that everything was downloaded when it was not.

also:
  The fetch routine double checks that at least the block number of the
  first header is correct. So fetching by hash (i.e. deterministically)
  will verify that both the hash and the received block number match
  expectations.
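A small sketch of that extra check, with a simplified `Header` type and an invented helper name:

```nim
type
  Header = object
    number: uint64          # block number carried by the header

proc firstHeaderOk(headers: openArray[Header]; expected: uint64): bool =
  # When fetching by hash (i.e. deterministically), at least the first
  # received header's block number is verified against the expected value.
  headers.len > 0 and headers[0].number == expected
```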
@mjfh mjfh merged commit 565d868 into master May 8, 2025
5 checks passed
@mjfh mjfh deleted the Beacon-sync-maintenance-update-and-header-download-fix branch May 8, 2025 15:57
@@ -191,31 +186,42 @@ proc delInfo(db: KvtTxRef) =


proc putHeader(db: KvtTxRef; h: Header) =
-  ## Store rlp encoded header
+  ## Store the argument `header` indexed by block number, and the hash lookup
+  ## of the parent header.
Member

why the number of the parent header and not the current one?

Contributor Author

Seems to avoid computing a hash, cc: @jangko
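A loose sketch of the idea behind that answer, using plain in-memory tables rather than the actual `KvtTxRef` layout, and assuming the lookup maps the parent hash to the parent's block number: the parent hash is already a field of the header, so the lookup entry costs nothing extra, whereas indexing the header's own hash would first require RLP-encoding and keccak-hashing it.

```nim
import std/tables

type
  Hash32 = array[32, byte]
  Header = object
    number: uint64
    parentHash: Hash32

var
  byNumber: Table[uint64, Header]    # header payload keyed by block number
  numByHash: Table[Hash32, uint64]   # hash -> block number lookup

proc putHeaderSketch(h: Header) =
  byNumber[h.number] = h
  # `h.parentHash` comes for free as a header field; the header's own hash
  # would first have to be computed from its RLP encoding. Assuming
  # consecutive numbering, the parent sits at `h.number - 1`.
  numByHash[h.parentHash] = h.number - 1
```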

mjfh added a commit that referenced this pull request May 13, 2025
why:
  Previously (before PR #3204) the sync peer simply would have been
  zombified and discarded so that there were no sync peers left.

  Then PR #3204 introduced a concept of ignoring any error of the last
  peer via the `infectedByTVirus()` function. This opened a can of worms
  which was mitigated by PR #3269 by only keeping the last sync peer
  non-zombified if it was labelled `slow`.

  The last measure can lead to a heavy syncer slow down while queuing
  blocks if there is only a slow peer available. It will try to fill
  the queue first while it makes more sense to import blocks allowing
  the syncer to collect more sync peers.

  This patch registers the current peer as the last one labelled slow.
  It is up to other functions to exploit that fact.
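Sketched below with made-up names (`SyncCtx`, `registerSlowPeer`, `singleSlowPeerLeft`): the patch merely records the peer, and other parts of the syncer can then act on that state, e.g. by starting block import early.

```nim
type
  SyncPeer = ref object
  SyncCtx = ref object
    lastSlowPeer: SyncPeer  # most recent sync peer labelled `slow`, if any

proc registerSlowPeer(ctx: SyncCtx; peer: SyncPeer) =
  # Registration only; it is up to other functions to exploit this fact.
  ctx.lastSlowPeer = peer

proc singleSlowPeerLeft(ctx: SyncCtx; peer: SyncPeer; nPeers: int): bool =
  # Helper other code might use, e.g. to start importing blocks while just
  # one slow sync peer is left instead of waiting for the queue to fill.
  nPeers == 1 and not ctx.lastSlowPeer.isNil and ctx.lastSlowPeer == peer
```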
mjfh added a commit that referenced this pull request May 13, 2025
* Discard peer immediately after `PeerDisconnected` exception

why:
  Otherwise the usual drill would apply, i.e. the fetch would be retried
  repeatedly until the maximum number of failures is reached.

* Register last slow sync peer

why:
  Previously (before PR #3204) the sync peer simply would have been
  zombified and discarded so that there were no sync peers left.

  Then PR #3204 introduced a concept of ignoring any error of the last
  peer via the `infectedByTVirus()` function. This opened a can of worms
  which was mitigated by PR #3269 by only keeping the last sync peer
  non-zombified if it was labelled `slow`.

  The last measure can lead to a heavy syncer slow down while queuing
  blocks if there is only a slow peer available. It will try to fill
  the queue first while it makes more sense to import blocks allowing
  the syncer to collect more sync peers.

  This patch registers the current peer as the last one labelled slow.
  It is up to other functions to exploit that fact.

* Also start import while there is only one slow sync peer left.

why:
  See explanation on previous patch.

* Remove stray debugging statement