Verified Proxy: Pre-fetch state using eth_createAccessList and add caching to EVM calls #3373

Draft
wants to merge 136 commits into base: eth-call-vp

Conversation

bhartnett
Contributor

This is an example of how we can add caching to improve the performance of the RPC endpoints in the verified proxy that use the EVM.

I've also implemented an optimization: we call the downstream eth_createAccessList RPC endpoint to discover the accounts and storage keys the call is expected to touch, fetch all of that state using eth_getProof (the slots for each account are batched into a single request), and put the state into the caches before executing the EVM call.
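
A minimal sketch of that flow, in Python for illustration (the actual change is in Nim; the provider URL and transaction object here are placeholders):

```python
import requests

PROVIDER_URL = "http://localhost:8545"  # placeholder downstream provider

def rpc(method: str, params: list):
    resp = requests.post(PROVIDER_URL, json={
        "jsonrpc": "2.0", "id": 1, "method": method, "params": params})
    return resp.json()["result"]

def prefetch_call_state(tx: dict, block: str = "latest") -> list:
    # Ask the provider which accounts and storage slots the call is
    # expected to touch.
    access_list = rpc("eth_createAccessList", [tx, block])["accessList"]
    proofs = []
    for entry in access_list:
        # One eth_getProof per account, with all of its storage slots
        # batched into the same request.
        proofs.append(rpc("eth_getProof",
                          [entry["address"], entry["storageKeys"], block]))
    # The verified proxy would then verify each proof against the trusted
    # state root and insert the results into the account/storage caches
    # before running the EVM call locally.
    return proofs
```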

In my testing the pre-fetching provides a reasonable speed-up, but the verified eth_call is still slower than the unverified eth_call due to the additional network calls required, which is to be expected.

Adding support for connecting to the downstream RPC provider via WebSockets would likely improve performance further.

jangko and others added 30 commits May 10, 2025 18:54
If the `baseTxFrame` is not updated before `updateBase` yields to the
async event loop, other modules will access an expired `baseTxFrame`,
e.g. `getStatus` of eth/68 will crash the program.
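
A toy asyncio illustration of the hazard (Python, illustrative names only, not the Nim code): a reader interleaved with `updateBase` can observe a frame that expires mid-update.

```python
import asyncio

base_tx_frame = {"id": 1, "valid": True}

async def update_base():
    global base_tx_frame
    old = base_tx_frame
    await asyncio.sleep(0)        # yields mid-update: the bug window
    old["valid"] = False          # the old frame expires here
    base_tx_frame = {"id": 2, "valid": True}

async def get_status():
    frame = base_tx_frame         # may grab the soon-expired frame
    await asyncio.sleep(0.01)
    assert frame["valid"], "crash: read an expired baseTxFrame"

async def main():
    await asyncio.gather(get_status(), update_base())

asyncio.run(main())  # hits the assert; updating base_tx_frame before
                     # the first await closes the window
```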
* fix some Nim 2.2 warnings

* copyright year linting

* macOS Sonoma doesn't change the oldest supported x86 CPU type
* removes `_created` metrics from gauges (they should never have been
there)
* allow labelled metrics to be created from any thread
* Discard peer immediately after `PeerDisconnected` exception

why
  Otherwise the peer would follow the usual drill, which is to be
  repeatedly tried again until the maximum number of failures is
  reached.

* Register last slow sync peer

why:
  Previously (before PR #3204) the sync peer simply would have been
  zombified and discarded so that there were no sync peers left.

  Then PR #3204 introduced a concept of ignoring any error of the last
  peer via the `infectedByTVirus()` function. This opened a can of worms
  which was mitigated by PR #3269 by only keeping the last sync peer
  non-zombified if it was labelled `slow`.

  The last measure can lead to a heavy syncer slowdown while queuing
  blocks if there is only a slow peer available. It will try to fill
  the queue first, while it makes more sense to import blocks, allowing
  the syncer to collect more sync peers.

  This patch registers the current peer as the last one labelled slow.
  It is up to other functions to exploit that fact.

* Also start import while there is only one slow sync peer left.

why:
  See explanation on previous patch.

* Remove stray debugging statement
Add access from History network to historical summaries for the
verification of Capella and onwards block proofs.

Access is provided by adding the BeaconDbCache to the history
network, more specifically to the HeaderVerifier (previously called
Accumulators). This approach is taken, over providing callbacks,
as it is more in sync with how StateNetwork accesses the
HistoryNetwork. It might still be considered to move to
callbacks in the future though, as that could provide a more
"oracle" agnostic way of providing this data.

The BeaconDbCache is created because for Ephemeral headers
verification we will also need access to the Light client updates.
Aside from the Light client updates, the historical summaries
are also added to the cache in their decoded form for easy and
fast access on block verification.

Some changes are likely still required to avoid too many
copies of the summaries, TBI.
…sest connected portal client (#3278)

* Refactor state bridge to support sending each content to any of the connected portal clients sorted by distance from the content key.
When looking up a VertexID, the entry might not be present in the
database - this is currently not tracked since the functionality is
not commonly used. With path-based vertex id generation, however,
we'll be making guesses where empty lookups become "normal" - the same
would happen for incomplete databases as well.
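
A sketch of the idea in Python (toy types, not the actual Aristo API): make the empty lookup a typed, ordinary outcome rather than an untracked error.

```python
from typing import Optional

class VertexRef:
    def __init__(self, payload: bytes) -> None:
        self.payload = payload

vtx_table: dict[int, VertexRef] = {}

def get_vertex(vid: int) -> Optional[VertexRef]:
    # A miss returns None instead of raising; callers handle it explicitly.
    return vtx_table.get(vid)

vtx = get_vertex(42)
if vtx is None:
    # A "normal" empty lookup: a guessed path-based id, or an
    # incomplete database.
    pass
```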
…3283)

* Remove old wire protocol implementation

* eth68 status isolated

* eth69 preparation

* Fix typo

* Register protocol

* Add BlockRangeUpdate

* Use new receipt format for eth69

* Fix tests

* Update wire protocol setup

* Update syncer addObserver

* Update peer observer

* Handle blockRangeUpdate using peer state

* Add receipt69 roundtrip test

* Replace Receipt69 with StoredReceipt from nim-eth

* Bump nim-eth

* Bump nim-eth to master branch
* rm vestigial EIP-7873 support

* bump nim-eth and nim-web3
* Rename fluffy to portal/nimbus_portal_client

A bunch of renames to remove the fluffy naming and related changes

- fluffy dir -> portal dir
- fluffy binary -> nimbus_portal_client
- portal_bridge binary -> nimbus_portal_bridge
- + renamed related make targets
- Restructure of portal directory for the applications (client +
  bridge)
- Rename of default data dir for nimbus_portal_client and
nimbus_portal_bridge
- Remove most of fluffy naming in code / docs
- Move docker folder one level up
- Move grafana folder into metrics folder

Items that are of importance regarding backwards compatibility:
- Kept make targets for fluffy and portal_bridge
- The data dir is first checked to see if the legacy dir exists; if
not, the new dir is created
- The ENR file is currently still named fluffy_node.enr

* Move legacy files to new naming + create deprecation file

Also fix nimble tasks

* More fixes/changes

- Change lock file name to portal_node.lock
- Fix debug docker files
- Fix portal testnet script

* Mass replace for fluffy/Fluffy and portal_bridge in docs/comments

* Address feedback regarding usage of binary name
* hoodi chain config: fix shanghai time typo

* Validate built in chain config

* Override compiler side effect analysis
* eth/69: Disconnect peer when receive invalid blockRangeUpdate

* Add trace log when disconnecting peer

* Use debug instead of trace to log blockRangeUpdate
* Code cosmetics, docu/comment and logging updates, etc.

* Explicitly limit header queue length

why
  Beware of outliers (remember the law of the iterated logarithm.)

also
  No need to reorg the header queue anymore. This was a pre-PR #3125
  feature which was needed to curb the queue when it grew too large.

  This cannot happen anymore as there is always a deterministic fetch
  that can solve any immediate gap preventing the queue from
  serialising headers.

* Fix issue #3298

reason for crash
  The syncer will stop trying to download headers after failing on 30
  different sync peers.

  The state machine will advance to `cancelHeaders` causing all sync
  peers to stop as soon as they can without updating the bookkeeping
  for unprocessed headers which might leave the `books` in an open or
  non-finalised state.

  Unfortunately, when synchronising all simultaneously running sync
  peers, the *books* were checked for being sort of finalised already
  before cleaning up (aka finalising.)

* Remove `--debug-beacon-sync-blocks-queue-hwm` command line option

why
  Not needed anymore as the block queue will run on a smaller memory
  footprint, anyway.

* Allow downloading blocks while importing/executing simultaneously

why
  The last PRs merged seem to have made a change, presumably in the
  `FC` module running `async`. This allows for importing/executing
  blocks while fetching new ones at the same time without depleting
  sync peers.

  Previously, all sync peers were gone after a while when doing this.

* Move out blocks import section as a separate source module

* Reorg blocks download and import/execute

why
  Blocks download and import is now modelled after how it is done for
  the headers:
    + if a sync peer can import right at the top of the `FC` module,
      download a list of blocks and import right away
    + Otherwise, if a sync peer cannot directly import, then download
      and queue a list of blocks if there is space on the queue
  As a separate pseudo task, fetch a list of blocks from the queue if
  it can be imported right at the top of the `FC` module (see the
  sketch below)
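
A self-contained toy sketch of those two paths (Python, invented names):

```python
top = 100                      # height at the top of the (toy) FC module
staged: dict[int, list] = {}   # first block number -> list of blocks
MAX_STAGED = 8

def on_blocks_fetched(first: int, blocks: list) -> None:
    """Fast path: import right at the top; otherwise queue if there is room."""
    global top
    if first == top + 1:
        top += len(blocks)               # import right away
    elif len(staged) < MAX_STAGED:
        staged[first] = blocks           # park on the queue

def drain_queue() -> None:
    """Pseudo task: import staged batches once they reach the top."""
    global top
    while top + 1 in staged:
        top += len(staged.pop(top + 1))
```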
* Restructure portal bridge folders.
- Hive has been updated from fluffy to nimbus-portal
- Docker hub repo nimbus-portal-client has been created and latest
build has been added there
- Update grafana dashboard to the latest one used for our fleet
- Update that grafana dashboard to use Nimbus Portal naming
- Remove some left-over fluffy naming
…offers (#3303)

* Remove offer workers and replace using rate limiter for offers.

* Use asyncSpawn for triggerPoke.

* Add content queue workers to history and beacon networks.

* Test random gossip and neighborhood gossip.
* swap block cache for header store; refactor

* format

* review and fixes

* add tests for header store

* remove unused headers

* review and fixes

* fixes

* fix copyright info

* fix copyright year

* check order

* earliest finalized

* make cache len hidden
The current_sync_committee_gindex is fork dependent; this causes
bootstrap validation issues since Electra.
jangko and others added 26 commits June 30, 2025 14:43
Each column family in rocksdb requires its own set of SST files that
must be kept open, cached etc. Further, wal files are deleted only
once all column families referencing them have been flushed, meaning
that low-volume families like Adm can keep them around far longer than
makes sense.

Adm contains only two useful metadata entries and therefore it doesn't
really make sense to allocate an entire CF for it.

Consolidating Adm into Vtx also makes it easier to reason about the
internal consistency of the Vtx table - even though rocksdb ensures
atomic cross-cf writes via the wal, it requires using a special batch
write API that introduces its own overhead. With this change, we don't
need to rely on this particular rocksdb feature to maintain atomic
consistency within Vtx.

Databases using the old schema are supported but rollback is not (i.e.
the old metadata format/CF is read but not written).
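
A dictionary-based sketch of the consolidation (Python; the key layout is hypothetical, not the actual on-disk schema): the two Adm metadata entries live in the Vtx table under reserved keys, so one ordinary single-table batch covers both kinds of writes.

```python
ADM_PREFIX = b"\x00adm:"  # reserved prefix assumed to collide with no vertex key

def adm_key(name: bytes) -> bytes:
    return ADM_PREFIX + name

vtx_table: dict[bytes, bytes] = {}  # stand-in for the Vtx column family

def write_batch(vertices: dict[bytes, bytes],
                metadata: dict[bytes, bytes]) -> None:
    # Everything lands in one table, so atomicity of the batch no longer
    # depends on rocksdb's special cross-CF write API.
    batch = dict(vertices)
    batch.update({adm_key(k): v for k, v in metadata.items()})
    vtx_table.update(batch)
```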
* update nim-eth

* point to master

* fix
* feat: add admin_peers and admin ns (#3431)

* feat: add admin_peers and admin ns

* fix redundant boolean checks and import std sections

* move caps in the main block

* setup admin and quit combined into one call

* fix compile issues

* Add export marker

* Fix tests

* Restore invalid request exception in admin_addPeer

* Chicken and egg

* oops

* fix: string -> int for ports.discovery and listener (#3438)

* fix: string -> int for ports.discovery and listener

* use int not hex

* fix test

* Add export marker

* Add comments

---------

Co-authored-by: Barnabas Busa <[email protected]>
…ng (#3440)

* Add cli param to enable stateless provider.

* Create execution witness type and implement encoding/decoding.
* Transform FC module internals into DAG

* Optimize updateFinalized

* no changes to chain_private

* More tuning
* add blocks support

* add rpc handlers

* reviews

* format

* catch only error exceptions

* remove unused imports

* review

* add basics tests

* fix
Deferred GC seemed like a good idea to reduce the amount of work done
during block processing, but a side effect of this is that more memory
ends up being allocated in certain workloads which in turn causes an
overall slowdown, with a long test showing a net performance effect that
hovers around 0% and more memory usage.

In particular, the troublesome range around 2M sees a 10-15% slowdown
and an ugly memory usage spike.

Reverting for now - it might be worth revisiting in the future under
different memory allocation patterns, but as usual, it's better to not
do work at all (like in #3444) than to do work faster.

This reverts commit 3a00915.
Every time we persist, we collect all changes into a batch and write
that batch to a memtable which rocksdb lazily will write to disk using a
background thread.

The default implementation of the memtable in rocksdb is a skip list
which can handle concurrent writes while still allowing lookups. We're
not using concurrent inserts and the skip list comes with significant
overhead both when writing and when reading.

Here, we switch to a vector memtable which is faster to write but
terrible to read. To compensate, we then proceed to flush the memtable
eagerly to disk which is a blocking operation.

One would think that blocking the main thread would be bad, but it
turns out that creating the skip list, also a blocking operation, is
even slower, resulting in a net win.

Coupled with this change, we also make the "lower" levels bigger,
effectively reducing the average number of levels that must be looked
at to find recently written data. This could lead to some write
amplification, which is offset by making each file smaller and
therefore making compactions more targeted.

Taken together, this results in an overall import speed boost of about
3-4%, but above all, it reduces the main thread blocking time during
persist.

pre (for 8k blocks persisted around block 11M):
```
DBG 2025-07-03 15:58:14.053+02:00 Core DB persisted
kvtDur=8ms182us947ns mptDur=4s640ms879us492ns endDur=10s50ms862us669ns
stateRoot=none()
```

post:
```
DBG 2025-07-03 14:48:59.426+02:00 Core DB persisted
kvtDur=12ms476us833ns mptDur=4s273ms629us840ns endDur=3s331ms171us989ns
stateRoot=none()
```
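
A rough stand-alone analogy of the write-side tradeoff (Python, not rocksdb itself): keeping the structure ordered on every insert versus appending to a plain vector and paying one sort at flush time.

```python
import bisect, random, time

keys = [random.randbytes(8) for _ in range(100_000)]

t0 = time.perf_counter()
ordered: list[bytes] = []
for k in keys:
    bisect.insort(ordered, k)    # pay the ordering cost on every write
t1 = time.perf_counter()

vector: list[bytes] = []
for k in keys:
    vector.append(k)             # cheap writes...
vector.sort()                    # ...one ordering cost at "flush" time
t2 = time.perf_counter()

print(f"ordered-on-insert {t1-t0:.2f}s vs append-then-sort {t2-t1:.2f}s")
```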
When updates to the MPT happen, a new VertexRef is allocated every time - this
keeps the code simple but has the significant downside that updates cause
unnecessary allocations.

Instead of allocating a new `VertexRef` on every update, we can update the
existing one provided that it is not shared. We can prevent it from being shared
by copying it eagerly when it's added to the layer. A downside of this approach
is that we also have to make a copy when invalidating hash keys, which affects
branch and account nodes mainly.

The tradeoff seems well worth it though, especially for imports, which
clock a nice perf boost, like in this little test:

```
(21005462, 21008193]  14.46  15.50  2,479.35  2,656.98  9m26s  8m48s   7.16%   7.16%   -6.69%
(21013654, 21016385]  15.28  16.14  2,523.74  2,665.83  8m56s  8m27s   5.63%   5.63%   -5.33%
(21021846, 21024577]  15.52  17.66  2,539.25  2,889.61  8m47s  7m43s  13.80%  13.80%  -12.12%

blocks: 16384, baseline: 27m10s, contender: 24m59s
Time (total): -2m10s, -8.00%
```
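
A minimal Python sketch of the copy-on-first-touch scheme described above (illustrative types): copy a vertex once, eagerly, when the layer first takes it; afterwards the layer owns it and can mutate it in place.

```python
import copy

class VertexRef:
    def __init__(self, data: dict) -> None:
        self.data = data

class Layer:
    def __init__(self) -> None:
        self.owned: dict[int, VertexRef] = {}

    def for_update(self, vid: int, shared: VertexRef) -> VertexRef:
        if vid not in self.owned:
            # One eager copy on first touch guarantees it is never shared.
            self.owned[vid] = copy.deepcopy(shared)
        return self.owned[vid]   # safe to mutate in place from now on

layer = Layer()
v = layer.for_update(1, VertexRef({"key": b""}))
v.data["key"] = b"\x01"          # later updates allocate nothing new
```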
fixes eth2 pointing to branch commit instead of unstable
* small bugfixes and cleanups across the board
…runtime (#3448)

* Enable collection of witness keys in ledger at runtime via statelessProviderEnabled flag.
* Simplify FC node coloring

* Optimize updateFinalized
* Schedule orphan block processing to the async worker

* update processQueue

* existedBlock to existingBlock
* Schedule `updateBase` to asynchronous worker.

`updateBase` becomes synchronous and the scheduler will interleave
`updateBase` with `importBlock` and `forkChoice`.

The scheduler will move the base at fixed size `PersistBatchSize`.

* Remove persistBatchQueue and keep persistBatchSize

* fix tests

* queueUpdateBase tuning

* Fix updateBase scheduler

* Optimize updateBase and queueUpdateBase a bit
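
A hedged asyncio sketch of the scheduling idea (toy names and sizes, not the Nim scheduler): `updateBase` itself is synchronous, and a single worker interleaves it with the other queued operations, moving the base in fixed `PersistBatchSize` steps.

```python
import asyncio

PERSIST_BATCH_SIZE = 32
base, head = 0, 1024

def update_base() -> None:
    # Synchronous, fixed-size step of the base.
    global base
    base = min(base + PERSIST_BATCH_SIZE, head)

def queue_update_base(queue: asyncio.Queue) -> None:
    # Only schedule a step once a full batch is available.
    if head - base >= PERSIST_BATCH_SIZE:
        queue.put_nowait(update_base)

async def worker(queue: asyncio.Queue) -> None:
    # Interleaves updateBase with importBlock/forkChoice, one at a time.
    while True:
        task = await queue.get()
        task()
        queue.task_done()
```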