Conversation

@Mustang98
Contributor

Optimized compression of the MERKLE_UPDATE part of the block by making efficient use of the already-available shard state. Leads to a decrease of up to 50% in block size.
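
For context, a conceptual sketch of the idea (all types and names below are hypothetical illustrations, not the PR's actual code): the "old" side of a Merkle update is built from cells the receiver can already derive from its shard state, so those cells can be sent as hash references instead of full payloads.

```cpp
// Hypothetical illustration only -- not the PR's implementation.
#include <array>
#include <cstdint>
#include <set>
#include <vector>

using CellHash = std::array<std::uint8_t, 32>;

struct Cell {
  CellHash hash;
  std::vector<std::uint8_t> payload;  // serialized cell contents
};

// Cells the receiver already has in its state shrink to a tagged hash;
// everything else is shipped in full.
std::vector<std::uint8_t> encode_cell(const Cell& cell,
                                      const std::set<CellHash>& state_cells) {
  std::vector<std::uint8_t> out;
  if (state_cells.count(cell.hash)) {
    out.push_back(0x01);  // tag: reference into known state
    out.insert(out.end(), cell.hash.begin(), cell.hash.end());
  } else {
    out.push_back(0x00);  // tag: full cell follows
    out.insert(out.end(), cell.payload.begin(), cell.payload.end());
  }
  return out;
}
```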

@github-actions

@codex review

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.

…n if we already checked them before state extraction.
@github-actions

  • validator/full-node-serializer.cpp:196-222 and 299-317 – tonNode_blockBroadcastCompressedV2 and tonNode_dataFullCompressedV2 now carry the proof out-of-band (f.proof_) but the deserializers no longer enforce any size limit on it. Previously the max_decompressed_size cap applied to the combined proof+data inside the compressed blob; now an attacker can ship an arbitrarily large proof_ and we accept/move it without bounds, which can exhaust memory or blow up logging while still passing the size check on data_compressed_/block_compressed_.

  • validator/net/download-block-new.cpp:229-264 – when a peer sends dataFullCompressedV2 that requires state to decompress, we parse the unverified proof and immediately call wait_state_by_prev_blocks, which can trigger expensive state downloads/merges for attacker-chosen prev block IDs before any proof or signature validation. A malicious peer could spam crafted proofs to force repeated state fetches/merges and tie up resources.
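
A minimal sketch of both mitigations above, with hypothetical names standing in for the real deserializer/actor members (this is not the actual full-node-serializer.cpp or download-block-new.cpp code): bound the out-of-band proof before accepting it, and verify the proof before any state work is triggered.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed cap, playing the role max_decompressed_size used to play for
// the combined proof+data blob.
constexpr std::size_t kMaxProofSize = 1 << 20;

struct CompressedV2 {
  std::vector<std::uint8_t> proof;            // now carried out-of-band
  std::vector<std::uint8_t> data_compressed;  // still size-checked elsewhere
};

bool verify_proof(const std::vector<std::uint8_t>& proof);     // hypothetical
void request_states_for_decompression(const CompressedV2& f);  // hypothetical

bool accept_compressed_v2(const CompressedV2& f) {
  // 1. Bound the out-of-band proof before moving/accepting it.
  if (f.proof.size() > kMaxProofSize) {
    return false;
  }
  // 2. Verify the proof *before* any state fetch is issued, so a bogus
  //    message cannot trigger expensive downloads/merges.
  if (!verify_proof(f.proof)) {
    return false;
  }
  request_states_for_decompression(f);  // only now is state work allowed
  return true;
}
```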

@github-actions

Findings

  • extract_prev_blocks_from_proof only handles full BlockProof (block::gen::BlockProof::Record) and fails on proof links, but shard block broadcasts/dataFull still carry proof links. All stateful-decompression paths call this helper (e.g. FullNodeShardImpl::obtain_state_for_decompression, FullNodeCustomOverlay::obtain_state_for_decompression, FullNodeFastSyncOverlay::obtain_state_for_decompression, DownloadBlockNew::got_data). For any shard block using the new state-based compressed V2 format, extract_prev_blocks_from_proof will return “invalid block proof” and the broadcast/download is dropped/aborted, making the new compression unusable for shard blocks. Ref: validator/full-node-serializer.cpp:133-159, validator/full-node-shard.cpp:864-882, validator/full-node-custom-overlays.cpp:68-86, validator/full-node-fast-sync-overlays.cpp:62-82, validator/net/download-block-new.cpp:229-264.
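
A hedged sketch of the fallback this finding asks for (helper names and record shapes are stand-ins; the real code parses block::gen::BlockProof via tlb::unpack_cell):

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-ins for the real TLB records and cell type.
struct BlockIdExt { /* shard, seqno, hashes */ };
struct FullProof { std::vector<BlockIdExt> prev; };
struct ProofLink { std::vector<BlockIdExt> prev; };
struct Cell {};

std::optional<FullProof> try_unpack_full_proof(const Cell&);  // full BlockProof
std::optional<ProofLink> try_unpack_proof_link(const Cell&);  // proof link

// Accept either proof flavor so shard broadcasts (which carry proof
// links, not full proofs) are not rejected outright.
std::optional<std::vector<BlockIdExt>>
extract_prev_blocks_from_proof(const Cell& proof_root) {
  if (auto full = try_unpack_full_proof(proof_root)) {
    return full->prev;
  }
  if (auto link = try_unpack_proof_link(proof_root)) {
    return link->prev;
  }
  return std::nullopt;  // "invalid block proof"
}
```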

Notes

  • No other blocking issues spotted.

…lizer to distinguish validator_session and collator_node
@github-actions

Two critical issues spotted in the new MERKLE_UPDATE compression handling:

  • validator/full-node-serializer.cpp still serializes broadcasts and full blocks in the old V2 layout (proof+data packed together and LZ4’d into compressed_; see serialize_block_broadcast around lines 20-67 and serialize_block_full around lines 120-156), but the TL schema now declares separate proof and data_compressed/block_compressed fields (tl/generate/scheme/ton_api.tl:433-466), and the new deserializers expect only one BOC root plus an external proof (validator/full-node-serializer.cpp:196-215, 250-277). Outbound messages won’t match the new schema, and the code will reject peers still speaking the old V2 (two-root) format. This is a protocol/compatibility blocker; serializers (and generated TL files) need to be updated to the new field layout and single-root compression.

  • In the block download path, shard state is now fetched before any cryptographic validation: validator/net/download-block-new.cpp:225-260 extracts prev blocks from the untrusted proof in a tonNode_dataFullCompressedV2 and calls wait_state_by_prev_blocks prior to verifying the proof/signatures. A malicious peer can respond with bogus dataFullCompressedV2 messages to force expensive state loads/merges and only fail later during decompression/validation. The state fetch should be gated on proof/signature verification (or at least minimal validation) to avoid this DoS vector.

No tests were added for these paths. If you want, I can draft the serializer changes for the new TL layout and add a guard in the download path to validate before fetching state.
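
For illustration, the outbound path would have to move from "pack proof+data into one two-root BOC and compress the pair" to "proof out-of-band, compress a single data root". A rough sketch under those assumptions (helper names are hypothetical; the real work belongs in serialize_block_broadcast/serialize_block_full):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

Bytes boc_serialize_single_root(const Bytes& data);  // hypothetical one-root BOC
Bytes lz4_compress(const Bytes& raw);                // stands in for td::lz4_compress

struct BroadcastV2 {
  Bytes proof;            // new layout: proof travels uncompressed, out-of-band
  Bytes data_compressed;  // new layout: only the data root is LZ4'd
};

// Old V2 LZ4'd a two-root BOC holding {proof, data}; the new schema and
// deserializers expect this single-root shape instead.
BroadcastV2 serialize_v2(Bytes proof, const Bytes& data) {
  BroadcastV2 out;
  out.proof = std::move(proof);
  out.data_compressed = lz4_compress(boc_serialize_single_root(data));
  return out;
}
```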

…ip others because they aren't consistent with state.
@github-actions

  • validator/full-node-serializer.cpp:133-152 assumes the broadcast proof is a BlockProof, but shard broadcasts/data responses carry a proof link. tlb::unpack_cell will fail for shard proofs, so extract_prev_blocks_from_proof returns an error and the new state-preload path drops shard blockBroadcastCompressedV2/dataFullCompressedV2 messages that require state (fast-sync/custom/public overlays and DownloadBlockNew). Shard blocks compressed with the new MU algorithm would never decompress. Needs to handle proof links too.
  • validator/full-node-serializer.cpp:188-191 builds a BlockBroadcast with static_cast<UnixTime>(f.catchain_seqno_); BlockBroadcast.catchain_seqno is CatchainSeqno (uint32). This braced init performs a narrowing conversion from a double, so it should not compile (and would store the wrong type if it did). Construct with CatchainSeqno instead, as in the sketch below.
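
A tiny self-contained example of the C++ rule the second bullet relies on (the UnixTime typedef here is an assumption for illustration):

```cpp
#include <cstdint>

using CatchainSeqno = std::uint32_t;
using UnixTime = double;  // assumed wider type, for illustration only

struct BlockBroadcast {
  CatchainSeqno catchain_seqno;
};

BlockBroadcast make(std::uint32_t raw_seqno) {
  // Ill-formed: braced init rejects the narrowing double -> uint32_t step.
  // return BlockBroadcast{static_cast<UnixTime>(raw_seqno)};
  // Fix suggested above: cast to the field's own type.
  return BlockBroadcast{static_cast<CatchainSeqno>(raw_seqno)};
}
```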

@github-actions

  • validator/full-node-serializer.cpp:133 extract_prev_blocks_from_proof unpacks only BlockProof. Shard block broadcasts/dataFull carry a BlockProofLink (ValidateBroadcast uses create_proof_link for non-mc), so unpacking fails → wait_state_by_prev_blocks never runs and every V2 shard broadcast/data-full that requires state is dropped. The new Merkle-update compression therefore cannot be used for shard blocks. Handle proof links in this helper (or select the correct parser based on shard) before requesting state.

@github-actions

No issues found.

@github-actions

  • crypto/vm/boc-compression.cpp:727-758 – build_prunned_branch_from_state drops the highest set bit from the pruned-branch mask (base_mask instead of the full mask_value) before computing level_mask. As a result max_level/is_significant ignore the top level and no hash/depth is stored for it, so reconstructed pruned branches under MerkleUpdate will have missing hashes/depths and yield incorrect cells during decompression.
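
A self-contained illustration of the mask arithmetic at stake (the helper is hypothetical; it is not the actual build_prunned_branch_from_state code):

```cpp
#include <cassert>
#include <cstdint>

// Max level of a pruned branch = 1-based position of the highest set bit
// in its level mask (bit i set => level i+1 is significant).
static int max_level(std::uint32_t level_mask) {
  int level = 0;
  while (level_mask >> level) {
    ++level;
  }
  return level;
}

int main() {
  std::uint32_t mask_value = 0b101;                   // levels 1 and 3 significant
  std::uint32_t base_mask = mask_value & ~(1u << 2);  // top bit dropped -> 0b001

  assert(max_level(mask_value) == 3);  // hash/depth must be stored for level 3
  assert(max_level(base_mask) == 1);   // computed from base_mask, level 3 is lost
  return 0;
}
```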

@github-actions

  • validator/net/download-block-new.cpp: in got_data the new state-aware path requests shard state (wait_state_by_prev_blocks) immediately after parsing the TL object, before any proof/hash/signature validation. A malicious peer can now respond to a download request with a forged dataFullCompressedV2 that advertises the “state-required” algorithm and arbitrary prev IDs in the proof, forcing the node to fetch/merge those states (I/O/CPU) before the response is rejected. Consider validating the proof/block id (or otherwise rate-limiting) prior to issuing state fetches, or deferring state acquisition until after basic checks to avoid a DoS vector.

@github-actions

No issues found.

…ompression of the nested Merkle Updates and corresponding Prunned Branches
@github-actions

No issues found.

@github-actions

  • validator/net/download-block-new.cpp:236-262 fetches/merges shard states (wait_state_by_prev_blocks) based solely on an unverified dataFullCompressedV2 proof, before any proof/signature validation. A malicious peer can send crafted proofs to force expensive state downloads/merges as a DoS. Validate the proof (or otherwise authenticate the response) before requesting state, or defer state acquisition until after proof verification.

@github-actions

github-actions bot commented Jan 6, 2026

  • validator/net/download-block-new.cpp:229-264 – when a compressed V2 response claims it needs state, the code now extracts prev blocks from the unvalidated proof and immediately requests those states before checking that the block id matches the requested one or that the proof is genuine. A malicious peer can craft a fake compressed reply with arbitrary prev blocks and force the node to download/merge unrelated states (DoS). Validate the block/proof (or at least confirm block_id against the request and sanity-check the proof) before calling wait_state_by_prev_blocks.
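
A minimal sketch of the ordering this suggests (member and helper names are hypothetical, not the actual DownloadBlockNew internals):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the actor's types.
struct BlockIdExt {
  std::uint64_t value;
};
struct DataFullCompressedV2 {
  BlockIdExt block_id;
  std::vector<std::uint8_t> proof;
};

bool proof_looks_genuine(const std::vector<std::uint8_t>&);   // hypothetical
void wait_state_by_prev_blocks(const DataFullCompressedV2&);  // name from the
                                                              // report; signature
                                                              // is a stand-in

// Gate the expensive state fetch behind cheap checks.
bool got_data(const DataFullCompressedV2& f, const BlockIdExt& requested_id) {
  if (f.block_id.value != requested_id.value) {
    return false;  // reply doesn't match the request: drop it
  }
  if (!proof_looks_genuine(f.proof)) {
    return false;  // bogus proof never reaches the state machinery
  }
  wait_state_by_prev_blocks(f);  // only now trigger downloads/merges
  return true;
}
```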

@github-actions

github-actions bot commented Jan 6, 2026

  • validator/net/download-block-new.cpp:225-259 – For tonNode_dataFullCompressedV2 the code requests state (and even merges prev states) before any proof/file-hash validation of the incoming data. A malicious peer can reply with a forged message that forces expensive state downloads/merges and only fails later during validation. Please validate the proof (or otherwise vet the message) before calling wait_state_by_prev_blocks, or defer the state fetch until after the proof is checked.
