-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Wp v2 #20340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Wp v2 #20340
Conversation
arvidn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as I can tell, you don't need MerkleTree.py. I think you can revert those changes, and use compute_merkle_set_root() instead.
| slot = block.finished_sub_slots[0].replace( | ||
| challenge_chain=block.finished_sub_slots[0].challenge_chain.replace(subepoch_summary_hash=None) | ||
| ) | ||
| block_no_challenge_root = block.replace(finished_sub_slots=([slot])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the parentheses deliberate? It they won't create a 1-tuple, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it has the same outcome but you are correct this is a mistake
| "chia.consensus.blockchain_mmr.BlockchainMMRManager.add_block_to_mmr", | ||
| lambda *args, **kwargs: None, | ||
| ) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling this monkey patching will come back and bite us later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem here is that BlockchainMMRManager catches the error before blockHeader validation, i can change the expected error here, but this test is meant for header validation and blockchain and i dident want us to bypass, i can add the same test without the monkeypatch to check that the MMR manager catches this as well
chia/consensus/blockchain_mmr.py
Outdated
|
|
||
| _mmr: MerkleMountainRange | ||
|
|
||
| _checkpoints: dict[int, MerkleMountainRange] # height -> MMR snapshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we usually use uint32 for heights. If there's a special reason to be able to use -1 (or some other value that doesn't fit in uint32) it would probably warrant a comment
| _checkpoints: dict[int, MerkleMountainRange] # height -> MMR snapshot | |
| _checkpoints: dict[uint32, MerkleMountainRange] # height -> MMR snapshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine this being more efficient, in the future:
- We could increase the distance between checkpoints the farther away from the peak we are
- We could reuse nodes between the checkpoints, as they probably have a lot of overlap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am looking into this
https://commonware.xyz/blogs/mmr its purposely built for blockchain use cases i need to explore the api to see if it has exactly what we need but basically the whole deal is an MMR database with rolleback capabilities
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, my understanding of MMR is that you append new items. So rolling back should just be a matter of popping the most recent items. I don't think you need to store a history of previous MMRs.
In fact, you could even provide a view onto an MMR for an earlier peak, by just pretending the most recent items aren't there.
chia/consensus/blockchain_mmr.py
Outdated
| self._max_checkpoints = max_checkpoints | ||
|
|
||
| def copy(self) -> BlockchainMMRManager: | ||
| return BlockchainMMRManager(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it requires some work to make sure this really copies the content. I think it does, but wouldn't making this a dataclass support copying in a built-in manner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataclass doesnt provide deep copy, but i agree that using dataclass is more like what we usually do and i will change and use copy.deepcopy instead of the custom logic
| blocks_added_this_sub_slot += 1 | ||
| blocks[full_block.header_hash] = block_record | ||
| # Add block to MMR manager for proper MMR computation | ||
| self.mmr_manager.add_block_to_mmr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here, should you really do this for heights < constants.HARD_FORK2_HEIGHT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh theres a nuance here we include the mmr root only above the fork, question is since when do we add blocks to the mmr
currently this assumes from genesis, which is probably not what we want to do (although we wont be able to prove inclusion for prefork blocks), this relates to the what Bram said about doing the checkpoint, we can just opt to start adding from the fork ill do that and will make sure with Bram
this is only relevant to the mmr commitment since the other one is per sub epoch
in any case still need to add tests that start an already synced node to make sure that this works not only when syncing from 0 but also when restarting a node, this scenario is not tested yet
| hh = block.header_hash | ||
| self._block_records[hh] = block | ||
| self._height_to_hash[block.height] = hh | ||
| self.mmr_manager.add_block_to_mmr(block.header_hash, block.prev_hash, block.height) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing we need to be careful with here is the consistency with the blockchain database. Right now, the coin store, the full blocks, peak and height-to-hash all needs to stay in sync, even with a power outage or kernel panic. The MMR manager is similar to the coin store in that it's tied to a specific peak, with a specific chain.
If the database update fails, we may need to roll back this. Or perhaps this only happens after the DB update succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True we need to add checks when starting up that the peak in the mmr and the peak in the block store are the same
| prev_b_hash=header_block.prev_header_hash, | ||
| sp_index=header_block.reward_chain_block.signage_point_index, | ||
| first_in_sub_slot=len(header_block.finished_sub_slots) > 0, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Duplicate calculation of pre_sp_tx_height in validation
The function pre_sp_tx_block_height() is called twice with identical parameters within validate_finished_header_block() - once at lines 119-125 and again at lines 1076-1082. Since this function traverses block records to find the pre-signage-point transaction block height, calling it twice is wasteful. The variable pre_sp_tx_height from the first calculation at line 119 is already in scope and can be reused at line 1076.
Additional Locations (1)
chia/consensus/blockchain_mmr.py
Outdated
|
|
||
| _mmr: MerkleMountainRange | ||
|
|
||
| _checkpoints: dict[int, MerkleMountainRange] # height -> MMR snapshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, my understanding of MMR is that you append new items. So rolling back should just be a matter of popping the most recent items. I don't think you need to store a history of previous MMRs.
In fact, you could even provide a view onto an MMR for an earlier peak, by just pretending the most recent items aren't there.
chia/consensus/blockchain_mmr.py
Outdated
| self._last_header_hash = block_record.header_hash | ||
| self._last_height = uint32(height) | ||
| except Exception as e: | ||
| log.warning(f"Could not find block at height {height} during MMR rollback: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| log.warning(f"Could not find block at height {height} during MMR rollback: {e}") | |
| log.exception(f"Could not find block at height {height} during MMR rollback") |
Feel free to resolve this comment if you prefer warning().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should never happen so your right, we should log.exception
chia/consensus/blockchain_mmr.py
Outdated
| _mmr: MerkleMountainRange = field(default_factory=MerkleMountainRange) | ||
| _last_header_hash: bytes32 | None = None | ||
| _last_height: uint32 | None = None | ||
| _checkpoints: dict[uint32, MerkleMountainRange] = field(default_factory=dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't actually think you need the checkpoints. Since new items are appended to an MMR (and new parents are created on-demand), you can pretty easily just pop items as well, to restore to an earlier state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
chia/consensus/mmr.py
Outdated
| """ | ||
|
|
||
| nodes: list[bytes32] | ||
| size: uint32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this isn't just len(nodes), I think it warrants a comment.
I suppose this is the number of leaves in the MMR. That should still be computable from len(nodes), I would expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, ill add a comment, this is the number of blocks we aggregated
| Flat MMR implementation. | ||
| """ | ||
|
|
||
| nodes: list[bytes32] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be an important performance improvement to make this a bytearray instead. But these kinds of optimizations can wait.
| return peaks | ||
|
|
||
|
|
||
| def leaf_index_to_pos(leaf_index: int) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function should have a unit test with the interesting edge cases, and a human-readable illustration of the shape of the MMR and what the expected result is, for the various cases.
I imagine we would cover a single complete tree, two peaks and 3 peaks.
| assert mmr.get_root() == expected_root | ||
|
|
||
|
|
||
| def test_flat_mmr_peak_positions() -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would really help this test to have an ASCII-art illustration of the MMR.
Something like this one:
Height
3 14
/ \
/ \
/ \
/ \
2 6 13
/ / \
1 2 9 12 17
/ \ / / \ /
0 0 1 7 10 11 15 18
I also think it would be helpful to remind the reader of what the indices refer to. The input index is the leaf, not MMR index, right?. The return values are indices if the peaks, so not given numbers on the illustration above.
| @pytest.mark.skip("Height calculation not needed - we store heights directly") | ||
| def test_flat_mmr_height_calculation() -> None: | ||
| """Test height calculation from position""" | ||
| assert get_height(0) == 0 # Leaf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test also would benefit of an ASCII-art illustration of the tree
| def random_bytes32() -> bytes32: | ||
| return bytes32(os.urandom(32)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also use bytes32.random(), and we have a test fixture that provides a random.Random context, which give you deterministic values
| root2 = mmr2.get_root() | ||
| for i in range(10): | ||
| assert mmr.nodes[i] == mmr2.nodes[i] | ||
| assert root2 == root |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in addition to these round-trip tests for proofs, I think you should have a few concrete test cases that check that the resulting proof has exactly the nodes we expect it to have. Just to make sure the result is valid, not just that it validates in the validation function
| block = blocks.height_to_block_record(uint32(height)) | ||
| mmr.append(block.header_hash) | ||
|
|
||
| return mmr.get_root() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: MMR root rebuild requires uncached history
BlockchainMMRManager._build_mmr_to_block() rebuilds the MMR by iterating from height 0 and calling blocks.height_to_block_record(). In Blockchain, height_to_block_record() only works for in-memory cached BlockRecords, so after startup (only last cache window loaded) or after cache eviction this can raise or produce incorrect header_mmr_root, breaking post-fork block creation/validation.
Additional Locations (1)
| log.exception(f"Could not find block at height {target_height} during MMR rollback: {e}") | ||
|
|
||
| self._last_header_hash = target_block.header_hash | ||
| self._last_height = uint32(target_height) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: MMR rollback can crash on genesis
rollback_to_height() catches failures from blocks.height_to_block_record() but still dereferences target_block, which can be unassigned, and it doesn’t handle target_height < 0. Callers pass height - 1 (e.g. removing height 0 gives -1), which can trigger an exception and leave the MMR manager in a bad state.
Purpose:
Implement new Weight Proof Protocol with MMR-based proofs and optimized sub-epoch challenge segment tracking.
this will allow smaller more efficient proofs that are more secure
Current Behavior:
New Behavior:
V3 Weight Proofs: we add MMR roots to the RewardChainBlock aggregating the header hashes of the previous
blocks using the MMR roots we will be able to prove block inclusion by header hash, we will later use this in the new
protocol to sample individual blocks and not full challenge slots like the current protocol.
the MMR root is hashed into the RewardChain VDF so it can't be manipulated without affecting the challenge chain.
we also add merkle roots committing to the number of blocks in each challenge slot in the sub-spoch so we can know
what slot each block in the sub epoch belongs to without downloading more then the actual block
MMR Infrastructure: New MMRManagerProtocol for tracking block header commitments
Sub-Epoch Challenge Tree infrastructure: Track and store challenge chain segments data
moved MerkleTree from wallet.util to chia.util
Testing Notes:
Note
Introduce MMR-based header commitments and sub-epoch challenge merkle roots (post-HF2), wiring them through consensus, full node/timelord flows, weight proofs, and tests.
MMRManagerProtocol,BlockchainMMRManager, and flat MMR (consensus/mmr.py); integratemmr_managerintoBlockchain,AugmentedBlockchain,BlockCache, andWalletBlockchain(copy/rollback/add/update, newget_mmr_root_for_block/get_current_mmr_root).reward_chain_block.header_mmr_root(viaunfinished_block_to_full_block_with_mmr); validate MMR root invalidate_finished_header_blockafterHARD_FORK2_HEIGHT; addpost_hard_fork2()helper andskip_commitment_validationpath for weight proofs.consensus/challenge_tree.py; extendmake_sub_epoch_summaryto includechallenge_merkle_rootpost-HF2; propagate through validation and SES handling.NewUnfinishedBlockTimelord; timelord uses providedheader_mmr_rootwhen finishing blocks.post_hard_fork2to conditionally include SES challenge roots.challenge_merkle_rootinSubEpochData/reconstruction; skip commitment validation during WP verification.BlockToolsmaintains anBlockchainMMRManagerduring block generation; pass through to block building paths.test_mmr.py,test_block_commitments.py, sub-epoch summary tests); update fixtures for fork heights; adjust mocks toStubMMRManager; minor test updates for API changes.timelord_protocol.NewUnfinishedBlockTimelordwithheader_mmr_root; update test vectors/json accordingly.Written by Cursor Bugbot for commit 87ebec0. This will update automatically on new commits. Configure here.