Skip to content

core, triedb/pathdb: final integration (snapshot integration pt 5) #30661

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
May 16, 2025

Conversation

rjl493456442
Copy link
Member

@rjl493456442 rjl493456442 commented Oct 23, 2024

In this pull request, snapshot generation in the pathdb is ported from the legacy
state snapshot. Additionally, in path mode, the legacy state snapshot is now
handled by the pathdb-based snapshot implementation.

Note: the existing snapshot data will be re-generated regardless of it was fully constructed or not

@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from d249035 to 133900e Compare October 23, 2024 07:24
@holiman holiman changed the title Snapshot integration p5 core, triedb/pathdb: final integration (snapshot integration pt 5) Oct 23, 2024
@@ -362,6 +430,7 @@ func (db *Database) Enable(root common.Hash) error {
// reset the persistent state id back to zero.
batch := db.diskdb.NewBatch()
rawdb.DeleteTrieJournal(batch)
rawdb.DeleteSnapshotRoot(batch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain why we need to delete the snapshot root here, I don't really understand

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This database entry indicates that the snapshot data on disk is associated with the specified state root.

The function Enable(root) means the trie node data on disk associates with the given state root, any leftover snapshot data should be purged and regenerated.

Deleting the SnapshotRoot entry signifies that all leftover storage data is considered obsolete.

}
}
}
for addrHash, storages := range storageData {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether it makes a difference in the speed of flushing if we sort the keys before trying to insert them, iirc some databases handle ordered key insertions better than random keys. This way we could also break on the first key that exceeds the genMarker

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The random insertion here won't significantly impact the performance, as the data receiver is a in-memory batch. But I would love to measure the performance difference later.

// into two parts.
func splitMarker(marker []byte) ([]byte, []byte) {
var accMarker []byte
if len(marker) > 0 { // []byte{} is the start, use nil for that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check here that len(marker) >= HashLength?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the generation marker could only be:

  • []byte{}
  • []byte{common.AddressLength}
  • []byte{common.AddressLength + common.HashLength}

If the marker is corrupted, then the system could be very wrong and I don't mind panic here.

@MariusVanDerWijden
Copy link
Member

Took a first look now and it looks pretty good so far. I added a few questions/things I didn't understand while reviewing. The biggest changes (generator, generator_test, context) are modified copies from the snapshot package

@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from 985a84e to 9635f1d Compare January 6, 2025 07:49
@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch 2 times, most recently from 0245e41 to c556e77 Compare January 14, 2025 03:17
@fjl fjl added this to the 1.15.4 milestone Feb 25, 2025
@fjl fjl modified the milestones: 1.15.4, 1.15.5, 1.15.6 Mar 1, 2025
@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from cf0bbd6 to 87b17e7 Compare April 6, 2025 06:55
@MariusVanDerWijden MariusVanDerWijden added this to the 1.15.10 milestone Apr 23, 2025
@fjl fjl modified the milestones: 1.15.10, 1.15.11 Apr 25, 2025
@fjl fjl modified the milestones: 1.15.11, 1.15.12 May 5, 2025
@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from 87b17e7 to afecee4 Compare May 9, 2025 06:40
@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from 8feecf2 to b4d7413 Compare May 14, 2025 01:42
if dl.generator != nil {
dl.generator.stop()
progress = dl.generator.progressMarker()
log.Info("Terminated snapshot generation")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paused

Copy link
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

fatal error: concurrent map iteration and map write

goroutine 3503687081 [running]:
internal/runtime/maps.fatal({0x1c0e410?, 0xc0d9bbe000?})
        runtime/panic.go:1053 +0x18
internal/runtime/maps.(*Iter).Next(0xc003b47280?)
        internal/runtime/maps/table.go:683 +0x86
github.com/ethereum/go-ethereum/triedb/pathdb.(*stateSet).accountList.Keys[...].func1()
        maps/iter.go:27 +0x71
slices.AppendSeq[...](...)
        slices/iter.go:50
slices.Collect[...](...)
        slices/iter.go:58
slices.SortedFunc[...](0xc003b473d0, 0x1f96be0)
        slices/iter.go:72 +0xd0
github.com/ethereum/go-ethereum/triedb/pathdb.(*stateSet).accountList(0xc003d26120)
        github.com/ethereum/go-ethereum/triedb/pathdb/states.go:179 +0x13a
github.com/ethereum/go-ethereum/triedb/pathdb.newDiffAccountIterator({0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...)
Copy link
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rjl493456442 rjl493456442 merged commit 892a661 into ethereum:master May 16, 2025
3 of 4 checks passed
Dargon789 pushed a commit to Dargon789/go-ethereum that referenced this pull request May 27, 2025
…thereum#30661)

In this pull request, snapshot generation in pathdb has been ported from 
the legacy state snapshot implementation. Additionally, when running in 
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.

Note: Existing snapshot data will be re-generated, regardless of whether 
it was previously fully constructed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants