-
Notifications
You must be signed in to change notification settings - Fork 156
feat: Add firewood as a database #1029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
0506448
to
b9ac3fa
Compare
core/state/database.go
Outdated
case *firewood.AccountTrie: | ||
return t.Copy() | ||
case *firewood.StorageTrie: | ||
return nil // The storage trie just wraps the account trie, so we don't need to copy it separately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not we still call t.Copy (or t.AccountTrie.Copy) though? (bc AccountTrie is not an interface but struct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't - all storage tries just point to the same account trie. If we copy the account trie (all the real information) and then copy the storage trie (just the pointer), then how will they know what account trie to use? They really should be re-allocated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason there's a storage trie at all in the first place is this interface - because we don't need to explicitly hash the storage tries (firewood does that internally), we only need the 1-tier trie
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see but I sense there is an invariant that this depends on:
an AccountTrie for the related StorageTrie will always be copied. If that so we should comment it somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some comments (and an explicit method).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could still use the Copy
method and return nil, but I think this is more clear.
) | ||
|
||
var ( | ||
_ Database = (*firewoodAccessorDb)(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this check is needed. Since the first member of FirewoodAccessorDb is Database
already, isn't it 100% guaranteed to be true?
Is this just a guard to prevent someone from removing Database
from firewoodAccessorDb`?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty much. My understanding is that this is good practice in general.
// It checks the rawdb package for HashDB or PathDB schemes. | ||
// If it's neither of those, it checks to see if the string is set to FirewoodScheme. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments don't seem to match the code, which checks for FirewoodScheme first, and then just delegates to ParseStateScheme, which might have it's own restrictions, possibly obsoleting this comment.
Suggest renaming to CheckForMismatchedStorageScheme
or explaining why this is a bad name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a perfectly reasonable name, but I think it should attempt to match the eth-specific name, like our other overloaded methods from libevm in these custom*
packages. It does exactly the same thing as ParseStateScheme
, except also checks for Firewood
triedb/firewood/account_trie.go
Outdated
// If the account has been deleted, we should return nil | ||
if val, exists := a.dirtyKeys[string(accountKey)]; exists && len(val) == 0 { | ||
return nil, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we can do this once we have the accountKey on line 88
return nil, nil | ||
} | ||
// Decode and return the updated storage value | ||
_, decoded, _, err := rlp.Split(updateValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't check if rlp.Split returns a list and always assume it's a byte array. Not sure if this is okay or not, but it does seem strange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The byte array is provided. This is following the example in libevm's trie/secure_trie.go
a.dirtyKeys[string(combinedKey[:])] = data | ||
a.updateKeys = append(a.updateKeys, combinedKey[:]) | ||
a.updateValues = append(a.updateValues, data) | ||
a.hasChanges = true // Mark that there are changes to commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can't this be inferred from a.DirtyKeys.len()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a "small" optimization to avoid rehashing if we haven't made any changes. This is called 2-3 times every block without any additional changes, so this is just to avoid the call. As Cey noted, the dirty keys are never reset, and instead pile on top of each other. If we ever implement the AddToProposal
API (which creates a proposal and commits on top of it in memory, tracking to ensure we drop properly), then we could just use the length
// We already have this proposal, but should create a new context with the correct hash. | ||
// This solves the case of a unique block hash, but the same underlying proposal. | ||
if _, exists := existing.Parent.Hashes[parentHash]; exists { | ||
log.Debug("firewood: proposal already exists, updating hash", "root", root.Hex(), "parent", parentRoot.Hex(), "block", block, "hash", hash.Hex()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want a metric here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but it really shouldn't ever happen. This is just technically permitted, so it should be enforced
triedb/firewood/database.go
Outdated
db.proposalLock.Lock() | ||
defer db.proposalLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this too early? Can't we wait until we called ExtractTrieDBUpdatePayload?
Also when is this released? Can't we release it while firewood is processing the proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could move it a couple lines later.
However, we have to wait to release this until the end of the function. If we are proposing from a proposal during a commit, we have a race, so until we address that deeper in Firewood, we need to exclusively guarantee that only one of these operations is happening
Co-authored-by: Ceyhun Onur <[email protected]> Signed-off-by: Austin Larson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few nits and more cleanups + Initialized
seems always returning true
even for "uninitialized" dbs.
|
||
type Config struct { | ||
FilePath string // Path to the directory where the Firewood database will be stored | ||
FileName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use filepath
as full filepath rather than taking filename+filepath?
|
||
type Database struct { | ||
fwDisk *ffi.Database // The underlying Firewood database, used for storing proposals and revisions. | ||
ethdb ethdb.Database // The underlying disk database, used for storing genesis and the path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems this can be removed
config = Defaults | ||
} | ||
|
||
fwConfig, path, err := validateConfig(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this more looks like a getPath
func (db *Database) Initialized(root common.Hash) bool { | ||
// If the database is open, then the file is necessarily initialized. | ||
// TODO: This would be more informative to have a flag to check that we've state synced later. | ||
return db.fwDisk != nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we check fw.Root() != common.Hash
?, this looks like we actually always return Initialized=true
since fwDisk is never nil after New
but if we haven't commited anything fw.Root
still can be empty (and actually not initialized)
Why this should be merged
This adds much of the Firewood state database implementation, without adding it as a configurable option in coreth.
How this works
This reimplements everything with
core/state/database.go
by adding the Firewood backend for both reading and hashing from the tries (theTrie
implementation) and for writing changes (thetriedb.Backend
implementation).Note: starting this PR, Windows can no longer be built on coreth.
How this was tested
This is broken off from #959, which is currently bootstrapping to mainnet and Fuji. Additionally, there is a small amount of fuzz testing to ensure that hashing, reading, and committing works as expected.
Need to be documented?
No.
Need to update RELEASES.md?
No - this code is not yet available to consumers.