-
Notifications
You must be signed in to change notification settings - Fork 9
Final report #773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
dcoutts
wants to merge
24
commits into
main
Choose a base branch
from
dcoutts/final-report
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Final report #773
Changes from 4 commits
Commits
Show all changes
24 commits
Select commit
Hold shift + click to select a range
273d0e3
Add (hopefully) final draft of the final report
dcoutts c8060a3
Apply the same formatting to the integration notes as final report
dcoutts 09e573a
section in integration notes on security of hash based data structures
dcoutts a4b291e
Add the previous reports as references
dcoutts dcc18b8
Add references to the specification items with the targets
jeltsch 4db6e3b
Polish a bit
jeltsch 90fe4be
Revise the part on meeting the memory targets
jeltsch cea37c3
Elaborate on interactions with Mithril in integration notes
dcoutts a5f543b
Tweak final report title, and add subtitle
dcoutts f77131f
Minor edits to final report introduction
dcoutts 592a11e
Final report: add a changelog
jorisdral d45fd47
Revise the part on the upsert benchmarks
jeltsch 3a43516
Restore 80-columns layout for paragraphs with citations
jeltsch ed87915
Improve the formatting of the metadata source
jeltsch 1918726
Add @dcoutts to references as integration notes co-author
jeltsch ad91509
Restore spaces dropped by bibliography style
jeltsch 25e2c7d
Change `master` to `main` in GitHub URLs
jeltsch 7e96f24
Fix the URL of the API documentation
jeltsch d0b1533
Add (no-break) spaces before citation references
jeltsch 1201acc
Slightly improve the beginning of the introduction
jeltsch 9e93f86
Add integration notes section on possible file system incompatibility…
dcoutts 65a2c1c
Remove `locator` field from the bibliography
jeltsch 874d157
Revise the section on hashing and security
jeltsch cf64c3b
Revise the section on problems in connection with XFS
jeltsch File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<style xmlns="http://purl.org/net/xbiblio/csl" version="1.0" default-locale="en-US"> | ||
<!-- Generated with https://github.com/citation-style-language/utilities/tree/master/generate_dependent_styles/data/ieee --> | ||
<info> | ||
<title>IEEE Software</title> | ||
<id>http://www.zotero.org/styles/ieee-software</id> | ||
<link href="http://www.zotero.org/styles/ieee-software" rel="self"/> | ||
<link href="http://www.zotero.org/styles/ieee" rel="independent-parent"/> | ||
<link href="http://ieeexplore.ieee.org/servlet/opac?punumber=52" rel="documentation"/> | ||
<category citation-format="numeric"/> | ||
<category field="engineering"/> | ||
<category field="communications"/> | ||
<issn>0740-7459</issn> | ||
<updated>2014-05-15T02:20:32+00:00</updated> | ||
<rights license="http://creativecommons.org/licenses/by-sa/3.0/">This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License</rights> | ||
</info> | ||
</style> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -1,9 +1,25 @@ | ||||||||||||||||||
# Storing the Cardano ledger state on disk: integration notes for high-performance backend | ||||||||||||||||||
|
||||||||||||||||||
Authors: Joris Dral, Wolfgang Jeltsch | ||||||||||||||||||
Date: May 2025 | ||||||||||||||||||
|
||||||||||||||||||
## Sessions | ||||||||||||||||||
--- | ||||||||||||||||||
title: "Storing the Cardano ledger state on disk: | ||||||||||||||||||
integration notes for high-performance backend" | ||||||||||||||||||
author: | ||||||||||||||||||
- Duncan Coutts | ||||||||||||||||||
- Joris Dral | ||||||||||||||||||
- Wolfgang Jeltsch | ||||||||||||||||||
date: May 2025 | ||||||||||||||||||
|
||||||||||||||||||
toc: true | ||||||||||||||||||
numbersections: true | ||||||||||||||||||
classoption: | ||||||||||||||||||
- 11pt | ||||||||||||||||||
- a4paper | ||||||||||||||||||
geometry: | ||||||||||||||||||
- margin=2.5cm | ||||||||||||||||||
header-includes: | ||||||||||||||||||
- \usepackage{microtype} | ||||||||||||||||||
- \usepackage{mathpazo} | ||||||||||||||||||
--- | ||||||||||||||||||
|
||||||||||||||||||
# Sessions | ||||||||||||||||||
|
||||||||||||||||||
Creating new empty tables or opening tables from snapshots requires a `Session`. | ||||||||||||||||||
The session can be created using `openSession`, which has to be done in the | ||||||||||||||||||
|
@@ -15,7 +31,7 @@ Closing the session will automatically close all tables, but this is only | |||||||||||||||||
intended to be a backup functionality: ideally the user closes all tables | ||||||||||||||||||
manually. | ||||||||||||||||||
|
||||||||||||||||||
## The compact index | ||||||||||||||||||
# The compact index | ||||||||||||||||||
|
||||||||||||||||||
The compact index is a memory-efficient data structure that maintains serialised | ||||||||||||||||||
keys. Rather than storing full keys, it only stores the first 64 bits of each | ||||||||||||||||||
|
@@ -60,7 +76,7 @@ keys is as good as any other total ordering. However, the consensus layer will | |||||||||||||||||
face the situation where a range lookup or a cursor read returns key–value pairs | ||||||||||||||||||
slightly out of order. Currently, we do not expect this to cause problems. | ||||||||||||||||||
|
||||||||||||||||||
## Snapshots | ||||||||||||||||||
# Snapshots | ||||||||||||||||||
|
||||||||||||||||||
Snapshots currently require support for hard links. This means that on Windows | ||||||||||||||||||
the library only works when using NTFS. Support for other file systems could be | ||||||||||||||||||
|
@@ -84,7 +100,7 @@ a cheaper non-SSD drive. This feature was unfortunately not anticipated in the | |||||||||||||||||
project specification and so is not currently included. As discussed above, it | ||||||||||||||||||
could be added with some additional work. | ||||||||||||||||||
|
||||||||||||||||||
## Value resolving | ||||||||||||||||||
# Value resolving | ||||||||||||||||||
|
||||||||||||||||||
When instantiating the `ResolveValue` class, it is usually advisable to | ||||||||||||||||||
implement `resolveValue` such that it works directly on the serialised values. | ||||||||||||||||||
|
@@ -94,7 +110,7 @@ function is intended to work like `(+)`, then `resolveValue` could add the raw | |||||||||||||||||
bytes of the serialised values and would likely achieve better performance this | ||||||||||||||||||
way. | ||||||||||||||||||
|
||||||||||||||||||
## `io-classes` incompatibility | ||||||||||||||||||
# `io-classes` incompatibility | ||||||||||||||||||
|
||||||||||||||||||
At the time of writing, various packages in the `cardano-node` stack depend on | ||||||||||||||||||
`io-classes-1.5` and the 1.5-versions of its daughter packages, like | ||||||||||||||||||
|
@@ -124,3 +140,50 @@ It is known to us that the `ouroboros-consensus` stack has not been updated to | |||||||||||||||||
https://github.com/IntersectMBO/ouroboros-network/pull/4951. We would advise to | ||||||||||||||||||
fix this Nix-related bug rather than downgrading `lsm-tree`’s dependency on | ||||||||||||||||||
`io-classes` to version 1.5. | ||||||||||||||||||
|
||||||||||||||||||
# Security of hash based data structures | ||||||||||||||||||
|
||||||||||||||||||
Data structures based on hashing have to be considered carefully when they may | ||||||||||||||||||
be used with untrusted data. If the attacker can control the keys in a hash | ||||||||||||||||||
table for example, they may be able to arrange for all their keys to have hash | ||||||||||||||||||
collisions which may cause unexpected performance problems. This is why the | ||||||||||||||||||
Haskell Cardano node implementation does not use hash tables, and uses | ||||||||||||||||||
ordering-based containers instead (such as `Data.Map`). | ||||||||||||||||||
|
||||||||||||||||||
The Bloom filters in an LSM tree are hash based data structures. For performance | ||||||||||||||||||
they do not use cryptographic hashes. So in principle it would be possibile for | ||||||||||||||||||
an attacker to arrange that all their keys hash to a common set of bits. This | ||||||||||||||||||
would be a potential problem for the UTxO and other stake related tables in | ||||||||||||||||||
Cardano, since it is the users that get to pick (with old modest grinding | ||||||||||||||||||
difficulty) their UTxO keys (TxIn) and stake keys (verification key hashes). It | ||||||||||||||||||
would be even more serious if an attacker can grind their set of malicious keys | ||||||||||||||||||
locally, in the knowledge that the same set of keys will hash the same way on | ||||||||||||||||||
all other Cardano nodes. | ||||||||||||||||||
|
||||||||||||||||||
This issue was not considered in the original project specification, but we | ||||||||||||||||||
have considered it and included a mitigation. The mitigation is that on the | ||||||||||||||||||
initial creation of a lsm-tree session, a random salt is conjured (from | ||||||||||||||||||
`/dev/random`) and stored persistenly as part of the session. This salt is then | ||||||||||||||||||
used as part of the Bloom filter hashing for all runs in all tables in the | ||||||||||||||||||
session. | ||||||||||||||||||
|
||||||||||||||||||
The result is that while it is in principle still possible to produce hash | ||||||||||||||||||
collisions in the Bloom filter, this now depends on knowing the salt. And now | ||||||||||||||||||
every node has a different salt. So a system wide attack becomes impossible; | ||||||||||||||||||
instead it is only plausible to target individual nodes. Discovering a node's | ||||||||||||||||||
salt would also be impractically difficult. In principle there is a timing | ||||||||||||||||||
side channel, in that collisions will cause more I/O and thus take longer. | ||||||||||||||||||
An attacker would need to get upstream of a victim node, supply a valid block | ||||||||||||||||||
and measure the timing of receiving the block downstream. There is however a | ||||||||||||||||||
large amount of noise. | ||||||||||||||||||
|
||||||||||||||||||
Overall, our judgement is that this mitigation is practically sufficient, but | ||||||||||||||||||
it merits a securit review from others who may make a different judgement. It | ||||||||||||||||||
is also worth noting that this issue may occur in other LSM-trees used in other | ||||||||||||||||||
Cardano and non-Cardano implementations. In particular, RocksDB does not appear | ||||||||||||||||||
to use a salt at all. | ||||||||||||||||||
|
||||||||||||||||||
Note that a per-run or per-table hash salt would incur non-trivial costs, | ||||||||||||||||||
because it would reduce the sharing available in bulk Bloom filter lookups | ||||||||||||||||||
(looking up N keys in M filters). The Bloom filter lookup is a performance | ||||||||||||||||||
sensitive part of the overall database implementation. | ||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to make it extra clear
Suggested change
|
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"securit" -> "security"