
Added crypto #5

Merged
merged 2 commits into from
May 12, 2020
125 changes: 124 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default.

10 changes: 7 additions & 3 deletions Cargo.toml
@@ -1,13 +1,17 @@
[package]
name = "stable-hash"
version = "0.1.0"
version = "0.2.0"
authors = ["Zac Burns <[email protected]>"]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
blake3 = "0.3.3"
num-traits = "0.2.11"
leb128 = "0.2.4"
num-bigint = "0.2.6"
lazy_static = "1.4.0"

[dev-dependencies]
twox-hash = "1.5.0"
twox-hash = "1.5.0"
rustc-hex = "2.1.0"
48 changes: 48 additions & 0 deletions src/crypto/blake3_sequence.rs
@@ -0,0 +1,48 @@
use crate::prelude::*;
use blake3::{Hasher, OutputReader};
use leb128::write::unsigned as write_varint;
use std::convert::TryInto as _;
use std::num::NonZeroUsize;

#[derive(Clone)]
pub struct Blake3SeqNo {
    hasher: Hasher,
    // This has to be NonZero in order to be injective, since the payload marker writes 0
    // See also 91e48829-7bea-4426-971a-f092856269a5
    child: NonZeroUsize,
}

impl SequenceNumber for Blake3SeqNo {
    fn root() -> Self {
        Self {
            hasher: Hasher::new(),
            child: NonZeroUsize::new(1).unwrap(),
        }
    }
    fn next_child(&mut self) -> Self {
        let child = self.child;
        let mut hasher = self.hasher.clone();
        // Better to panic than overflow.
        self.child = NonZeroUsize::new(child.get() + 1).unwrap();
        // Include the child node
        write_varint(&mut hasher, child.get().try_into().unwrap()).unwrap();
        Self {
            hasher,
            child: NonZeroUsize::new(1).unwrap(),
        }
    }
    #[inline]
    fn skip(&mut self, count: usize) {
        self.child = NonZeroUsize::new(self.child.get() + count).unwrap();
    }
}

impl Blake3SeqNo {
    pub(crate) fn finish(self, payload: &[u8]) -> OutputReader {
        let Self { mut hasher, .. } = self;
        // See also 91e48829-7bea-4426-971a-f092856269a5
        hasher.update(&[0]);
        hasher.update(payload);
        hasher.finalize_xof()
    }
}
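The NonZero comment in blake3_sequence.rs can be illustrated with a small standalone sketch. It assumes, from reading the code above, that the hasher input for one field is the LEB128-encoded child indices along the path, then a single 0 byte, then the payload. `write_varint` below is a hand-rolled stand-in for `leb128::write::unsigned`, and `cell_preimage` is a hypothetical helper that is not part of the crate.

```rust
/// Unsigned LEB128: 7 bits per byte, high bit set on every byte except the last.
/// (Hand-rolled stand-in for the leb128 crate, so the sketch needs only std.)
fn write_varint(out: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Hypothetical helper: the bytes that would be fed to the hasher for one
/// field, given the child indices on the way down and the field's payload.
fn cell_preimage(path: &[u64], payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for &child in path {
        // Mirrors the NonZeroUsize invariant: child indices start at 1.
        assert!(child != 0, "child indices must be nonzero");
        write_varint(&mut out, child);
    }
    // Payload marker. A nonzero varint never contains a 0x00 byte, so the
    // first 0x00 in the stream always marks where the payload begins.
    out.push(0);
    out.extend_from_slice(payload);
    out
}

fn main() {
    // Without the marker both inputs would be the bytes [1, 2].
    let nested = cell_preimage(&[1, 2], &[]);
    let flat = cell_preimage(&[1], &[2]);
    assert_ne!(nested, flat);
    println!("{:?} vs {:?}", nested, flat);
}
```

If a child index of 0 were allowed, a path byte could be confused with the payload marker, which is the collision the NonZero type rules out.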
5 changes: 5 additions & 0 deletions src/crypto/mod.rs
@@ -0,0 +1,5 @@
mod blake3_sequence;
mod set_hasher;

pub use blake3_sequence::Blake3SeqNo;
pub use set_hasher::SetHasher;
100 changes: 100 additions & 0 deletions src/crypto/set_hasher.rs
@@ -0,0 +1,100 @@
use super::blake3_sequence::Blake3SeqNo;
use crate::prelude::*;
use crate::stable_hash::UnorderedAggregator;
use blake3::Hasher;
use lazy_static::lazy_static;
use num_bigint::BigUint;
use num_traits::identities::One;
use std::default::Default;

lazy_static! {
static ref P: BigUint = "50763434429823703141085322590076158163032399096130816327134180611270739679038131809123861970975131471260684737408234060876742190838745219274061025048845231234136148410311444604554192918702297959809128216170781389312847013812749872750274650041183009144583521632294518996531883338553737214586176414455965584933129379474747808392433032576309945590584603359054260866543918929486383805924215982747035136255123252119828736134723149397165643360162699752374292974151421555939481822911026769138419707577501643119472226283015793622652706604535623136902831581637275314074553942039263472515423713366344495524733341031029964603383".parse().unwrap();
}

/// Based on https://crypto.stackexchange.com/a/54546
///
/// The idea here is to use the SequenceNumber to unambiguously identify each
/// field as if it were in its own database cell, and to use an online,
/// order-independent aggregator of the cells to produce a final result.
///
/// Within this framework a huge struct can be hashed incrementally or even in
/// parallel, as long as sequence numbers are produced deterministically to
/// identify parts within the struct. Conveniently, the SequenceNumber::skip
/// method can be used to jump to parts of a vec or struct efficiently.
pub struct SetHasher {
    // TODO: (Performance). We want a 2056 + 2048 = 4104 bit int.

Review comment:

Are you saying that you plan to implement a crate or traits that can easily provide 4104-bit ints, and that once this is done the set hash implementation should be more performant?

Reply from the contributor (PR author):

Yes. The current choice, num_bigint::BigUint, requires multiple variable-length heap allocations for every mixin. A large int could instead be allocated on the stack for a quick and easy performance gain.
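The width arithmetic behind the reply above can be sketched with fixed-size arrays. This is only an illustration under stated assumptions: the limb counts, the constant names, and `mul_wide` are hypothetical and do not come from the PR or from any existing crate, and the reduction modulo P is left out. A value below P fits in 33 64-bit limbs (2112 bits, covering the 2056 bits in the TODO), the hash output fits in 32 limbs (2048 bits), and their product fits in 65 limbs (4160 bits, covering the 4104 bits in the TODO) with no heap allocation.

```rust
// Hypothetical limb counts, chosen to cover the widths named in the TODO.
const A_LIMBS: usize = 33; // 2112 bits >= 2056 bits (running value, below P)
const B_LIMBS: usize = 32; // 2048 bits (hash output for one cell)
const OUT_LIMBS: usize = A_LIMBS + B_LIMBS; // 4160 bits >= 4104 bits

/// Schoolbook multiplication of little-endian limbs, entirely on the stack.
/// The modular reduction that a real mixin needs is not shown here.
fn mul_wide(a: &[u64; A_LIMBS], b: &[u64; B_LIMBS]) -> [u64; OUT_LIMBS] {
    let mut out = [0u64; OUT_LIMBS];
    for (i, &ai) in a.iter().enumerate() {
        let mut carry: u128 = 0;
        for (j, &bj) in b.iter().enumerate() {
            // Cannot overflow u128: (2^64-1)^2 + 2*(2^64-1) = 2^128 - 1.
            let t = ai as u128 * bj as u128 + out[i + j] as u128 + carry;
            out[i + j] = t as u64;
            carry = t >> 64;
        }
        // This limb has not been written by any earlier row.
        out[i + B_LIMBS] = carry as u64;
    }
    out
}

fn main() {
    let mut a = [0u64; A_LIMBS];
    let mut b = [0u64; B_LIMBS];
    a[0] = u64::MAX;
    b[0] = u64::MAX;
    let out = mul_wide(&a, &b);
    // (2^64 - 1)^2 = 2^128 - 2^65 + 1
    assert_eq!(out[0], 1);
    assert_eq!(out[1], u64::MAX - 1);
    println!("low limbs: {} {}", out[0], out[1]);
}
```

Because the output width is known at compile time, no intermediate product can overflow the array, which matches the "enough to handle any sequence of mixin operations" remark in the code comment.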

    // That's enough to handle any sequence of mixin operations without overflow.
    // https://github.com/paritytech/parity-common/issues/388
    // Not a bad idea to start here so that when we convert we know that the transformation is ok.
    value: BigUint,
}

impl Default for SetHasher {
    fn default() -> Self {
        Self {
            value: BigUint::one(),
        }
    }
}

impl SetHasher {
    #[inline]
    pub fn new() -> Self {
        Default::default()
    }
    #[inline]
    fn mixin(&mut self, digits: &BigUint) {
        self.value = (&self.value * digits) % &*P;
    }
    pub fn to_bytes(&self) -> Vec<u8> {
        self.value.to_bytes_le()
    }
    /// Panics if the bytes are not in a valid format.
    /// The only valid values are values returned from to_bytes()
    pub fn from_bytes(bytes: &[u8]) -> Self {
        assert!(bytes.len() <= 257);
        let value = BigUint::from_bytes_le(bytes);
        Self { value }
    }
}

/// The SetHasher is already updated in an unordered fashion, so no special second struct
/// is needed. It starts at 1 and is mixed into the parent when finished.
impl UnorderedAggregator<Blake3SeqNo> for SetHasher {
    #[inline]
    fn write(&mut self, value: impl StableHash, sequence_number: Blake3SeqNo) {
        value.stable_hash(sequence_number, self)
    }
}

impl StableHasher for SetHasher {
    type Out = [u8; 32];
    type Seq = Blake3SeqNo;
    type Unordered = Self;
    fn write(&mut self, sequence_number: Self::Seq, bytes: &[u8]) {
        // Write the field into a database cell
        let mut output = sequence_number.finish(bytes);
        // Extend to the length necessary. This is a 2048 bit value, 1 bit
        // less than the prime the hash wraps around.
        let mut digits = [0u8; 256];
        output.fill(&mut digits);
        let digits = BigUint::from_bytes_le(&digits);
        // Add the value to the database
        self.mixin(&digits)
    }
    #[inline]
    fn start_unordered(&mut self) -> Self::Unordered {
        Self::new()
    }
    #[inline]
    fn finish_unordered(&mut self, unordered: Self::Unordered, _sequence_number: Self::Seq) {
        self.mixin(&unordered.value)
    }
    fn finish(&self) -> Self::Out {
        // Re-mix the state with a Hasher.
        let mut hasher = Hasher::new();
        let le = self.value.to_bytes_le();
        hasher.update(&le);
        hasher.finalize().into()
    }
}
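The order independence that SetHasher relies on comes from multiplication modulo a prime being commutative and associative. The sketch below shows this with a toy model that needs only std: `ToySetHasher`, `TOY_P`, and `aggregate` are hypothetical names, the 61-bit Mersenne prime stands in for the 2049-bit P, and plain u64 values stand in for the 2048-bit BLAKE3 outputs. It is far too small to be secure and only illustrates the algebra.

```rust
// Toy modulus (the Mersenne prime 2^61 - 1), a stand-in for the 2049-bit P.
const TOY_P: u64 = (1 << 61) - 1;

struct ToySetHasher {
    value: u64,
}

impl ToySetHasher {
    // Start at the multiplicative identity, like SetHasher::default().
    fn new() -> Self {
        Self { value: 1 }
    }

    // Multiply one cell into the running product modulo the prime.
    fn mixin(&mut self, digits: u64) {
        let product = self.value as u128 * digits as u128;
        self.value = (product % TOY_P as u128) as u64;
    }
}

// Hypothetical helper: aggregate cell values in the order given.
fn aggregate(cells: &[u64]) -> u64 {
    let mut hasher = ToySetHasher::new();
    for &cell in cells {
        hasher.mixin(cell);
    }
    hasher.value
}

fn main() {
    let cells: [u64; 3] = [0x9e37_79b9_7f4a_7c15, 0xdead_beef_cafe_f00d, 42];
    let reordered: [u64; 3] = [cells[2], cells[0], cells[1]];
    // Any order of mixin calls gives the same state.
    assert_eq!(aggregate(&cells), aggregate(&reordered));

    // A nested aggregator starts at 1 and is mixed into its parent when
    // finished, mirroring start_unordered / finish_unordered.
    let mut parent = ToySetHasher::new();
    parent.mixin(cells[0]);
    let mut child = ToySetHasher::new();
    child.mixin(cells[1]);
    child.mixin(cells[2]);
    parent.mixin(child.value);
    assert_eq!(parent.value, aggregate(&cells));
    println!("aggregate = {}", parent.value);
}
```

The same property is what allows cells to be hashed incrementally or in parallel, provided each cell's sequence number is produced deterministically.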
2 changes: 1 addition & 1 deletion src/impls/bool.rs
@@ -1,7 +1,7 @@
use crate::prelude::*;

impl StableHash for bool {
-    fn stable_hash(&self, sequence_number: impl SequenceNumber, state: &mut impl StableHasher) {
+    fn stable_hash<H: StableHasher>(&self, sequence_number: H::Seq, state: &mut H) {
        if *self {
            state.write(sequence_number, &[]);
        }
4 changes: 2 additions & 2 deletions src/impls/hash_map.rs
@@ -2,7 +2,7 @@ use crate::prelude::*;
use std::collections::HashMap;

impl<K: StableHash, V: StableHash, S> StableHash for HashMap<K, V, S> {
-    fn stable_hash(&self, sequence_number: impl SequenceNumber, state: &mut impl StableHasher) {
-        super::unordered_stable_hash(self.iter(), sequence_number, state)
+    fn stable_hash<H: StableHasher>(&self, sequence_number: H::Seq, state: &mut H) {
+        super::unordered_unique_stable_hash(self.iter(), sequence_number, state)
    }
}