feat: Optimise HDiffBuffer by using Arc instead of Vec #7675

Open
wants to merge 3 commits into
base: unstable
from

Conversation

PoulavBhowmick03
Contributor

@PoulavBhowmick03 PoulavBhowmick03 commented Jun 30, 2025

Issue Addressed

Fixes #6579

Proposed Changes

Modified HDiffBuffer to use Vec<Arc<Validator>> instead of Vec<Validator>, and made the related changes this requires.

@PoulavBhowmick03 PoulavBhowmick03 changed the title feat: Optimise HDiffBuffer by using Arc instead of Vec feat: Optimise HDiffBuffer by using Arc instead of Vec Jun 30, 2025
Comment on lines 287 to 289
let mut validators_vec = source.validators.to_vec();
self.validators_diff().apply(&mut validators_vec, config)?;
source.validators = Arc::from(validators_vec);
Member

I don't see how using Arc<[Validator]> helps when we do this conversion to and from Vec<Validator> whenever we apply a diff.

Are you trying to save memory or CPU cycles?

Contributor Author

You're right. I was trying to save memory, as the issue mentioned, but converting to Vec<Validator> on every diff defeats that purpose.

Member

Oh yeah I forgot about that issue, I think the structure I was imagining was Vec<Arc<Validator>>

Contributor Author

Got it. It has to be a collection of individual Arcs. I made a mistake by wrapping all the validators in a single Arc. I will change that.
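
The difference between the two layouts discussed in this thread can be sketched as follows. This is a minimal illustration, not the Lighthouse code: the one-field `Validator` struct and the helper functions `apply_diff_single_arc`, `apply_diff_per_element` and `shared_count` are hypothetical stand-ins. It shows that going through `to_vec()` on an `Arc<[Validator]>` copies every element, whereas a `Vec<Arc<Validator>>` only copies pointers and clones the one element that is actually modified.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for the real validator type.
#[derive(Clone, Debug, PartialEq)]
struct Validator {
    effective_balance: u64,
}

/// One `Arc` over the whole slice: mutating requires a `Vec`, so `to_vec`
/// deep-copies every validator before the change is applied.
fn apply_diff_single_arc(
    source: &Arc<[Validator]>,
    index: usize,
    balance: u64,
) -> Arc<[Validator]> {
    let mut validators = source.to_vec(); // clones every `Validator`
    validators[index].effective_balance = balance;
    Arc::from(validators)
}

/// One `Arc` per validator: `to_vec` only bumps reference counts, and
/// `Arc::make_mut` clones just the element that changes.
fn apply_diff_per_element(
    source: &[Arc<Validator>],
    index: usize,
    balance: u64,
) -> Vec<Arc<Validator>> {
    let mut validators = source.to_vec(); // clones pointers only
    Arc::make_mut(&mut validators[index]).effective_balance = balance;
    validators
}

/// Counts positions where both buffers point at the same allocation.
fn shared_count(a: &[Arc<Validator>], b: &[Arc<Validator>]) -> usize {
    a.iter().zip(b).filter(|&(x, y)| Arc::ptr_eq(x, y)).count()
}
```

With the per-element layout, two buffers that differ in one validator still share every other allocation, which is where the memory saving would come from.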

Comment on lines 186 to 187
let validators = std::mem::take(beacon_state.validators_mut()).to_vec();
let validators = Arc::from(validators);
Member

If we did want to save memory, a promising approach might be to keep the List, which is a copy-on-write (persistent) data structure

However, Lists have slower O(log n) indexing than vecs, and we do random-access indexing here:

if let Some(x) = xs.get_mut(index as usize) {

We also do it here, but this use could be replaced by a zip on two iterators:

let validator_diff = if let Some(x) = xs.get(i) {
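
As a rough sketch of the zip-based alternative mentioned above (not the actual Lighthouse diff code): the function `compute_diffs` and the use of plain `u64` values in place of validators are assumptions made for illustration. Because `zip` stops at the shorter iterator, the source side is padded with `None` so that targets beyond the end of `xs` are still visited, which mirrors the `if let Some(x) = xs.get(i)` branch.

```rust
/// Hypothetical, simplified diff: `None` means "unchanged", `Some(y)` means
/// "replace with y". Real validator diffs are field-wise; this is a sketch.
fn compute_diffs(xs: &[u64], ys: &[u64]) -> Vec<Option<u64>> {
    // Pad the source with `None` so entries of `ys` past `xs.len()` are kept.
    let padded_xs = xs.iter().map(Some).chain(std::iter::repeat(None));

    ys.iter()
        .zip(padded_xs)
        .map(|(y, maybe_x)| match maybe_x {
            // Present in the source and equal: nothing to record.
            Some(x) if x == y => None,
            // Changed, or new in the target: record the target value.
            _ => Some(*y),
        })
        .collect()
}
```

Written this way, the loop walks both sequences in order and never indexes by position, so it would work equally well over a structure with O(log n) indexing.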

@michaelsproul michaelsproul added waiting-on-author The reviewer has suggested changes and awaits their implementation. database labels Jun 30, 2025
@PoulavBhowmick03 PoulavBhowmick03 force-pushed the arc/hdiffbuffer branch 2 times, most recently from 62f85a6 to 014e076 Compare June 30, 2025 11:49
Comment on lines +188 to +189
.cloned()
.map(Arc::new)
Member

This is maybe OK, but we could make it even better if we added an iter_arc method to List. In the case of validators, they are not packed in leaves, and are stored behind an Arc:

https://github.com/sigp/milhouse/blob/6da903ce8f783295714a68de4a2dcfc8b7c6ee01/src/leaf.rs#L17

Would you be interested in making a PR to milhouse to add this? I think the API would have to consider the case where T is packed, and return an error in this case. Something like:

pub fn iter_arc(&self) -> Result<impl Iterator<Item = &Arc<T>>, Error> {
   if T::tree_hash_type() == TreeHashType::Basic {
       // Can't return `Arc`s for packed leaves.
       return Err(Error::PackedLeavesNoArc);
   }
   // TODO
   Ok(iterator)
}

I'm happy to help advise on the implementation. Take a look at the other iterators in:

Contributor Author

Sure! I will look into it and try to implement the iter_arc method to List. Leaving a comment on the issue in Milhouse

Member

Thanks!
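
To make the shape of the proposed `iter_arc` API concrete, here is a toy sketch. It does not use milhouse and does not reflect its internal tree layout: `ToyLeaves`, its two variants, the local `Error` enum and `collect_shared` are all hypothetical, and the packed/unpacked distinction is modelled with an enum rather than `tree_hash_type()`. It only illustrates the contract suggested above: return the existing `Arc`s when values are stored one per leaf, and return an error when they are packed.

```rust
use std::sync::Arc;

// Hypothetical error type, named after the variant suggested in the review.
#[derive(Debug, PartialEq)]
enum Error {
    PackedLeavesNoArc,
}

/// Toy stand-in for leaf storage (not milhouse's real representation).
enum ToyLeaves<T> {
    /// Small values stored inline, several per leaf: no `Arc` per value.
    Packed(Vec<T>),
    /// Larger values stored one per leaf, each behind its own `Arc`.
    Shared(Vec<Arc<T>>),
}

impl<T> ToyLeaves<T> {
    /// Iterates over the existing `Arc`s without cloning the values.
    fn iter_arc(&self) -> Result<impl Iterator<Item = &Arc<T>>, Error> {
        match self {
            // Can't hand out `Arc`s for packed leaves.
            ToyLeaves::Packed(_) => Err(Error::PackedLeavesNoArc),
            ToyLeaves::Shared(leaves) => Ok(leaves.iter()),
        }
    }
}

/// Builds a `Vec<Arc<T>>` that shares allocations with the source.
fn collect_shared<T>(leaves: &ToyLeaves<T>) -> Result<Vec<Arc<T>>, Error> {
    // `cloned()` here clones the `Arc` pointers, not the underlying values.
    Ok(leaves.iter_arc()?.cloned().collect())
}
```

Compared with `.cloned().map(Arc::new)`, which allocates a fresh `Arc` for every value, collecting from such an iterator would reuse the allocations the list already holds.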

@michaelsproul michaelsproul added the optimization Something to make Lighthouse run more efficiently. label Jul 7, 2025
Labels
database optimization Something to make Lighthouse run more efficiently. waiting-on-author The reviewer has suggested changes and awaits their implementation.
2 participants