Skip to content

Optimise .fill() throughput #43

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 1, 2023
Merged

Conversation

Bluefinger
Copy link
Contributor

Looking at the fill method, I noticed there was some room for optimisation, and also removing unnecessary unwrap() from the hot loop. Also, the remainder portion was generating new entropy blocks for every u8 portion, when any remaining slice length would always be less than 8 bytes, or one u64 block (which is what WyRand generates per new state), Therefore, a maximum of seven u8 calls can be reduced to just one u64 block.

Afterwards, I simplified the copying of entropy to make use of copy_from_slice, since we know that either the slices will match (and need no try_into conversions), or that the length of the input will always be smaller than the target.

I then benchmarked the resulting code, on my AMD Ryzen 5 Pro 2500U (4C/8T 2,0Ghz base, 3.6Ghz max) with 16GB RAM.

Before:

running 2 tests
test fill             ... bench:          68 ns/iter (+/- 1)
test fill_naive       ... bench:         471 ns/iter (+/- 9)

After:

running 2 tests
test fill             ... bench:          64 ns/iter (+/- 1)
test fill_naive       ... bench:         485 ns/iter (+/- 15)

I was getting 68-70ns before, and now 64-65ns afterwards. Now, I added the #[inline] annotation out of curiosity and tested it, and was getting even more perf as a result:

After with #[inline]

running 2 tests
test fill             ... bench:          56 ns/iter (+/- 1)
test fill_naive       ... bench:         472 ns/iter (+/- 14)

So it might be advantageous to include it. So end result with all changes here is we've gone from 68-70ns to 64-65ns to then 56ns.

Copy link
Collaborator

@taiki-e taiki-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@taiki-e taiki-e merged commit 08f7d3c into smol-rs:master Feb 1, 2023
@notgull notgull mentioned this pull request Feb 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants