Reform is_power_of_two() to return false for self == 0, and move previous functionality to is_power_of_two_or_zero() #19640
Hmm, r? @bjz |
I've now added in the |
Thanks for this, but unfortunately we can't add a new method to this trait without going through the RFC process first. cc. @aturon |
There seems to be a much more efficient implementation:
pub fn next_power_of_two<T: UnsignedInt>(x: T) -> T {
let bits = std::mem::size_of::<T>() * 8;
let one: T = Int::one();
one << ((bits - (x - one).leading_zeros()) % bits)
}
pub fn checked_next_power_of_two<T: UnsignedInt>(x: T) -> Option<T> {
let npot = next_power_of_two(x);
if npot >= x {
Some(npot)
} else {
None
}
}
#[test]
fn t() {
assert_eq!(next_power_of_two::<u64>(0), 1);
assert_eq!(next_power_of_two::<u64>(1), 1);
assert_eq!(next_power_of_two::<u64>(2), 2);
assert_eq!(next_power_of_two::<u64>(3), 4);
assert_eq!(next_power_of_two::<u64>(4), 4);
assert_eq!(next_power_of_two::<u64>(5), 8);
assert_eq!(next_power_of_two::<u64>(-1), 1);
assert_eq!(next_power_of_two::<u64>(-2), 1);
assert_eq!(next_power_of_two::<u64>(-3), 1);
assert_eq!(next_power_of_two::<u64>(-4), 1);
assert_eq!(next_power_of_two::<u64>(-5), 1);
}
|
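For readers on current Rust, the snippet above can be approximated as follows. This is only a sketch under stated assumptions: it is specialized to `u64` instead of being generic over the old `UnsignedInt` trait, the name `next_power_of_two_wrapping` is chosen here for illustration, and the subtraction is written as `wrapping_sub` because a plain `x - 1` panics on underflow in debug builds today.

```rust
/// Smallest power of two >= `x`, wrapping around to 1 when the result does
/// not fit in a `u64`. Hypothetical name; mirrors the unchecked version
/// quoted above.
pub fn next_power_of_two_wrapping(x: u64) -> u64 {
    let bits = u64::BITS; // 64
    // For x == 0 the subtraction wraps to u64::MAX, which has 0 leading zeros.
    let lz = x.wrapping_sub(1).leading_zeros();
    // `% bits` maps a shift of 64 (x == 0, or an overflowing result) to 0.
    1u64 << ((bits - lz) % bits)
}

/// Same computation, but reports overflow as `None` by noticing that a
/// wrapped result is smaller than the input.
pub fn checked_next_power_of_two(x: u64) -> Option<u64> {
    let npot = next_power_of_two_wrapping(x);
    if npot >= x {
        Some(npot)
    } else {
        None
    }
}
```

The checked variant relies only on the wrap-around behaviour of the unchecked one, which is why the two can share a single branch-free core.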
The only concern I have about this overflow-unchecked implementation is that it suffers from the same downside as the original - the I personally wouldn't mind having overflow-checked arithmetic being the default, with unchecked being an opt-in, but that is a much broader issue. |
@aliblong there is a separate checked variant: http://doc.rust-lang.org/std/num/trait.UnsignedInt.html#tymethod.checked_next_power_of_two |
@gankro Yes, I know. I actually modified that function in the most recent commit. All I meant is that I would prefer the overflow-checked cases to be the semantic default ( Edit: On second thought, it's probably too cumbersome having to deal with |
I don't think we probably need @bjz do you agree? |
self.is_power_of_two_or_zero() && !(self == Int::zero()) | ||
} | ||
|
||
/// Returns `true` iff `self == 2^k` for some `k` or `self == 0`. |
`iff` -> `if` (but it seems that this function will be removed anyways)
`iff` is strictly correct here (and used on e.g. the function directly above).
(To be clear, iff is a common shortening of "if and only if".)
Well, I guess I learned something new today.
It would make me kind of sad to call |
I suspect that inlining and optimization would make that disappear in the On Mon, Dec 15, 2014 at 9:53 PM, Aaron Liblong [email protected]
|
I agree with @aturon. The change in semantics for |
Okay, I removed the new method accordingly. |
Could you squash the commits? |
28b3fb5 to 98f7943
All squashed up! |
build failure looks spurious. However: @aliblong I would prefer to use @mahkoh's code since we use this functionality in our collections. Admittedly on super-slow paths (allocation), but why not use the fast one? If the user cares about overflow they can use the checked variant. Also if mahkoh's code is used, we need to update this code accordingly: http://doc.rust-lang.org/src/collections/vec.rs.html#684-692 It should just panic (capacity overflow) in the overflow case anyway, since that means you're already asking for more than |
Ok. Want me to push a new commit with this code, or wait for @mahkoh to submit a PR? |
Push a new commit. @mahkoh has demonstrated disinterest in submitting a PR on irc. |
Ok, will do. By the way, should I tag |
@aliblong I'm actually unclear on how default impls behave cross-crate. Might as well, we spam that directive everywhere anyway. :/ |
5deb72d to 4f9a282
Okay, I implemented it. I was worried about a perf hit from modifying |
self.grow_capacity(cap) | ||
let amort_cap = new_cap.checked_next_power_of_two(); | ||
match amort_cap { | ||
None => self.grow_capacity(new_cap), |
As I noted in my discussion of this method, this should really just panic, since `new_cap` must be larger than `int::MAX`. As such you can just do `unwrap("capacity overflow")`.
Isn't `new_cap` guaranteed to be less than `uint::MAX` given that it's the product of a `checked_add`?
You can't allocate more than int::MAX contiguous memory because pointer arithmetic uses `int`. If you did successfully make such an object, you wouldn't be able to properly index in it.
Oh, okay, I had no idea. I guess that's so you can deal with negative pointer offsets?
Yeah. The underlying LLVM (OS?) API uses signed arithmetic, hands are tied.
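The suggestion in this review thread can be sketched in current Rust roughly as below. It is an illustration, not `Vec`'s actual code: `amortized_capacity` is a hypothetical helper, `usize`/`isize` stand in for the old `uint`/`int`, and `expect` is used where the comment says `unwrap("capacity overflow")`.

```rust
/// Capacity to grow to when `additional` elements are added to `len`
/// existing ones, rounded up to a power of two for amortized growth.
/// Hypothetical helper, for illustration only.
pub fn amortized_capacity(len: usize, additional: usize) -> usize {
    // The sum itself may overflow `usize`.
    let new_cap = len.checked_add(additional).expect("capacity overflow");
    // `None` here means the next power of two exceeds `usize::MAX`, so
    // `new_cap` is already above 2^(BITS-1) and therefore above `isize::MAX`.
    // Such a request could never be satisfied, so panicking loses nothing.
    new_cap
        .checked_next_power_of_two()
        .expect("capacity overflow")
}
```

Note that a result of exactly 2^(BITS-1) still exceeds `isize::MAX`, so a real allocator path would need a further check on the final byte size; the sketch only covers the overflow case discussed here.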
fcea403 to 59701c7
Finished implementing @gankro's suggestions |
59701c7 to 0a98430
…nsider 0 not a power of 2. Vec panics when attempting to reserve capacity > int::MAX (uint::MAX / 2).
0a98430 to f6328b6
Squashed the commits |
The `is_power_of_two()` method of the `UnsignedInt` trait currently returns `true` for `self == 0`. Zero is not a power of two, assuming an integral exponent `k >= 0`. I've therefore moved this functionality to the new method `is_power_of_two_or_zero()` and reformed `is_power_of_two()` to return false for `self == 0`.

To illustrate the usefulness of the existence of both functions, consider `HashMap`. Its capacity must be zero or a power of two; conversely, it also requires a (non-zero) power of two for key and val alignment.

Also, added a small amount of documentation regarding #18604.
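The distinction between the two predicates can be sketched as free functions on `u64`. This is an assumption-laden illustration, not the trait methods from the PR: the bit trick shown is one common way to write the test, and the free-function form is chosen here only to keep the example self-contained.

```rust
/// `true` if `x == 2^k` for some integer `k >= 0`, or if `x == 0`.
pub fn is_power_of_two_or_zero(x: u64) -> bool {
    // `x & (x - 1)` clears the lowest set bit, so the result is 0 exactly
    // when at most one bit was set. `wrapping_sub` keeps x == 0 well defined.
    (x & x.wrapping_sub(1)) == 0
}

/// `true` only if `x == 2^k` for some integer `k >= 0`; zero is rejected.
pub fn is_power_of_two(x: u64) -> bool {
    x != 0 && is_power_of_two_or_zero(x)
}
```

With these definitions, a capacity check of the `HashMap` kind would use `is_power_of_two_or_zero`, while an alignment check would use `is_power_of_two`.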
I think this regressed performance:
const BYTES: u64 = 100_000;
#[bench]
fn push_str_one_byte(b: &mut Bencher) {
b.bytes = BYTES;
b.iter(|| {
let mut s = String::new();
for _ in range(0, BYTES) {
s.push_str("a")
}
black_box(s);
});
}
|
I don’t know if this PR is actually responsible for the results above (I haven’t bisected it); it’s just one change that seems relevant. |
Ack. We did just replace a conditional set with a conditional panic. Perhaps we should replace the second CC @thestinger I'm a bit shaky on whose responsibility it is to "deal" with over-large allocations. Would it be sound for all of our collections code to saturate on allocation-size overflow as a sort of "poisoning" of the allocation? I believe you're opposed to us panicking for these runtime errors anyway, right? |
But we already had a conditional panic before… Is two worse than one? |
`String::push(&mut self, ch: char)` currently has a single code path that calls `Char::encode_utf8`. Perhaps it could be faster for ASCII `char`s, which are represented as a single byte in UTF-8.

This commit leaves the method unchanged, adds a copy of it with the fast path, and adds benchmarks to compare them.

Results show that the fast path very significantly improves the performance of repeatedly pushing an ASCII `char`, but does not significantly affect the performance for a non-ASCII `char` (where the fast path is not taken).

Output of `make check-stage1-collections NO_REBUILD=1 PLEASE_BENCH=1 TESTNAME=string::tests::bench_push`

```
test string::tests::bench_push_char_one_byte ... bench: 59552 ns/iter (+/- 2132) = 167 MB/s
test string::tests::bench_push_char_one_byte_with_fast_path ... bench: 6563 ns/iter (+/- 658) = 1523 MB/s
test string::tests::bench_push_char_two_bytes ... bench: 71520 ns/iter (+/- 3541) = 279 MB/s
test string::tests::bench_push_char_two_bytes_with_slow_path ... bench: 71452 ns/iter (+/- 4202) = 279 MB/s
test string::tests::bench_push_str ... bench: 24 ns/iter (+/- 2)
test string::tests::bench_push_str_one_byte ... bench: 38910 ns/iter (+/- 2477) = 257 MB/s
```

A benchmark of pushing a one-byte-long `&str` is added for comparison, but its performance [has varied a lot lately](rust-lang#19640 (comment)). (When the input is fixed, `s.push_str("x")` could be used instead of `s.push('x')`.)
I can confirm that this change to |
`String::push(&mut self, ch: char)` currently has a single code path that calls `Char::encode_utf8`. This adds a fast path for ASCII `char`s, which are represented as a single byte in UTF-8.

Benchmarks of stage1 libcollections at the intermediate commit show that the fast path very significantly improves the performance of repeatedly pushing an ASCII `char`, but does not significantly affect the performance for a non-ASCII `char` (where the fast path is not taken).

```
bench_push_char_one_byte                  59552 ns/iter (+/- 2132) = 167 MB/s
bench_push_char_one_byte_with_fast_path    6563 ns/iter (+/- 658) = 1523 MB/s
bench_push_char_two_bytes                 71520 ns/iter (+/- 3541) = 279 MB/s
bench_push_char_two_bytes_with_slow_path  71452 ns/iter (+/- 4202) = 279 MB/s
bench_push_str_one_byte                   38910 ns/iter (+/- 2477) = 257 MB/s
```

A benchmark of pushing a one-byte-long `&str` is added for comparison, but its performance [has varied a lot lately](rust-lang#19640 (comment)). (When the input is fixed, `s.push_str("x")` could be used just as well as `s.push('x')`.)
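The shape of the fast path described in these commit messages might look like the following in current Rust. This is a sketch, not the code from the commit: `push_char` is a hypothetical stand-in that appends to a plain `Vec<u8>` rather than to `String`'s internal buffer, and it uses today's `char::encode_utf8` signature.

```rust
/// Appends the UTF-8 encoding of `ch` to `buf`.
/// Hypothetical stand-in for `String::push`, for illustration only.
pub fn push_char(buf: &mut Vec<u8>, ch: char) {
    if ch.is_ascii() {
        // Fast path: an ASCII char is exactly one byte in UTF-8.
        buf.push(ch as u8);
    } else {
        // General path: encode into a scratch buffer (at most 4 bytes),
        // then copy the encoded bytes over.
        let mut scratch = [0u8; 4];
        buf.extend_from_slice(ch.encode_utf8(&mut scratch).as_bytes());
    }
}
```

The non-ASCII branch is the same work as before plus one predictable comparison, which is consistent with the two-byte benchmarks being unaffected.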
Appears to have been a breaking change. |
@brson oops! Thought that had been handled. 😞 |