Attempt to use the high part of the `size_hint` in `collect` (again) #137908
@bors try @rust-timer queue
…r=<try> Attempt to use the high part of the `size_hint` in `collect` (again) I last tried something like this [almost 7 years ago](rust-lang#53086); I wonder if it's more tolerable now...
☀️ Try build successful - checks-actions
Finished benchmarking commit (272c07f): comparison URL.
Overall result: ❌ regressions - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage): Results (primary 3.4%, secondary -2.0%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: Results (secondary -7.9%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: Results (primary 0.0%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 773.006s -> 772.399s (-0.08%)
Interesting! Those perf results look entirely tolerable, way better than last time. Weird CI failure, though... EDIT: found it, #137919
let (low, high) = iterator.size_hint();
assert!(
    high.is_none_or(|high| low <= high),
    "size_hint ({low:?}, {high:?}) is malformed from iterator {} collecting into {}",
    core::any::type_name::<I>(), core::any::type_name::<Self>(),
);

let Some(first) = iterator.next() else {
    return Vec::new();
};
`next` and then `size_hint` is better since some adapters can provide a better hint after the first step.
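As a rough illustration of that point, the sketch below compares the hint of a `flatten` over nested `Vec`s before and after the first `next`. The helper name `hints_around_first_next` is made up for this example, and the exact numbers reflect how the standard library currently behaves rather than a documented guarantee.

```rust
/// Returns the `size_hint` of a flattened iterator before and after the
/// first call to `next`. (Hypothetical helper, for illustration only.)
fn hints_around_first_next(
    nested: Vec<Vec<u32>>,
) -> ((usize, Option<usize>), (usize, Option<usize>)) {
    let mut iter = nested.into_iter().flatten();
    // Before any element is pulled, `flatten` has not opened an inner
    // iterator yet, so it cannot promise any elements.
    let before = iter.size_hint();
    // Pulling the first element opens the first inner `Vec`.
    let _first = iter.next();
    // The remaining length of that inner `Vec` is now a valid lower bound.
    let after = iter.size_hint();
    (before, after)
}
```

Called with `vec![vec![1, 2, 3], vec![4, 5]]`, the lower bound should move from 0 to 2 once the first inner `Vec` is open, which is the kind of improvement a `size_hint` taken before `next` would miss.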
Force-pushed: 804cf26 to 76f763c
…acrum debug-assert that the size_hint is well-formed in `collect` Closes rust-lang#137919 In the hopes of helping to catch any future accidentally-incorrect rustc or stdlib iterators (like the ones rust-lang#137908 accidentally found), this has `Iterator::collect` call `size_hint` and check its `low` doesn't exceed its `Some(high)`. There's of course a bazillion more places this *could* be checked, but the hope is that this one is a good tradeoff of being likely to catch lots of things while having minimal maintenance cost (especially compared to putting it in *every* container's `from_iter`).
Rollup merge of rust-lang#138329 - scottmcm:assert-hint, r=Mark-Simulacrum debug-assert that the size_hint is well-formed in `collect` Closes rust-lang#137919 In the hopes of helping to catch any future accidentally-incorrect rustc or stdlib iterators (like the ones rust-lang#137908 accidentally found), this has `Iterator::collect` call `size_hint` and check its `low` doesn't exceed its `Some(high)`. There's of course a bazillion more places this *could* be checked, but the hope is that this one is a good tradeoff of being likely to catch lots of things while having minimal maintenance cost (especially compared to putting it in *every* container's `from_iter`).
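A minimal sketch of the kind of check described in that commit message, written as free functions outside the standard library. The names `hint_is_well_formed`, `BadHint`, and `checked_collect` are assumptions made for this example; the real check lives inside `Iterator::collect` and may be worded differently.

```rust
/// A hint is well-formed when the lower bound does not exceed the upper
/// bound; a missing upper bound (`None`) is always acceptable.
fn hint_is_well_formed(hint: (usize, Option<usize>)) -> bool {
    let (low, high) = hint;
    high.map_or(true, |high| low <= high)
}

/// Deliberately broken iterator (made up for this example): it claims
/// at least 5 and at most 3 remaining items.
struct BadHint;

impl Iterator for BadHint {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        None
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        (5, Some(3))
    }
}

/// Collects into a `Vec`, checking the hint first. The check compiles
/// away in builds without debug assertions.
fn checked_collect<I: Iterator>(iter: I) -> Vec<I::Item> {
    let hint = iter.size_hint();
    debug_assert!(
        hint_is_well_formed(hint),
        "size_hint {hint:?} is malformed from iterator {}",
        std::any::type_name::<I>(),
    );
    iter.collect()
}
```

Passing `BadHint` to `checked_collect` would panic in a build with debug assertions enabled, while well-behaved iterators pass through untouched.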
let (low, high) = iterator.size_hint();
let Some(first) = iterator.next() else {
I think the other comment still applies that some `size_hint`s may be better after the first `next` -- or do you disagree?
I've been meaning to come back and give that a shot. While it's certainly true, I'm also skeptical how valuable it is, since the usual case is things like `flat_map` that almost never have a good hint anyway -- and when they do, like flattening an iterator over arrays, it doesn't need the first one. But can try it.
(It makes me tempted to have a `next_with_suggested_reserve -> Option<(NonZero<usize>, Item)>`, too, but that's a bigger conversation.)
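Purely as a thought experiment, the idea floated in that parenthetical could look something like the extension trait below. Nothing like this exists in `std`; the trait name, the default implementation, and the choice to count the returned item in the suggested reserve are all assumptions made for this sketch. It uses the `NonZeroUsize` alias rather than `NonZero<usize>` so it also builds on older toolchains.

```rust
use std::num::NonZeroUsize;

/// Hypothetical extension trait; not part of the standard library.
trait NextWithSuggestedReserve: Iterator {
    /// Returns the next item together with a suggested number of slots to
    /// reserve: one for the returned item plus the lower bound of the
    /// hint taken *after* that item was produced.
    fn next_with_suggested_reserve(&mut self) -> Option<(NonZeroUsize, Self::Item)> {
        let item = self.next()?;
        let (low, _high) = self.size_hint();
        // Starting from 1 keeps the result non-zero without any unwrap.
        let reserve = NonZeroUsize::MIN.saturating_add(low);
        Some((reserve, item))
    }
}

// Blanket impl so every iterator gets the default behaviour.
impl<I: Iterator> NextWithSuggestedReserve for I {}
```

An adapter with better knowledge of its own state could override the default, which is presumably where the design discussion would get larger.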
&& let extra = high - low
&& extra < low
{
    high
Should we cap this at `isize::MAX` bytes? It's conceivable on smaller targets that some iterator with a `low` near the edge isn't going to produce any more than that in practice, even if `high` would be too much, so maybe it's a bad idea to force a capacity error.
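One possible reading of that suggestion, sketched under the assumption that an over-large `high` should simply be ignored in favour of `low`. The function names here are hypothetical and this is not what the PR implements; it only shows how an `isize::MAX`-byte limit translates into an element count.

```rust
use std::mem::size_of;

/// Largest element count whose byte size stays within `isize::MAX`,
/// the limit `Vec` places on a single allocation.
fn max_elements<T>() -> usize {
    match size_of::<T>() {
        0 => usize::MAX, // zero-sized types never allocate
        size => isize::MAX as usize / size,
    }
}

/// Hypothetical policy: use `high` only when a `Vec<T>` of that capacity
/// could be allocated at all; otherwise fall back to `low`.
fn capped_initial_capacity<T>(low: usize, high: usize) -> usize {
    if high <= max_elements::<T>() {
        high
    } else {
        low
    }
}
```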
Note that it would only ever matter for things that produce exactly `low` things. If it's just "near" the edge, it already panics today because it pre-allocates for `low`, then the `low+1`-th element tries to double it, and panics.
Said otherwise, this only uses the `high` as the hint if `push`ing could have tried to reserve that much anyway.
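The growth step being described can be observed at a small scale. This sketch (with a made-up helper name) pre-allocates for `low`, fills the buffer, and pushes one more element; the amount by which the capacity grows is an implementation detail of `Vec`, so the code only relies on it growing at all.

```rust
/// Pre-allocates for `low` elements, fills the buffer completely, then
/// pushes one extra element. Returns the capacity before and after that
/// extra push. (Hypothetical helper, for illustration only.)
fn capacity_around_overflow(low: usize) -> (usize, usize) {
    let mut v: Vec<u32> = Vec::with_capacity(low);
    let before = v.capacity();
    // Fill exactly to the current capacity; no reallocation happens here.
    for _ in 0..before {
        v.push(0);
    }
    // This push no longer fits, so `Vec` has to grow its buffer.
    v.push(0);
    (before, v.capacity())
}
```

With the current amortized growth strategy the second number is typically about double the first, which is why an iterator whose `low` is already close to the allocation limit runs into a capacity error on the element after `low`.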
Yeah, I get that, but I still wonder if such "exactly `low`" cases exist that we would be harming.
I feel like the doubling logic should cap itself too, but that's a separate conversation.
What if we only try the `high` capacity first when it's in `(low, low*2]`, or else fall back to just `low`?
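A minimal sketch of that rule as a standalone function, assuming the interval is meant literally as `low < high <= 2 * low`. The name `initial_capacity` is made up for this example, and the comparison is written as a subtraction so that `2 * low` cannot overflow.

```rust
/// Picks the capacity to pre-allocate from a `size_hint`.
/// Uses `high` only when it lies in `(low, 2 * low]`; otherwise `low`.
fn initial_capacity(low: usize, high: Option<usize>) -> usize {
    match high {
        // `high - low <= low` is equivalent to `high <= 2 * low`, but it
        // cannot overflow because `high > low` is checked first.
        Some(high) if high > low && high - low <= low => high,
        _ => low,
    }
}
```

For example, a hint of `(10, Some(15))` would pre-allocate 15 slots, while `(10, Some(21))`, `(10, None)`, and `(0, Some(5))` would all fall back to the lower bound.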
I last tried something like this almost 7 years ago; I wonder if it's more tolerable now...