Attempt to use the high part of the `size_hint` in `collect` (again) #137908

scottmcm · 2025-03-03T02:51:11Z

I last tried something like this almost 7 years ago; I wonder if it's more tolerable now...

rustbot · 2025-03-03T02:51:15Z

rustbot has assigned @cuviper.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

scottmcm · 2025-03-03T02:52:21Z

@bors try @rust-timer queue

…r=<try> Attempt to use the high part of the `size_hint` in `collect` (again) I last tried something like this [almost 7 years ago](rust-lang#53086); I wonder if it's more tolerable now...

bors · 2025-03-03T02:53:31Z

⌛ Trying commit 76f763c with merge 272c07f...

bors · 2025-03-03T04:53:45Z

☀️ Try build successful - checks-actions
Build commit: 272c07f (272c07f89984286198fbfbda53502c20946bf526)

rust-timer · 2025-03-03T06:09:32Z

Finished benchmarking commit (272c07f): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.5%, 0.5%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.5%	[0.5%, 0.5%]	1

Max RSS (memory usage)

Results (primary 3.4%, secondary -2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.4%	[2.3%, 5.4%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.0%	[-2.6%, -1.4%]	2
All ❌✅ (primary)	3.4%	[2.3%, 5.4%]	3

Cycles

Results (secondary -7.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-7.9%	[-8.6%, -7.1%]	2
All ❌✅ (primary)	-	-	0

Binary size

Results (primary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.2%]	13
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.1%	[-0.1%, -0.0%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.0%	[-0.1%, 0.2%]	16

Bootstrap: 773.006s -> 772.399s (-0.08%)
Artifact size: 361.95 MiB -> 361.93 MiB (-0.01%)

scottmcm · 2025-03-03T06:27:22Z

Interesting! Those perf results look entirely tolerable, way better than last time.

Weird CI failure, though...

EDIT: found it, #137919

the8472 · 2025-03-03T10:07:11Z

library/alloc/src/vec/spec_from_iter_nested.rs

+        let (low, high) = iterator.size_hint();
+        assert!(
+            high.is_none_or(|high| low <= high),
+            "size_hint ({low:?}, {high:?}) is malformed from iterator {} collecting into {}",
+            core::any::type_name::<I>(), core::any::type_name::<Self>(),
+        );
+
+        let Some(first) = iterator.next() else {
+            return Vec::new();
+        };


Calling `next` first and then `size_hint` is better, since some adapters can provide a better hint after the first step.
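
As an illustration of the point above, here is a minimal sketch (not code from this PR) that compares the hint of a flattened `Vec<Vec<i32>>` before and after the first `next`. The helper name `hints_around_first_next` is hypothetical, and the exact hints depend on how the standard library implements `Flatten`; with the implementation I am assuming, the hint goes from `(0, None)` to `(2, Some(2))`.

```rust
/// Hypothetical helper: returns the `size_hint` of a flattened
/// `Vec<Vec<i32>>` before and after the first call to `next`.
fn hints_around_first_next() -> ((usize, Option<usize>), (usize, Option<usize>)) {
    let nested = vec![vec![1, 2, 3]];
    let mut iter = nested.into_iter().flatten();

    // Before the first step no inner iterator has been opened, so the
    // adapter cannot put an upper bound on the remaining items.
    let before = iter.size_hint();

    // The first `next` opens the only inner `Vec` and exhausts the outer
    // iterator, so the remaining count becomes known.
    let first = iter.next();
    assert_eq!(first, Some(1));
    let after = iter.size_hint();

    (before, after)
}
```

A `collect` that reads the hint only before the first `next` would see the weaker of the two hints here.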

Rollup merge of rust-lang#138329 - scottmcm:assert-hint, r=Mark-Simulacrum

debug-assert that the size_hint is well-formed in `collect`

Closes rust-lang#137919

In the hopes of helping to catch any future accidentally-incorrect rustc or stdlib iterators (like the ones rust-lang#137908 accidentally found), this has `Iterator::collect` call `size_hint` and check its `low` doesn't exceed its `Some(high)`. There's of course a bazillion more places this *could* be checked, but the hope is that this one is a good tradeoff of being likely to catch lots of things while having minimal maintenance cost (especially compared to putting it in *every* container's `from_iter`).
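
The check described in that commit message can be sketched in user code roughly as follows. This is an assumption-laden illustration, not the standard library's implementation: `hint_is_well_formed`, `checked_collect`, and the deliberately broken `BadHint` iterator are all hypothetical names introduced here.

```rust
/// Hypothetical helper: a hint is well-formed when the lower bound does
/// not exceed the upper bound (an absent upper bound is always fine).
fn hint_is_well_formed(hint: (usize, Option<usize>)) -> bool {
    let (low, high) = hint;
    match high {
        Some(high) => low <= high,
        None => true,
    }
}

/// Hypothetical iterator whose `size_hint` is deliberately malformed.
struct BadHint(std::ops::Range<u32>);

impl Iterator for BadHint {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.0.next()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // Malformed on purpose: the lower bound exceeds the upper bound.
        (10, Some(3))
    }
}

/// Hypothetical wrapper: check the hint in debug builds, then collect.
fn checked_collect<I: Iterator>(iter: I) -> Vec<I::Item> {
    let hint = iter.size_hint();
    debug_assert!(
        hint_is_well_formed(hint),
        "size_hint {:?} is malformed from iterator {}",
        hint,
        std::any::type_name::<I>(),
    );
    iter.collect()
}
```

Passing `BadHint(0..3)` to `checked_collect` would trip the assertion in a debug build, while well-behaved iterators pass through unchanged.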

cuviper · 2025-04-08T23:58:51Z

library/alloc/src/vec/spec_from_iter_nested.rs

+        let (low, high) = iterator.size_hint();
+        let Some(first) = iterator.next() else {


I think the other comment still applies: some `size_hint`s may be better after the first `next`. Or do you disagree?

scottmcm · 2025-04-09

I've been meaning to come back and give that a shot. While it's certainly true, I'm also skeptical of how valuable it is, since the usual case is things like `flat_map` that almost never have a good hint anyway, and when they do, like flattening an iterator over arrays, it doesn't need the first one. But I can try it.

(It makes me tempted to have a `next_with_suggested_reserve -> Option<(NonZero<usize>, Item)>`, too, but that's a bigger conversation.)

cuviper · 2025-04-09T00:03:36Z

library/alloc/src/vec/spec_from_iter_nested.rs

+                && let extra = high - low
+                && extra < low
+            {
+                high


Should we cap this at `isize::MAX` bytes? It's conceivable on smaller targets that some iterator with a `low` near the edge isn't going to produce any more than that in practice, even if `high` would be too much, so maybe it's a bad idea to force a capacity error.

scottmcm · 2025-04-09

Note that it would only ever matter for things that produce exactly `low` items. If it's just "near" the edge, it already panics today because it pre-allocates for `low`, then the `low+1`-th element tries to double it, and panics.

Said otherwise, this only uses the `high` as the hint if pushing could have tried to reserve that much anyway.
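
The argument above relies on `Vec` growing geometrically once a push overflows the pre-allocated capacity. The growth factor is not a documented guarantee, so the sketch below only observes what the running standard library does; `capacity_before_and_after_overflow` is a hypothetical helper name.

```rust
/// Hypothetical helper: pre-allocates for `low` items, fills the vector,
/// pushes one more item, and reports the capacity before and after.
fn capacity_before_and_after_overflow(low: usize) -> (usize, usize) {
    let mut v: Vec<u8> = Vec::with_capacity(low);
    let before = v.capacity();

    // Fill up to the actual capacity (which may exceed `low`).
    for _ in 0..before {
        v.push(0);
    }

    // One extra push forces a reallocation; the new capacity shows how far
    // the growth strategy over-reserves.
    v.push(0);

    (before, v.capacity())
}
```

If the second value comes out as roughly twice the first, then an iterator yielding `low + 1` items already ends up reserving about `2 * low` slots, which is the situation the comment describes.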

cuviper · 2025-04-09

Yeah, I get that, but I still wonder if such "exactly `low`" cases exist that we would be harming.

I feel like the doubling logic should cap itself too, but that's a separate conversation.

cuviper · 2025-04-09

What if we only try the `high` capacity first when it's in `(low, low*2]`, or else fall back to just `low`?
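
A minimal sketch of the policy suggested in this thread, combining the `(low, low*2]` window with the `isize::MAX`-bytes cap raised earlier. This is not the PR's code: the quoted hunk uses `extra < low`, which excludes `high == low * 2`, whereas this sketch uses the inclusive bound from the suggestion. The function name `initial_capacity` and the `elem_size` parameter are assumptions for illustration.

```rust
/// Hypothetical helper: choose the capacity to pre-allocate from a
/// `size_hint`, for elements that are `elem_size` bytes each.
fn initial_capacity(low: usize, high: Option<usize>, elem_size: usize) -> usize {
    // Largest element count whose allocation stays within `isize::MAX`
    // bytes; zero-sized types never allocate, so they are unbounded.
    let max_elems = if elem_size == 0 {
        usize::MAX
    } else {
        isize::MAX as usize / elem_size
    };

    match high {
        // Use `high` only when it lies in `(low, low * 2]` and fits.
        Some(high)
            if high > low
                && high <= low.saturating_mul(2)
                && high <= max_elems =>
        {
            high
        }
        // Otherwise fall back to the lower bound.
        _ => low,
    }
}
```

With this shape, a hint such as `(10, Some(15))` pre-allocates 15 slots, while `(10, Some(1000))` or `(10, None)` pre-allocates only 10.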

Attempt to use the high part of the size_hint in collect

76f763c

rustbot assigned cuviper Mar 3, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 3, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 3, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 3, 2025

the8472 reviewed Mar 3, 2025

scottmcm closed this Mar 5, 2025

scottmcm reopened this Mar 5, 2025

scottmcm force-pushed the another-size-hint-attempt branch from 804cf26 to 76f763c Compare March 7, 2025 04:47

scottmcm mentioned this pull request Mar 11, 2025

debug-assert that the size_hint is well-formed in collect #138329

Merged

cuviper reviewed Apr 9, 2025

scottmcm added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 9, 2025

cuviper removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 14, 2025
