
obligations_for_self_ty: skip irrelevant goals #146759

Closed
lcnr wants to merge 2 commits into rust-lang:main from lcnr:obligations_for_self_ty-perf
Conversation

@lcnr
Contributor

@lcnr lcnr commented Sep 19, 2025

Reduces the compile time of wg-grammar from more than 70s to about 40s. So a >30% perf improvement for that crate.

r? @BoxyUwU @compiler-errors

@rustbot
Collaborator

rustbot commented Sep 19, 2025

Some changes occurred to the core trait solver

cc @rust-lang/initiative-trait-system-refactor

changes to inspect_obligations.rs

cc @compiler-errors, @lcnr

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-trait-system-refactor The Rustc Trait System Refactor Initiative (-Znext-solver) labels Sep 19, 2025
@lcnr
Contributor Author

lcnr commented Sep 19, 2025

@bors try @rust-timer queue

rust-bors Bot added a commit that referenced this pull request Sep 19, 2025
obligations_for_self_ty: skip irrelevant goals
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 19, 2025
@lcnr
Contributor Author

lcnr commented Sep 19, 2025

@bors try @rust-timer queue

rust-bors Bot added a commit that referenced this pull request Sep 19, 2025
obligations_for_self_ty: skip irrelevant goals
@rust-bors
Contributor

rust-bors Bot commented Sep 19, 2025

☀️ Try build successful (CI)
Build commit: ba651ad (ba651ad1746c1435192a5568cd8732d36f8536ce, parent: 2f4dfc753fd86c672aa4145940db075a8a149f17)

@rust-timer
Collaborator

Finished benchmarking commit (ba651ad): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.6% | [-1.0%, -0.3%] | 7 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (secondary -3.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.0% | [-3.8%, -2.2%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Bootstrap: 470.95s -> 473.304s (0.50%)
Artifact size: 389.99 MiB -> 390.01 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 19, 2025
Comment on lines +79 to +83
let sub_root_var = self.sub_unification_table_root_var(self_ty);
let obligations = self
    .fulfillment_cx
    .borrow()
    .pending_obligations_potentially_referencing_sub_root(sub_root_var);
Contributor Author

comment

Comment thread compiler/rustc_trait_selection/src/solve/fulfill.rs
@BoxyUwU
Member

BoxyUwU commented Sep 24, 2025

Is this like, important, or is it "just" a small perf win?

@BoxyUwU BoxyUwU added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 24, 2025
@lcnr lcnr force-pushed the obligations_for_self_ty-perf branch from 452fdbf to 9926fa3 Compare October 1, 2025 14:52
@lcnr
Contributor Author

lcnr commented Oct 13, 2025

@rustbot ready

See the PR description.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 13, 2025
Member

@BoxyUwU BoxyUwU left a comment

I feel relatively uneasy about this. I don't like that this optimization makes obligations_for_self_ty wrong if try_evaluate_obligations hasn't been previously called in order to update all the stalled_on vars.

If this were only a theoretical issue it'd still be pretty bad since I don't think we'd be able to expect it to not happen in practice in the long term. In practice it should be possible for unsize coercion to hit this with its custom obligation evaluation loop and inability to call try_evaluate_goals.

I think the only way I could approve this is if I thought it was ok for obligations_for_self_ty to have incorrect behaviour in some edge cases, and I think that is not the case.

I'm not sure what the right fix is here given that unsize coercion does actually hit this in theory, so asserting that we're always calling obligations_for_self_ty in "good" cases doesn't actually work ^^'
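The staleness concern can be illustrated with a toy model. Everything below is a hypothetical sketch and not rustc's actual data structures: `SubRoots` is a minimal union-find standing in for the sub-unification table, and `Pending` stands in for a pending obligation whose `cached_sub_roots` were recorded the last time it was evaluated. The sketch assumes that nothing refreshes the cache between the unification and the query.

```rust
/// Toy union-find standing in for the sub-unification table (hypothetical).
struct SubRoots {
    parent: Vec<usize>,
}

impl SubRoots {
    fn new(n: usize) -> Self {
        SubRoots { parent: (0..n).collect() }
    }

    /// Follow parent links until reaching a representative.
    fn root(&self, mut v: usize) -> usize {
        while self.parent[v] != v {
            v = self.parent[v];
        }
        v
    }

    /// Merge the classes of `a` and `b`; `b`'s root becomes the representative.
    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.root(a), self.root(b));
        self.parent[ra] = rb;
    }
}

/// Toy pending obligation: roots cached at the last evaluation, if any.
struct Pending {
    name: &'static str,
    cached_sub_roots: Option<Vec<usize>>,
}

/// The kind of filter under discussion: unevaluated obligations always pass,
/// evaluated ones pass only if a cached root equals the queried root.
fn potentially_referencing(pending: &[Pending], root: usize) -> Vec<&'static str> {
    pending
        .iter()
        .filter(|p| match &p.cached_sub_roots {
            Some(roots) => roots.contains(&root),
            None => true,
        })
        .map(|p| p.name)
        .collect()
}

/// Returns the filter result before and after an un-refreshed unification.
fn staleness_demo() -> (Vec<&'static str>, Vec<&'static str>) {
    let mut table = SubRoots::new(2);
    // The obligation was evaluated while variable 0 was its own root.
    let pending = vec![Pending {
        name: "?0: Trait",
        cached_sub_roots: Some(vec![table.root(0)]),
    }];
    let before = potentially_referencing(&pending, table.root(0));
    // Variables 0 and 1 become related, but the cache is not refreshed.
    table.union(0, 1);
    let after = potentially_referencing(&pending, table.root(0));
    (before, after)
}
```

In this model the obligation is found before the unification and silently dropped afterwards, because the cached root no longer matches the root that the query computes.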

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 13, 2025
@lcnr
Contributor Author

lcnr commented Oct 13, 2025

> I think the only way I could approve this is if I thought it was ok to have obligations_for_self_ty have incorrect behaviour in some edge cases and I think that is not the case.

It is totally okay for obligations_for_self_ty to have incorrect behavior, e.g. we currently bail at depth 4 or 8 and don't recur any deeper.

@rust-bors
Contributor

rust-bors Bot commented Apr 29, 2026

☔ The latest upstream changes (presumably #155953) made this pull request unmergeable. Please resolve the merge conflicts.

@inq
Contributor

inq commented May 4, 2026

I ran into the same hotspot while profiling
ReShell (rust-lang/trait-system-refactor-initiative#254) and want to throw
in a thought on the staleness concern.

Would it address the issue if stalled_on were initialized eagerly at
register_predicate_obligation — set to the inference vars present in
the surface predicate — and only refined later during
try_evaluate_obligations? That keeps the sub-root index a strict
superset over true dependence, so the filter never misses obligations
even before the next try_evaluate_obligations call. The implicit
ordering requirement goes away by construction.

The added cost is a small visit at registration time, which seems cheap
relative to the perf wins this PR shows.

Sharing as a possible angle to unblock — happy to defer on whether it's
worth doing as a precursor here or separately.

@inq
Contributor

inq commented May 5, 2026

Went and built the eager-init variant on top of this PR to make the discussion more concrete. Sharing two ways to look at it — feel free to ignore if you'd rather keep this PR focused.

Branch: inq:stalled-on-eager-init-fresh — squashed cherry-pick of your two commits onto current main, plus one commit adding eager init.

Just the eager-init commit if you want to cherry-pick directly onto your branch: 98327760e20

Inline diff (the only delta vs this PR):

 type PendingObligations<'tcx> = ThinVec<(
     PredicateObligation<'tcx>,
     Option<GoalStalledOn<TyCtxt<'tcx>>>,
+    // Initial sub_roots, captured at register time. Used when stalled_on
+    // is None (i.e. before the obligation has been evaluated).
+    ThinVec<TyVid>,
 )>;

+fn collect_initial_sub_roots<'tcx>(
+    infcx: &InferCtxt<'tcx>,
+    predicate: ty::Predicate<'tcx>,
+) -> ThinVec<TyVid> {
+    // walk predicate; for each ty::TyVar, push sub_unification_table_root_var
+}

 impl<'tcx> ObligationStorage<'tcx> {
-    fn register(&mut self, obligation: ..., stalled_on: ...) {
-        self.pending.push((obligation, stalled_on));
+    fn register(&mut self, infcx: &InferCtxt<'tcx>, obligation: ..., stalled_on: ...) {
+        let initial_sub_roots = collect_initial_sub_roots(infcx, obligation.predicate);
+        self.pending.push((obligation, stalled_on, initial_sub_roots));
     }

     fn clone_pending_potentially_referencing_sub_root(&self, vid: TyVid) -> ... {
-        .filter(|(_, stalled_on)| {
-            if let Some(stalled_on) = stalled_on {
-                stalled_on.sub_roots.iter().any(|&r| r == vid)
-            } else {
-                true
-            }
+        .filter(|(_, stalled_on, initial_sub_roots)| {
+            let sub_roots = if let Some(stalled_on) = stalled_on {
+                &stalled_on.sub_roots[..]
+            } else {
+                &initial_sub_roots[..]
+            };
+            sub_roots.iter().any(|&r| r == vid)
         })
     }
 }
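For readers who want to run the idea in isolation, here is a toy model of the filter from the diff above. It is only a sketch under simplifying assumptions: plain `u32` values stand in for `TyVid`, `Vec` stands in for `ThinVec`, the names `Pending`, `Storage`, and `eager_init_demo` are hypothetical, and the variables passed to `register` are assumed to already be sub-unification roots.

```rust
/// Toy stand-in for a pending obligation (hypothetical, not rustc's types).
struct Pending {
    name: &'static str,
    /// Roots recorded by the last evaluation; `None` before the first one.
    stalled_sub_roots: Option<Vec<u32>>,
    /// Roots captured eagerly at registration time.
    initial_sub_roots: Vec<u32>,
}

#[derive(Default)]
struct Storage {
    pending: Vec<Pending>,
}

impl Storage {
    /// Register an obligation together with the variables in its predicate.
    fn register(&mut self, name: &'static str, vars_in_predicate: &[u32]) {
        self.pending.push(Pending {
            name,
            stalled_sub_roots: None,
            initial_sub_roots: vars_in_predicate.to_vec(),
        });
    }

    /// Prefer the roots from the last evaluation; otherwise fall back to the
    /// roots captured at registration instead of accepting everything.
    fn potentially_referencing(&self, root: u32) -> Vec<&'static str> {
        self.pending
            .iter()
            .filter(|p| {
                let roots = match &p.stalled_sub_roots {
                    Some(roots) => &roots[..],
                    None => &p.initial_sub_roots[..],
                };
                roots.contains(&root)
            })
            .map(|p| p.name)
            .collect()
    }
}

/// Two unevaluated obligations; only the one mentioning variable 0 survives.
fn eager_init_demo() -> Vec<&'static str> {
    let mut storage = Storage::default();
    storage.register("?0: Clone", &[0]);
    storage.register("?1: Debug", &[1]);
    storage.potentially_referencing(0)
}
```

Compared with treating every unevaluated obligation as relevant, the fallback lets the filter discard `?1: Debug` even though it has never been evaluated.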

Local test status (stage1, aarch64-apple-darwin):

  • tests/ui next-solver: 336/337 (1 ignored, 0 fail)
  • tests/ui traits/: 1339/1341 (2 ignored, 0 fail)
  • No regression on lcnr's hang case from the cycles thread (same E0275 as baseline).

Caveats I'm aware of:

  • Adds a TypeVisitor walk per register call. Cheap on shape — shallow_resolve only, no canonicalization — but unmeasured.
  • Same staleness window as the existing filter for evaluated obligations: sub_root unifications between register and the filter call could still drop relevant obligations. No worse, no better.

No perf numbers from me — wg-grammar would be the natural target but I don't have the setup. Happy to fold this into your PR (you keep authorship of the original two commits + add my one on top), land it as a follow-up after this merges, or just close the loop here. Whatever's least friction for you.

@lcnr
Contributor Author

lcnr commented May 5, 2026

I've been thinking of this PR again as you've done the change to f32 #154758 and planned to look into this anyways.

If stalled_on is None, we just always treat it as potentially referencing sub_root.

If stalled_on is Some, we don't look into sub_roots but instead recompute whether things are sub-related by walking over the stalled_vars. This means we can never accidentally miss sub_roots, especially as we can gate this behavior behind #156172.
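A toy model of this alternative, again only as a hedged sketch: `SubRoots` is a hypothetical union-find standing in for the sub-unification table, and `Pending::stalled_vars` stands in for the variables recorded in `stalled_on`. The point is that roots are looked up at query time rather than read from a cache.

```rust
/// Toy union-find standing in for the sub-unification table (hypothetical).
struct SubRoots {
    parent: Vec<usize>,
}

impl SubRoots {
    fn new(n: usize) -> Self {
        SubRoots { parent: (0..n).collect() }
    }

    /// Follow parent links until reaching a representative.
    fn root(&self, mut v: usize) -> usize {
        while self.parent[v] != v {
            v = self.parent[v];
        }
        v
    }

    /// Merge the classes of `a` and `b`; `b`'s root becomes the representative.
    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.root(a), self.root(b));
        self.parent[ra] = rb;
    }
}

/// Toy pending obligation (hypothetical).
struct Pending {
    name: &'static str,
    /// Variables the obligation stalled on; `None` if never evaluated.
    stalled_vars: Option<Vec<usize>>,
}

/// Unevaluated obligations always pass. Evaluated ones pass if any stalled
/// variable currently shares a root with `var`, recomputed on every query.
fn potentially_referencing(
    table: &SubRoots,
    pending: &[Pending],
    var: usize,
) -> Vec<&'static str> {
    let wanted = table.root(var);
    pending
        .iter()
        .filter(|p| match &p.stalled_vars {
            None => true,
            Some(vars) => vars.iter().any(|&v| table.root(v) == wanted),
        })
        .map(|p| p.name)
        .collect()
}

/// Relates variables 0 and 1 after "evaluation", then queries via variable 1.
fn recompute_demo() -> Vec<&'static str> {
    let mut table = SubRoots::new(3);
    let pending = vec![
        Pending { name: "?0: Trait", stalled_vars: Some(vec![0]) },
        Pending { name: "?2: Other", stalled_vars: Some(vec![2]) },
        Pending { name: "unevaluated", stalled_vars: None },
    ];
    table.union(0, 1);
    potentially_referencing(&table, &pending, 1)
}
```

Because the comparison uses the current roots, the obligation stalled on variable 0 is still found after variables 0 and 1 are related, at the cost of one root lookup per stalled variable per query.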

I think the rest of this PR is just good to go. Would you be up for taking this over and opening it as a new PR? 🤔

@inq
Contributor

inq commented May 5, 2026

Thank you very much! I'm happy to take over anything you need!

Sorry for my bad English, but can I check my understanding?

if stalled_on is Some: ignore sub_roots -> walk stalled_vars -> re-evaluate sub_root
else: same as this PR

plus: should I wait for merging #156172, or make a PR depend on it?

@ShoyuVanilla
Member

> plus: should I wait for merging #156172, or make a PR depend on it?

That PR only introduces a permanently unstable compiler flag, so it's probably only relevant if you're planning to add tests with something like //@ compile-flags: -Zdisable-fast-paths, to make sure that we never accidentally miss sub_roots.

@inq
Contributor

inq commented May 5, 2026

@ShoyuVanilla Thank you very much! It's very clear!

@lcnr
Contributor Author

lcnr commented May 5, 2026

closing in favor of #156187

@lcnr lcnr closed this May 5, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label May 5, 2026
Labels

T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-trait-system-refactor The Rustc Trait System Refactor Initiative (-Znext-solver)

6 participants