[Branch Hints] Add a utility to compare with metadata, and use it in merging opt passes #7733

kripken · 2025-07-16T22:38:16Z

Several passes merge code, and it is dangerous to merge identical code
with different branch hints (or other code annotations): the different
parts of code may intentionally have different hints, e.g. an if that checks
a condition and then runs the same code, knowing that if the condition
was true the hints go one way, and vice versa.

tlively · 2025-07-17T04:37:36Z

test/lit/passes/code-folding_branch-hints.wast

+ ;; CHECK-NEXT:  (f32.const 0)
+ ;; CHECK-NEXT: )
+ (func $different (param $x i32) (param $y i32) (result f32)
+  ;; The branch hints differ, so we do not optimize.


It might be better to merge and drop the hint in this case. If the only benefit of branch hints is that cold code can be placed far away, then surely just deduplicating the cold code into some warmer code is at least as good.

But the issue may be cold code inside each of the arms. Imagine that under some condition, matching$x, the first arm runs 30% faster (because the internal if is almost never entered, depending on $y), and that in the reverse condition, the second arm runs 30% faster (because the internal if is almost always entered). Merging the arms and removing the branch hint would make us 30% slower.

But wouldn't the hypothetical speedups of 30% be achieved via better code cache locality when the cold blocks are split out? Then merging the code would achieve the same speedups.

Maybe the simple testcase here is misleading - imagine there are loops in each of the arms here, so we spend a lot of time in one arm, and the hints matter.

The problem is that which blocks are cold depends on what data we are running on:

In more detail, imagine we have a function calc_mostly_0 which processes some huge buffer of data, and assumes most of the code is 0. We would want a hint there that whenever it checks if an item is 0, that is likely.

And imagine we also have calc_mostly_1 which processes similar data, but assumes most of the data is 1. The hint there would say items of 0 are unlikely, the reverse of before.

Now assume those two functions are identical in all but the branch hints. Then they get compiled to very different machine code: in the first, blocks that handle 1 are cold and moved out, while in the latter, blocks that handle 0 are cold and moved out.

And, imagine we can tell what the data properties are, at runtime: perhaps we know that data from certain sources is mostly 0s, and from others, mostly 1s, so we might do

if (probably_mostly_0) { calc_mostly_0(); } else { calc_mostly_1(); }

Merging code here, or even just removing the branch hints, would be slower.

This is a bit unlikely to happen in reality - likely there would be other differences somewhere - but it is realistic, I think, to collect separate PGO profiles and compile multiple versions of a function, if you have a way to know which profile is more relevant. Like imagine a physics engine can be tuned for many small objects or a few big and complicated ones.

Right, if you merge the functions in that situation, then the merged function has more co-located code than either of the unmerged functions would have. But there would be less code overall, so potentially less code cache pressure and better performance, although that would depend on the placement of the functions and the behavior of the cache. I just don't think there's as clear a benefit to keeping the functions separate as you are arguing, and allowing optimization hints to inhibit optimizations seems like a priority inversion to me.

There is more code overall without merging, but the hot-loop parts would remain hot loops regardless of how much code is around them. We have to assume some effect of locality like that, I think? Otherwise, if I follow your logic, we'd need to consider erasing hints after inlining (as the extra code might render them ineffective), but nothing I have read suggests that.

Also, not merging code here is necessary to preserve the fuzzing invariant that optimizations do not cause bad branch hints: If we merge, we must pick one hint and it might be wrong (causing large slowdowns). Or, if we merge and erase the hints, we may also cause a slowdown.

(I do think maybe when optimizing for size, that we may want to always merge, though.)

After talking offline, it sounds like we just had different expectations about how useful the hints would be and therefore how hard we should try to preserve them. I hadn't been considering register allocation effects in particular. This approach sgtm, especially if it turns out that LLVM makes similar decisions.

Hmm, you raised a very good point with following LLVM. I had trouble finding a clear answer in the source, but here is a testcase with a select:

define i32 @simple_hoist(i32 %n, i1 %cond, i1 %cond2) { entry: br i1 %cond, label %then.block, label %else.block then.block: %temp1 = select i1 %cond2, i32 10, i32 -1, !prof !1 %val_then = add i32 %n, %temp1 br label %merge.block else.block: %temp2 = select i1 %cond2, i32 10, i32 -1, !prof !2 %val_else = add i32 %n, %temp2 br label %merge.block merge.block: %res = phi i32 [ %val_then, %then.block ], [ %val_else, %else.block ] ret i32 %res } !1 = !{!"branch_weights", i32 2000, i32 1} !2 = !{!"branch_weights", i32 1, i32 2000}

and here is one with an if:

define i32 @simple_hoist(i32 %n, i1 %cond, i1 %cond2) { entry: br i1 %cond, label %then.block, label %else.block then.block: %val_then = add i32 %n, 20 %cmp.then = icmp sgt i32 %val_then, 100 br i1 %cmp.then, label %ret.a, label %ret.b, !prof !1 else.block: %val_else = add i32 %n, 20 %cmp.else = icmp sgt i32 %val_else, 100 br i1 %cmp.else, label %ret.a, label %ret.b, !prof !2 ret.a: %res1 = phi i32 [ 42, %then.block ], [ 1337, %else.block ] ret i32 %res1 ret.b: %res2 = phi i32 [ 43, %then.block ], [ 1338, %else.block ] ret i32 %res2 } !1 = !{!"branch_weights", i32 2000, i32 1} !2 = !{!"branch_weights", i32 1, i32 2000}

In both cases LLVM is perfectly happy to pick one of the two arbitrarily. It makes no effort to preserve the hint.

I am surprised by this but I think we can trust LLVM on it, so let's not land this, and I'll also revert #7715 . I will also need to figure out some other mechanism for how to fuzz this.

Is the plan to drop the hint on merging? If so, I wouldn't expect that to need special fuzzer support. Or is the plan to pick one arbitrarily and keep its hint?

I'm considering a few possible plans for the fuzzer, either skipping relevant passes, or a flag.

Yes, I think picking one hint arbitrarily, as LLVM does, is good enough - it's simplest and presumably LLVM found it ok to do (maybe even optimal).

Our comparisons are only used in optimization atm, and we decided not to consider metadata when merging functions, as that is what LLVM does: it will fold code together without carefully checking if profiling info matches, and without even trying to fix up that profiling info later. For more on LLVM, see discussion in #7733, but basically LLVM will just merge two instructions with different branch_weights and keep the first of the two, no matter what.

work

d650547

kripken requested a review from tlively July 16, 2025 22:38

tlively reviewed Jul 17, 2025

View reviewed changes

clarify in testcase, actually use

84f7949

tlively approved these changes Jul 17, 2025

View reviewed changes

kripken closed this Jul 18, 2025

kripken deleted the eq.meta branch July 18, 2025 00:31

kripken mentioned this pull request Jul 21, 2025

Revert part of #7715: Ignore metadata in function comparisons #7737

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Branch Hints] Add a utility to compare with metadata, and use it in merging opt passes #7733

[Branch Hints] Add a utility to compare with metadata, and use it in merging opt passes #7733

Uh oh!

kripken commented Jul 16, 2025

Uh oh!

tlively Jul 17, 2025

Uh oh!

kripken Jul 17, 2025

Uh oh!

tlively Jul 17, 2025

Uh oh!

kripken Jul 17, 2025

Uh oh!

tlively Jul 17, 2025

Uh oh!

kripken Jul 17, 2025

Uh oh!

tlively Jul 17, 2025

Uh oh!

kripken Jul 18, 2025

Uh oh!

tlively Jul 18, 2025

Uh oh!

kripken Jul 21, 2025

Uh oh!

Uh oh!

[Branch Hints] Add a utility to compare with metadata, and use it in merging opt passes #7733

[Branch Hints] Add a utility to compare with metadata, and use it in merging opt passes #7733

Uh oh!

Conversation

kripken commented Jul 16, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!