-
Notifications
You must be signed in to change notification settings - Fork 5.1k
JIT: save pgo data in inline context, use it for call optimization #116241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fix dotnet#116223 Save profile data (schema, schema size, and data) in the inline context. Always store the inline context in GT_CALL nodes. Use this to find the correct PGO data to consult when optimizing a call (in particular, for late cast expansions).
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances PGO-driven inlining by storing per-method profile data in the inline context and leveraging it during call optimization (especially for late cast expansions), and also refines dump logging.
- Introduce
PgoInfo
and store PGO data insideInlineContext
- Propagate PGO info into calls and update
pickGDV
to consume it with a newverboseLogging
flag - Enable dumping of PGO-driven type guesses in
gtDispTree
for cast helpers
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
src/coreclr/jit/inline.h | Define PgoInfo and add Get/Set/HasPgoInfo to InlineContext |
src/coreclr/jit/inline.cpp | Implement PgoInfo constructors |
src/coreclr/jit/importercalls.cpp | Use InlineContext PGO data in pickGDV and add verboseLogging |
src/coreclr/jit/helperexpansion.cpp | Switch cast-expansion arrays from MAX_CAST_GUESSES to MAX_GDV_TYPE_CHECKS |
src/coreclr/jit/gentree.h | Make gtInlineContext unconditionally available |
src/coreclr/jit/gentree.cpp | Assign gtInlineContext on new/clone calls and dump PGO data for cast helpers |
src/coreclr/jit/compiler.h | Add verboseLogging parameter (default true ) to pickGDV signature |
src/coreclr/jit/compiler.cpp | Call SetPgoInfo on compInlineContext during initialization |
Comments suppressed due to low confidence (2)
src/coreclr/jit/helperexpansion.cpp:2160
- The call to
pickGDV
is missing the newverboseLogging
argument, leading to a signature mismatch. Add abool verboseLogging
parameter (e.g.,true
orfalse
) to match the updated signature.
comp->pickGDV(castHelper, castHelper->gtCastHelperILOffset, false, likelyClasses, nullptr, &likelyClassCount, likelyLikelihoods);
src/coreclr/jit/compiler.cpp:2557
- Add tests to verify that
SetPgoInfo
correctly stores PGO schema, count, and data, and thatHasPgoInfo
/GetPgoInfo
return the expected values in inlined contexts.
compInlineContext->SetPgoInfo(PgoInfo(this));
@@ -870,6 +885,21 @@ class InlineContext | |||
} | |||
#endif | |||
|
|||
const PgoInfo& GetPgoInfo() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider adding a brief doc comment explaining what GetPgoInfo
, SetPgoInfo
, and HasPgoInfo
do, and how PgoInfo
is intended to be used within inlining.
Copilot uses AI. Check for mistakes.
@@ -7529,7 +7529,8 @@ class Compiler | |||
CORINFO_CLASS_HANDLE* classGuesses, | |||
CORINFO_METHOD_HANDLE* methodGuesses, | |||
int* candidatesCount, | |||
unsigned* likelihoods); | |||
unsigned* likelihoods, | |||
bool verboseLogging = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The pickGDV
parameter list is growing; consider grouping related flags (like verboseLogging
) and PGO inputs into a struct to simplify calls and reduce maintenance risk.
Copilot uses AI. Check for mistakes.
@dotnet/jit-contrib FYI Makes GT_CALL nodes bigger. Should fix a number of inlining-related perf issues, like #113913. Reasonble number of diffs. For one benchmark there I have locally
|
@EgorBot -amd -arm using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
BenchmarkSwitcher.FromAssembly(typeof(Benchmarks).Assembly).Run(args);
public class Benchmarks
{
int class1 = 0;
int class2 = 0;
BaseClass b1 = new DerivedClass1();
BaseClass b2 = new DerivedClass2();
[Benchmark]
public void Bench()
{
for (int i = 0; i < 10000; i++)
Problem(i);
}
[MethodImpl(MethodImplOptions.NoInlining)]
public void Problem(int i)
{
BaseClass b = i % 3 == 0 ? b1 : b2;
if (b is DerivedClass1)
class1++;
else if (IsDerivedClass2(b))
class2++;
}
public bool IsDerivedClass2(BaseClass b) => b is DerivedClass2;
public class BaseClass { }
public class DerivedClass1 : BaseClass { }
public class DerivedClass2 : BaseClass { }
} |
hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376 It looks like the casts no longer expand with PGO at all there |
Odd. It was working ok for me locally. Let me double-check. |
It works with checked builds -- there is a misplaced #endif preventing profile capture. |
@EgorBot -amd -arm using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
BenchmarkSwitcher.FromAssembly(typeof(Benchmarks).Assembly).Run(args);
public class Benchmarks
{
int class1 = 0;
int class2 = 0;
BaseClass b1 = new DerivedClass1();
BaseClass b2 = new DerivedClass2();
[Benchmark]
public void Bench()
{
for (int i = 0; i < 10000; i++)
Problem(i);
}
[MethodImpl(MethodImplOptions.NoInlining)]
public void Problem(int i)
{
BaseClass b = i % 3 == 0 ? b1 : b2;
if (b is DerivedClass1)
class1++;
else if (IsDerivedClass2(b))
class2++;
}
public bool IsDerivedClass2(BaseClass b) => b is DerivedClass2;
public class BaseClass { }
public class DerivedClass1 : BaseClass { }
public class DerivedClass2 : BaseClass { }
} |
Results look better now BenchmarkDotNet v0.14.0, Ubuntu 24.04.2 LTS (Noble Numbat)
|
/ba-g unrelated failure with no helix log |
Fix #116223
Save profile data (schema, schema size, and data) in the inline context.
Always store the inline context in GT_CALL nodes. Use this to find
the correct PGO data to consult when optimizing a call (in particular,
for late cast expansions).
Also pick up the dumping improvements from #116231.