JIT: save pgo data in inline context, use it for call optimization #116241

Merged 3 commits into dotnet:main on Jun 3, 2025

Conversation

AndyAyersMS
Member

Fix #116223

Save profile data (schema, schema size, and data) in the inline context.
Always store the inline context in GT_CALL nodes. Use this to find
the correct PGO data to consult when optimizing a call (in particular,
for late cast expansions).
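The idea in the description can be sketched roughly as follows. This is a minimal illustration, not the JIT's actual code: the field names, `HasData`, `CallNode`, and `FindPgoForCall` are assumptions made up for the example, and only the general shape (profile schema, schema size, and data saved per inline context, with every call node pointing at its context) comes from the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; the real JIT types carry much more state.
struct PgoInfo
{
    const void*    schema      = nullptr; // layout of the profile records
    unsigned       schemaCount = 0;       // number of schema entries
    const uint8_t* data        = nullptr; // raw profile payload

    bool HasData() const
    {
        return (schema != nullptr) && (schemaCount > 0) && (data != nullptr);
    }
};

struct InlineContext
{
    InlineContext* parent = nullptr; // context this method was inlined into
    PgoInfo        pgo;              // profile of the method this context represents
};

struct CallNode
{
    InlineContext* inlineContext = nullptr; // recorded on every call node
};

// Pick the profile of the method the call was imported from, not the root method's.
const PgoInfo* FindPgoForCall(const CallNode& call)
{
    if ((call.inlineContext != nullptr) && call.inlineContext->pgo.HasData())
    {
        return &call.inlineContext->pgo;
    }
    return nullptr; // no profile available, so the caller should not guess
}

// Usage: a call that came from an inlinee resolves to the inlinee's profile.
bool CallFromInlineeUsesInlineeProfile()
{
    static const int     schemaTag     = 0;
    static const uint8_t rootData[]    = {1};
    static const uint8_t inlineeData[] = {2};

    InlineContext root;
    root.pgo.schema      = &schemaTag;
    root.pgo.schemaCount = 1;
    root.pgo.data        = rootData;

    InlineContext inlinee;
    inlinee.parent          = &root;
    inlinee.pgo.schema      = &schemaTag;
    inlinee.pgo.schemaCount = 1;
    inlinee.pgo.data        = inlineeData;

    CallNode castHelperCall;
    castHelperCall.inlineContext = &inlinee;

    const PgoInfo* found = FindPgoForCall(castHelperCall);
    assert(found != nullptr);
    return found->data == inlineeData;
}
```

The point of the sketch is that a late phase no longer needs to assume the root method's profile applies; it follows the call's own context instead.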

Also pick up the dumping improvements from #116231.

AndyAyersMS and others added 2 commits June 2, 2025 19:42
@Copilot Copilot AI review requested due to automatic review settings June 3, 2025 02:44
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 3, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances PGO-driven inlining by storing per-method profile data in the inline context and leveraging it during call optimization (especially for late cast expansions), and also refines dump logging.

  • Introduce PgoInfo and store PGO data inside InlineContext
  • Propagate PGO info into calls and update pickGDV to consume it with a new verboseLogging flag
  • Enable dumping of PGO-driven type guesses in gtDispTree for cast helpers

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/inline.h | Define PgoInfo and add Get/Set/HasPgoInfo to InlineContext |
| src/coreclr/jit/inline.cpp | Implement PgoInfo constructors |
| src/coreclr/jit/importercalls.cpp | Use InlineContext PGO data in pickGDV and add verboseLogging |
| src/coreclr/jit/helperexpansion.cpp | Switch cast-expansion arrays from MAX_CAST_GUESSES to MAX_GDV_TYPE_CHECKS |
| src/coreclr/jit/gentree.h | Make gtInlineContext unconditionally available |
| src/coreclr/jit/gentree.cpp | Assign gtInlineContext on new/clone calls and dump PGO data for cast helpers |
| src/coreclr/jit/compiler.h | Add verboseLogging parameter (default true) to pickGDV signature |
| src/coreclr/jit/compiler.cpp | Call SetPgoInfo on compInlineContext during initialization |
Comments suppressed due to low confidence (2)

src/coreclr/jit/helperexpansion.cpp:2160

  • The call to pickGDV is missing the new verboseLogging argument, leading to a signature mismatch. Add a bool verboseLogging parameter (e.g., true or false) to match the updated signature.
comp->pickGDV(castHelper, castHelper->gtCastHelperILOffset, false, likelyClasses, nullptr, &likelyClassCount, likelyLikelihoods);

src/coreclr/jit/compiler.cpp:2557

  • Add tests to verify that SetPgoInfo correctly stores PGO schema, count, and data, and that HasPgoInfo/GetPgoInfo return the expected values in inlined contexts.
compInlineContext->SetPgoInfo(PgoInfo(this));

@@ -870,6 +885,21 @@ class InlineContext
}
#endif

const PgoInfo& GetPgoInfo()

Copilot AI Jun 3, 2025

[nitpick] Consider adding a brief doc comment explaining what GetPgoInfo, SetPgoInfo, and HasPgoInfo do, and how PgoInfo is intended to be used within inlining.


@@ -7529,7 +7529,8 @@ class Compiler
 CORINFO_CLASS_HANDLE* classGuesses,
 CORINFO_METHOD_HANDLE* methodGuesses,
 int* candidatesCount,
-unsigned* likelihoods);
+unsigned* likelihoods,
+bool verboseLogging = true);

Copilot AI Jun 3, 2025

[nitpick] The pickGDV parameter list is growing; consider grouping related flags (like verboseLogging) and PGO inputs into a struct to simplify calls and reduce maintenance risk.
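One way to read this suggestion is sketched below. It is only an illustration of the "group flags into a struct" pattern: `GuessOptions`, `GuessResult`, `PickGuesses`, and `kMaxGuesses` are hypothetical names invented here, and the toy likelihood computation is not what pickGDV does.

```cpp
#include <cassert>
#include <cstdio>

const int kMaxGuesses = 3; // assumption for this sketch only

// Hypothetical bundle of inputs that would otherwise be separate parameters.
struct GuessOptions
{
    unsigned ilOffset       = 0;
    bool     isInterface    = false;
    bool     verboseLogging = true; // default mirrors the new parameter's default
};

// Hypothetical bundle of outputs that would otherwise be separate out-pointers.
struct GuessResult
{
    int      count                    = 0;
    unsigned likelihoods[kMaxGuesses] = {};
};

// Toy stand-in: convert raw hit counts into integer percentage likelihoods.
GuessResult PickGuesses(const unsigned* hitCounts, int numCounts, const GuessOptions& options)
{
    GuessResult result;

    unsigned total = 0;
    for (int i = 0; i < numCounts; i++)
    {
        total += hitCounts[i];
    }
    if (total == 0)
    {
        return result; // no samples, no guesses
    }

    for (int i = 0; (i < numCounts) && (result.count < kMaxGuesses); i++)
    {
        unsigned percent = (hitCounts[i] * 100) / total;
        result.likelihoods[result.count++] = percent;
        if (options.verboseLogging)
        {
            std::printf("IL offset %u: candidate %d likelihood %u%%\n", options.ilOffset, i, percent);
        }
    }
    return result;
}
```

A caller that wants quiet behavior sets `options.verboseLogging = false` once, and adding another flag later does not touch any call site.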


@AndyAyersMS
Member Author

@dotnet/jit-contrib FYI
@EgorBo PTAL

Makes GT_CALL nodes bigger. Should fix a number of inlining-related perf issues, like #113913. Reasonable number of diffs.

For one benchmark from that issue, I see the following locally:

| Method | Job | Toolchain | input | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Allocated | Alloc Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LastWithPredicate_FirstElementMatches | Job-CLLPQM | No inline w/EH | IList | 5.086 ns | 0.1165 ns | 0.1090 ns | 5.024 ns | 4.987 ns | 5.269 ns | 1.00 | 0.03 | - | NA |
| LastWithPredicate_FirstElementMatches | Job-COLTNK | Inline w/EH | IList | 6.779 ns | 0.0913 ns | 0.0854 ns | 6.807 ns | 6.635 ns | 6.904 ns | 1.33 | 0.03 | - | NA |
| LastWithPredicate_FirstElementMatches | Job-BEDPPC | This PR | IList | 3.291 ns | 0.0617 ns | 0.0577 ns | 3.289 ns | 3.203 ns | 3.419 ns | 0.65 | 0.02 | - | NA |

@EgorBo
Member

EgorBo commented Jun 3, 2025

@EgorBot -amd -arm

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Benchmarks).Assembly).Run(args);

public class Benchmarks
{
    int class1 = 0;
    int class2 = 0;

    BaseClass b1 = new DerivedClass1();
    BaseClass b2 = new DerivedClass2();

    [Benchmark]
    public void Bench()
    {
        for (int i = 0; i < 10000; i++)
            Problem(i);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Problem(int i)
    {
        BaseClass b = i % 3 == 0 ? b1 : b2;
        if (b is DerivedClass1)
            class1++;
        else if (IsDerivedClass2(b))
            class2++;
    }

    public bool IsDerivedClass2(BaseClass b) => b is DerivedClass2;

    public class BaseClass { }
    public class DerivedClass1 : BaseClass { }
    public class DerivedClass2 : BaseClass { }
}

@EgorBo
Member

EgorBo commented Jun 3, 2025

hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376

It looks like the casts no longer expand with PGO at all there
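For readers unfamiliar with the term, "expanding a cast with PGO" roughly means replacing a call to the general cast helper with a cheap exact-type check against the class the profile saw most often, keeping the helper as a fallback. The sketch below shows that shape under stated assumptions: `TypeDesc`, `Object`, and both function names are hypothetical, and the real expansion operates on JIT IR rather than on C++ objects.

```cpp
#include <cassert>

// Hypothetical miniature object model; not the runtime's real layout.
struct TypeDesc
{
    const TypeDesc* base; // nullptr for the root type
};

struct Object
{
    const TypeDesc* type; // exact runtime type of the object
};

// General path: walk the base-class chain (stands in for the runtime cast helper).
bool IsInstanceOfHelper(const Object* obj, const TypeDesc* target)
{
    if (obj == nullptr)
    {
        return false;
    }
    for (const TypeDesc* t = obj->type; t != nullptr; t = t->base)
    {
        if (t == target)
        {
            return true;
        }
    }
    return false;
}

// Expanded path: one pointer compare against the profile's likely exact type.
// likelyTypeCasts is whether likelyType is castable to target, which a compiler
// can work out ahead of time.
bool IsInstanceOfExpanded(const Object* obj, const TypeDesc* target,
                          const TypeDesc* likelyType, bool likelyTypeCasts)
{
    if (obj == nullptr)
    {
        return false;
    }
    if (obj->type == likelyType)
    {
        return likelyTypeCasts; // fast path taken when the profile guessed right
    }
    return IsInstanceOfHelper(obj, target); // slow path keeps the result correct
}

// Usage: the expanded form agrees with the helper whether or not the guess hits.
bool ExpandedMatchesHelper()
{
    static const TypeDesc baseType = {nullptr};
    static const TypeDesc derived1 = {&baseType};
    static const TypeDesc derived2 = {&baseType};

    const Object o1 = {&derived1};
    const Object o2 = {&derived2};

    // Profile guess: objects checked against derived2 are usually exactly derived2.
    bool hit  = IsInstanceOfExpanded(&o2, &derived2, &derived2, true);
    bool miss = IsInstanceOfExpanded(&o1, &derived2, &derived2, true);

    return (hit == IsInstanceOfHelper(&o2, &derived2)) &&
           (miss == IsInstanceOfHelper(&o1, &derived2));
}
```

Without profile data for the call there is no likely type to guard on, so only the helper path remains, which is consistent with the regression described here.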

@AndyAyersMS
Member Author

> hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376
>
> It looks like the casts no longer expand with PGO at all there

Odd. It was working ok for me locally. Let me double-check.

@AndyAyersMS
Member Author

AndyAyersMS commented Jun 3, 2025

> > hm.. @AndyAyersMS any idea why your initial repro snippet regresses with your PR? EgorBot/runtime-utils#376
> > It looks like the casts no longer expand with PGO at all there
>
> Odd. It was working ok for me locally. Let me double-check.

It works with checked builds -- there is a misplaced #endif preventing profile capture.
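The class of bug described here can be illustrated with a toy example. This is a sketch of the general pattern only: `TOY_CHECKED_BUILD`, `Context`, `InitBuggy`, and `InitFixed` are hypothetical names, and the actual location of the misplaced `#endif` in the JIT is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical state; hasProfile stands in for "profile data was captured".
struct Context
{
    const char* debugName  = "unnamed";
    bool        hasProfile = false;
};

// Buggy shape: the capture sits above the #endif, so it only compiles into
// builds that define TOY_CHECKED_BUILD and silently disappears elsewhere.
void InitBuggy(Context& ctx, const char* name)
{
#ifdef TOY_CHECKED_BUILD
    ctx.debugName  = name;
    ctx.hasProfile = true; // meant for every build flavor
#endif
    (void)name; // avoid an unused-parameter warning when the block is compiled out
}

// Fixed shape: only the debug-only bookkeeping stays inside the conditional.
void InitFixed(Context& ctx, const char* name)
{
#ifdef TOY_CHECKED_BUILD
    ctx.debugName = name;
#endif
    (void)name;
    ctx.hasProfile = true; // now runs in every build flavor
}
```

Compiled without `TOY_CHECKED_BUILD`, `InitBuggy` leaves `hasProfile` false while `InitFixed` sets it, which matches the symptom of a change that works in checked builds but not in release builds.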

@AndyAyersMS
Member Author

@EgorBot -amd -arm

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Benchmarks).Assembly).Run(args);

public class Benchmarks
{
    int class1 = 0;
    int class2 = 0;

    BaseClass b1 = new DerivedClass1();
    BaseClass b2 = new DerivedClass2();

    [Benchmark]
    public void Bench()
    {
        for (int i = 0; i < 10000; i++)
            Problem(i);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Problem(int i)
    {
        BaseClass b = i % 3 == 0 ? b1 : b2;
        if (b is DerivedClass1)
            class1++;
        else if (IsDerivedClass2(b))
            class2++;
    }

    public bool IsDerivedClass2(BaseClass b) => b is DerivedClass2;

    public class BaseClass { }
    public class DerivedClass1 : BaseClass { }
    public class DerivedClass2 : BaseClass { }
}

@AndyAyersMS
Member Author

Results look better now

BenchmarkDotNet v0.14.0, Ubuntu 24.04.2 LTS (Noble Numbat)
Unknown processor
.NET SDK 10.0.100-preview.6.25303.101
[Host] : .NET 9.0.5 (9.0.525.21509), Arm64 RyuJIT AdvSIMD
Job-GQRFWP : .NET 10.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD
Job-KPSXLT : .NET 10.0.0 (42.42.42.42424), Arm64 RyuJIT AdvSIMD

| Method | Toolchain | Mean | Error | Ratio |
| --- | --- | --- | --- | --- |
| Bench | Main | 36.91 us | 0.018 us | 1.00 |
| Bench | PR | 32.48 us | 0.010 us | 0.88 |

@AndyAyersMS
Member Author

/ba-g unrelated failure with no helix log

@AndyAyersMS AndyAyersMS merged commit 30082a4 into dotnet:main Jun 3, 2025
107 of 109 checks passed
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Development

Successfully merging this pull request may close these issues.

JIT: we lose profile info for inlinee casts
2 participants