-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Delete FEATURE_DOUBLE_ALIGNMENT_HINT for GC heap allocations #115985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR removes the FEATURE_DOUBLE_ALIGNMENT_HINT code related to optimizing GC heap allocations for arrays of doubles, since modern x86 hardware no longer suffers from significant performance penalties due to misaligned memory accesses.
- Removed FEATURE_DOUBLE_ALIGNMENT_HINT blocks in helper functions for GC memory allocations.
- Removed configuration settings and related macros associated with the double alignment hint.
- Updated comments to reflect the new behavior for double alignment.
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
src/coreclr/vm/jitinterface.cpp | Removed alignment hint adjustments in GC allocation helpers. |
src/coreclr/vm/i386/jitinterfacex86.cpp | Removed threshold adjustments for double arrays for large object heap. |
src/coreclr/vm/gchelpers.cpp | Deleted FEATURE_DOUBLE_ALIGNMENT_HINT conditions in allocation routines. |
src/coreclr/vm/eeconfig.h | Removed configuration getters and members for the double alignment feature. |
src/coreclr/vm/eeconfig.cpp | Deleted initialization and configuration code for double alignment. |
src/coreclr/inc/switches.h | Updated comments to reflect changes in double alignment usage. |
src/coreclr/inc/clrconfigvalues.h | Removed configuration value for double array to large object heap mapping. |
Comments suppressed due to low confidence (1)
src/coreclr/inc/switches.h:154
- The updated comment now focuses on double alignment for structs on the stack, which differs from the previous emphasis on arrays of doubles. Please verify that this comment accurately reflects the intended behavior after removing FEATURE_DOUBLE_ALIGNMENT_HINT.
// Prefer double alignment for structs with doubles on the stack.
See discussion at EgorBot/runtime-utils#356 (comment)
|
cc @filipnavara |
@EgorBot -windows_intel -use32bit using BenchmarkDotNet.Attributes;
public class Bench
{
[Params(16, 32, 64, 128)]
public int Size;
[Benchmark]
public double[] AllocDoubleArray() => new double[Size];
[Benchmark]
public double[] AllocDoubleArrayFixed() => new double[32];
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -151,8 +151,7 @@ | |||
#define FEATURE_64BIT_ALIGNMENT | |||
#endif | |||
|
|||
// Prefer double alignment for structs and arrays with doubles. Put arrays of doubles more agressively | |||
// into large object heap for performance because large object heap is 8 byte aligned | |||
// Prefer double alignment for structs with doubles on the stack. | |||
#if !defined(FEATURE_64BIT_ALIGNMENT) && !defined(HOST_64BIT) | |||
#define FEATURE_DOUBLE_ALIGNMENT_HINT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other usage is in
runtime/src/coreclr/vm/classlayoutinfo.cpp
Line 494 in cbd7849
if (ShouldAlign8(numR8Fields, numInstanceFields)) |
which can be inlined:
if (numR8Fields * 2 > numInstanceFields && numR8Fields >= 2)
(only usage of ShouldAlign8)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can be inlined but it won't make the code easier to understand. This heuristic is duplicated in AOT compilers in method with the same name:
private static bool ShouldAlign8(int dwR8Fields, int dwTotalFields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was FEATURE_DOUBLE_ALIGNMENT_HINT
can be replaced by defined(FEATURE_64BIT_ALIGNMENT) || defined(TARGET_32BIT)
since the remaining two usages have nothing to do with double
alignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remaining usages have to do with double alignment on stack. I have updated the comment on FEATURE_DOUBLE_ALIGNMENT_HINT
to reflected that and mentioned in the comment above that we can evaluate deleting that separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
We may have outerloop tests that test this such as: runtime/src/tests/JIT/Methodical/doublearray/dblarray1.cs Lines 4 to 11 in 1548409
Searching the codebase for 101284 may find those places because we disable these for native AOT and that's the issue number for it. |
/azp run runtime-coreclr outerloop |
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Isn't this still required for correctness on ARM32? |
ARM32 [and WASM] uses a different define - |
Ah okay.
Wasn't there an issue a few months ago where an Intel engineer confirmed that unaligned atomics are very slow on X86 for 64bit |
This PR specifically affects the |
Yes, that's correct. However, |
/ba-g infrastructure timeouts and known issues |
Thanks! |
This optimization is no longer relevant on current x86 hardware. The performance penalty for misaligned memory access is negligible compared to what it used to be.
Fixes #101284