-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Update fgMorphHWIntrinsic to more closely follow fgMorphSmpOp #116892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
64c9908
to
faae0bb
Compare
faae0bb
to
c375121
Compare
c375121
to
f07bbf9
Compare
c06c09c
to
c699047
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances fgMorphHWIntrinsic
by introducing an optional “effective operation” mode in GetOperForHWIntrinsicId
—bringing its behavior in line with fgMorphSmpOp
—and splits the morphing logic into required and optional paths in the compiler.
- Changed the
GetOperForHWIntrinsicId
signature ingentree.h
and implemented thegetEffectiveOp
logic ingentree.cpp
. - Added
fgMorphHWIntrinsicRequired
andfgMorphHWIntrinsicOptional
declarations incompiler.h
. - Removed the old single-parameter overload for
GetOperForHWIntrinsicId
.
Reviewed Changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/coreclr/jit/gentree.h | Updated GetOperForHWIntrinsicId signature and removed the old overload |
src/coreclr/jit/gentree.cpp | Added getEffectiveOp handling for GT_SUB and GT_XOR patterns |
src/coreclr/jit/compiler.h | Declared new fgMorphHWIntrinsicRequired and fgMorphHWIntrinsicOptional methods |
Comments suppressed due to low confidence (2)
… is done correctly
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
// 1. (v1 ^ AllBitsSet) to VectorXxx.OnesComplement(v1) | ||
// 2. (v1 ^ -0.0) to VectorXxx.Negate(v1); for floating-point | ||
case GT_XOR: | ||
// 1. (-v1) - (-v2) to v2 - v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment not related to this PR in particular. But I wish we had infrastructure to pattern match and replace this in morph....currently we are just adding this manually and making it more bug prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, but since there is quite a bit of code churn, would recommend running some jitstress pipelines. I have already kicked off antigen/fuzzlyn
This comment was marked as outdated.
This comment was marked as outdated.
looks like there are several failures in Antigen and Fuzzlyn |
Yep, there was an edge case missed where we didn't create a needed I've removed the fallback, replacing it with an assert instead, and updated the relevant path to correctly reuse the existing conversion intrinsic. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
/azp run Antigen, Fuzzlyn |
Azure Pipelines successfully started running 2 pipeline(s). |
/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress, runtime-coreclr jitstressregs |
Azure Pipelines successfully started running 3 pipeline(s). |
/azp run Antigen, Fuzzlyn |
Azure Pipelines successfully started running 2 pipeline(s). |
Logged kunalspathak/Antigen#12 for some of the Antigen failures. The other failures are the Arm64 Sve issue being looked at and are unrelated. |
This updates
fgMorphHWIntrinsic
to have many of the same arithmetic and logical transforms asfgMorphSmpOp
and more closely matches the conditions under which those transforms are done by the latter. This results in a some decent diffs due to other transformations that are unlocked.As part of this, I also ensured that
fgMorphSmpOp
was lighting up certain optimizations for floating-point where it is safe to do. For example, the following transforms are IEEE 754 compliant (many of these are only done when one input is a constant as the transform isn't beneficial otherwise):a - b
becominga + (-b)
(-a) - (-b)
becomingb - a
-(a * b)
becominga * (-b)
(-a) * b
becominga * (-b)
-(a / b)
becominga / (-b)
(-a) / b
becominga / (-b)
Most importantly it covers handling
~(a cmp b)
and transforming it toa cmp* b
; that is, it reverses the comparison. This allows significantly smaller codegen around various branches and allows other optimizations to better light up.The few number of codegen regressions are in T0 code where lowering ends up placing a value in a local for reuse and this causes a spill/reload of the value on the stack.