-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Tex Riddell (tex3d) ChangesThis change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv. Part of: Implement the atan2 HLSL Function #70096. Full diff: https://github.com/llvm/llvm-project/pull/113637.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index c4586894e3e490..6f0ba3f9e406c4 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -289,7 +289,9 @@ TLI_DEFINE_VECFUNC("acosf", "__acosf4", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("atan", "__atand2", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("atanf", "__atanf4", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
// Hyperbolic Functions
TLI_DEFINE_VECFUNC("sinh", "__sinhd2", FIXED(2), "_ZGV_LLVM_N2v")
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 37c443011719b6..f3ffddeb40d4b4 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -69,6 +69,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
case Intrinsic::asin:
case Intrinsic::acos:
case Intrinsic::atan:
+ case Intrinsic::atan2:
case Intrinsic::sin:
case Intrinsic::cos:
case Intrinsic::tan:
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
index ba6e85e2852952..badc67866d2bad 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
@@ -1244,6 +1244,52 @@ for.end:
ret void
}
+define void @atan2_f64_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f64_intrinsic(
+; CHECK: __atan2d2{{.*}}<2 x double>
+; CHECK: ret void
+;
+entry:
+ br label %for.body
+
+for.body:
+ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+ %tmp = trunc i64 %iv to i32
+ %conv = sitofp i32 %tmp to double
+ %call = tail call double @llvm.atan2.f64(double %conv, double %conv)
+ %arrayidx = getelementptr inbounds double, ptr %varray, i64 %iv
+ store double %call, ptr %arrayidx, align 4
+ %iv.next = add nuw nsw i64 %iv, 1
+ %exitcond = icmp eq i64 %iv.next, 1000
+ br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+ ret void
+}
+
+define void @atan2_f32_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f32_intrinsic(
+; CHECK: __atan2f4{{.*}}<4 x float>
+; CHECK: ret void
+;
+entry:
+ br label %for.body
+
+for.body:
+ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+ %tmp = trunc i64 %iv to i32
+ %conv = sitofp i32 %tmp to float
+ %call = tail call float @llvm.atan2.f32(float %conv, float %conv)
+ %arrayidx = getelementptr inbounds float, ptr %varray, i64 %iv
+ store float %call, ptr %arrayidx, align 4
+ %iv.next = add nuw nsw i64 %iv, 1
+ %exitcond = icmp eq i64 %iv.next, 1000
+ br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+ ret void
+}
+
define void @sinh_f64(ptr nocapture %varray) {
; CHECK-LABEL: @sinh_f64(
; CHECK: __sinhd2{{.*}}<2 x double>
|
@llvm/pr-subscribers-backend-powerpc Author: Tex Riddell (tex3d) ChangesThis change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv. Part of: Implement the atan2 HLSL Function #70096. Full diff: https://github.com/llvm/llvm-project/pull/113637.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index c4586894e3e490..6f0ba3f9e406c4 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -289,7 +289,9 @@ TLI_DEFINE_VECFUNC("acosf", "__acosf4", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("atan", "__atand2", FIXED(2), "_ZGV_LLVM_N2v")
TLI_DEFINE_VECFUNC("atanf", "__atanf4", FIXED(4), "_ZGV_LLVM_N4v")
TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
// Hyperbolic Functions
TLI_DEFINE_VECFUNC("sinh", "__sinhd2", FIXED(2), "_ZGV_LLVM_N2v")
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 37c443011719b6..f3ffddeb40d4b4 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -69,6 +69,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
case Intrinsic::asin:
case Intrinsic::acos:
case Intrinsic::atan:
+ case Intrinsic::atan2:
case Intrinsic::sin:
case Intrinsic::cos:
case Intrinsic::tan:
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
index ba6e85e2852952..badc67866d2bad 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
@@ -1244,6 +1244,52 @@ for.end:
ret void
}
+define void @atan2_f64_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f64_intrinsic(
+; CHECK: __atan2d2{{.*}}<2 x double>
+; CHECK: ret void
+;
+entry:
+ br label %for.body
+
+for.body:
+ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+ %tmp = trunc i64 %iv to i32
+ %conv = sitofp i32 %tmp to double
+ %call = tail call double @llvm.atan2.f64(double %conv, double %conv)
+ %arrayidx = getelementptr inbounds double, ptr %varray, i64 %iv
+ store double %call, ptr %arrayidx, align 4
+ %iv.next = add nuw nsw i64 %iv, 1
+ %exitcond = icmp eq i64 %iv.next, 1000
+ br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+ ret void
+}
+
+define void @atan2_f32_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f32_intrinsic(
+; CHECK: __atan2f4{{.*}}<4 x float>
+; CHECK: ret void
+;
+entry:
+ br label %for.body
+
+for.body:
+ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+ %tmp = trunc i64 %iv to i32
+ %conv = sitofp i32 %tmp to float
+ %call = tail call float @llvm.atan2.f32(float %conv, float %conv)
+ %arrayidx = getelementptr inbounds float, ptr %varray, i64 %iv
+ store float %call, ptr %arrayidx, align 4
+ %iv.next = add nuw nsw i64 %iv, 1
+ %exitcond = icmp eq i64 %iv.next, 1000
+ br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+ ret void
+}
+
define void @sinh_f64(ptr nocapture %varray) {
; CHECK-LABEL: @sinh_f64(
; CHECK: __sinhd2{{.*}}<2 x double>
|
e6c6df3
to
72efe7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
amd_vrs4_atan2f |
This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv. Part of: Implement the atan2 HLSL Function llvm#70096.
…sis/TargetTransformInfoImpl.h` TBD: Add testing for this change.
72efe7b
to
e3b92d1
Compare
Added.
Added, but it had no test impact, so I'm looking for feedback on how this can be tested.
It turns out these example files have all the code under |
See this pr: https://github.com/llvm/llvm-project/pull/95517/files tests like this should show that the libcall name to intrinsic mapping in the check line llvm/test/Transforms/SLPVectorizer/X86/call.ll |
This still passes without the mapping for I commented out the whole block to see if any tests would fail, and found exactly one: llvm-project/llvm/test/Transforms/TailCallElim/inf-recursion.ll Lines 3 to 13 in cb98366
If isLoweredToCall returns true for fabs , this results in:
define double @fabs(double %f) {
entry:
br label %tailrecurse
tailrecurse: ; preds = %tailrecurse, %entry
br label %tailrecurse
} This doesn't exactly seem like the primary effect this code is meant for. From call graph inspection, it looks like this could impact some loop unrolling if it contains a matching call. It doesn't look like any of these were explicitly tested though. Architecturally, this is ugly, as is pointed out by the comment from 10 years ago in that code: llvm-project/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h Lines 152 to 155 in cb98366
It seems like we should just have tablegen definitions with clear properties somewhere, then all these random spots in code that depend on specific properties would be handled in a uniform way. I guess I'll just merge it as-is for now. |
…e veclibs (llvm#113637) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - Return true for atan2 from isTriviallyVectorizable - Add atan2 to VecFuncs.def for massv and accelerate libraries. - Add atan2 to hasOptimizedCodeGen - Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests - Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h - Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition. Part of: Implement the atan2 HLSL Function llvm#70096.
This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition.
Part of: Implement the atan2 HLSL Function #70096.