[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637

tex3d · 2024-10-25T02:56:59Z

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Return true for atan2 from isTriviallyVectorizable
Add atan2 to VecFuncs.def for massv and accelerate libraries.
Add atan2 to hasOptimizedCodeGen
Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests
Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition.

Part of: Implement the atan2 HLSL Function #70096.

llvmbot · 2024-10-25T02:57:33Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Tex Riddell (tex3d)

Changes

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv.

Part of: Implement the atan2 HLSL Function #70096.

Full diff: https://github.com/llvm/llvm-project/pull/113637.diff

3 Files Affected:

(modified) llvm/include/llvm/Analysis/VecFuncs.def (+2)
(modified) llvm/lib/Analysis/VectorUtils.cpp (+1)
(modified) llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll (+46)

diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index c4586894e3e490..6f0ba3f9e406c4 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -289,7 +289,9 @@ TLI_DEFINE_VECFUNC("acosf", "__acosf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan", "__atand2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("atanf", "__atanf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
 TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
 
 // Hyperbolic Functions
 TLI_DEFINE_VECFUNC("sinh", "__sinhd2", FIXED(2), "_ZGV_LLVM_N2v")
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 37c443011719b6..f3ffddeb40d4b4 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -69,6 +69,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
   case Intrinsic::asin:
   case Intrinsic::acos:
   case Intrinsic::atan:
+  case Intrinsic::atan2:
   case Intrinsic::sin:
   case Intrinsic::cos:
   case Intrinsic::tan:
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
index ba6e85e2852952..badc67866d2bad 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
@@ -1244,6 +1244,52 @@ for.end:
   ret void
 }
 
+define void @atan2_f64_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f64_intrinsic(
+; CHECK: __atan2d2{{.*}}<2 x double>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to double
+  %call = tail call double @llvm.atan2.f64(double %conv, double %conv)
+  %arrayidx = getelementptr inbounds double, ptr %varray, i64 %iv
+  store double %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
+define void @atan2_f32_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f32_intrinsic(
+; CHECK: __atan2f4{{.*}}<4 x float>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to float
+  %call = tail call float @llvm.atan2.f32(float %conv, float %conv)
+  %arrayidx = getelementptr inbounds float, ptr %varray, i64 %iv
+  store float %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
 define void @sinh_f64(ptr nocapture %varray) {
 ; CHECK-LABEL: @sinh_f64(
 ; CHECK: __sinhd2{{.*}}<2 x double>

llvmbot · 2024-10-25T02:57:33Z

@llvm/pr-subscribers-backend-powerpc

Author: Tex Riddell (tex3d)

Changes

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv.

Part of: Implement the atan2 HLSL Function #70096.

Full diff: https://github.com/llvm/llvm-project/pull/113637.diff

3 Files Affected:

(modified) llvm/include/llvm/Analysis/VecFuncs.def (+2)
(modified) llvm/lib/Analysis/VectorUtils.cpp (+1)
(modified) llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll (+46)

diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index c4586894e3e490..6f0ba3f9e406c4 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -289,7 +289,9 @@ TLI_DEFINE_VECFUNC("acosf", "__acosf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan", "__atand2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("atanf", "__atanf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
 TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
 
 // Hyperbolic Functions
 TLI_DEFINE_VECFUNC("sinh", "__sinhd2", FIXED(2), "_ZGV_LLVM_N2v")
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 37c443011719b6..f3ffddeb40d4b4 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -69,6 +69,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
   case Intrinsic::asin:
   case Intrinsic::acos:
   case Intrinsic::atan:
+  case Intrinsic::atan2:
   case Intrinsic::sin:
   case Intrinsic::cos:
   case Intrinsic::tan:
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
index ba6e85e2852952..badc67866d2bad 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
@@ -1244,6 +1244,52 @@ for.end:
   ret void
 }
 
+define void @atan2_f64_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f64_intrinsic(
+; CHECK: __atan2d2{{.*}}<2 x double>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to double
+  %call = tail call double @llvm.atan2.f64(double %conv, double %conv)
+  %arrayidx = getelementptr inbounds double, ptr %varray, i64 %iv
+  store double %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
+define void @atan2_f32_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f32_intrinsic(
+; CHECK: __atan2f4{{.*}}<4 x float>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to float
+  %call = tail call float @llvm.atan2.f32(float %conv, float %conv)
+  %arrayidx = getelementptr inbounds float, ptr %varray, i64 %iv
+  store float %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
 define void @sinh_f64(ptr nocapture %varray) {
 ; CHECK-LABEL: @sinh_f64(
 ; CHECK: __sinhd2{{.*}}<2 x double>

jroelofs

LGTM

farzonl · 2024-11-06T09:44:55Z

Add libfunc mapping to intrinsic in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite
Add atan2 name check in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h isLoweredToCall
I noticed there were some examples that use a amdlibm vector form below. Should we do a follow on PR that adds atan2 amdlibm and updates llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll?

amd_vrs4_atan2f
amd_vrs8_atan2f
amd_vrd2_atan2
amd_vrd4_atan2

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv. Part of: Implement the atan2 HLSL Function llvm#70096.

…n tests

…sis/TargetTransformInfoImpl.h` TBD: Add testing for this change.

tex3d · 2024-11-07T01:32:55Z

@farzonl

Add libfunc mapping to intrinsic in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite

Added.

Add atan2 name check in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h isLoweredToCall

Added, but it had no test impact, so I'm looking for feedback on how this can be tested.

I noticed there were some examples that use a amdlibm vector form below. Should we do a follow on PR that adds atan2 amdlibm and updates llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll?

It turns out these example files have all the code under #if 0, and there doesn't appear to be any declarations of these vector operations in the header. I'll leave this for a later update when the operations are added to amdlibm.

farzonl · 2024-11-07T01:43:45Z

See this pr: https://github.com/llvm/llvm-project/pull/95517/files

tests like this should show that the libcall name to intrinsic mapping in the check line

llvm/test/Transforms/SLPVectorizer/X86/call.ll

tex3d · 2024-11-08T23:59:36Z

@farzonl

See this pr: https://github.com/llvm/llvm-project/pull/95517/files
tests like this should show that the libcall name to intrinsic mapping in the check line
llvm/test/Transforms/SLPVectorizer/X86/call.ll

This still passes without the mapping for tan. There are other places where libs calls are mapped, which is why I'm unsure of what actually depends on this code.

I commented out the whole block to see if any tests would fail, and found exactly one:

llvm-project/llvm/test/Transforms/TailCallElim/inf-recursion.ll

Lines 3 to 13 in cb98366

    
           ; Don't turn this into an infinite loop, this is probably the implementation 
        
           ; of fabs and we expect the codegen to lower fabs. 
        
           ; CHECK: @fabs(double %f) 
        
           ; CHECK: call 
        
           ; CHECK: ret 
        
           define double @fabs(double %f) { 
        
           entry: 
        
                   %tmp2 = call double @fabs( double %f )          ; <double> [#uses=1] 
        
                   ret double %tmp2 
        
           }

If isLoweredToCall returns true for fabs, this results in:

define double @fabs(double %f) {
entry:
 br label %tailrecurse
tailrecurse: ; preds = %tailrecurse, %entry
 br label %tailrecurse
}

This doesn't exactly seem like the primary effect this code is meant for. From call graph inspection, it looks like this could impact some loop unrolling if it contains a matching call. It doesn't look like any of these were explicitly tested though.

Architecturally, this is ugly, as is pointed out by the comment from 10 years ago in that code:

llvm-project/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

Lines 152 to 155 in cb98366

    
           // FIXME: These should almost certainly not be handled here, and instead 
        
           // handled with the help of TLI or the target itself. This was largely 
        
           // ported from existing analysis heuristics here so that such refactorings 
        
           // can take place in the future.

It seems like we should just have tablegen definitions with clear properties somewhere, then all these random spots in code that depend on specific properties would be handled in a uniform way.

I guess I'll just merge it as-is for now.

@jroelofs

…e veclibs (llvm#113637) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - Return true for atan2 from isTriviallyVectorizable - Add atan2 to VecFuncs.def for massv and accelerate libraries. - Add atan2 to hasOptimizedCodeGen - Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests - Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h - Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition. Part of: Implement the atan2 HLSL Function llvm#70096.

llvmbot added backend:PowerPC llvm:analysis llvm:transforms labels Oct 25, 2024

tex3d mentioned this pull request Oct 25, 2024

[llvm] Improve TLI for Darwin libsystem_m functions #109479

Open

tex3d force-pushed the atan2-vectorize branch from e6c6df3 to 72efe7b Compare October 28, 2024 23:06

jroelofs approved these changes Oct 29, 2024

View reviewed changes

tex3d changed the title ~~[Analysis] isTriviallyVectorizable add vectorization for atan2 intrinsic~~ [Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs Oct 29, 2024

farzonl approved these changes Nov 6, 2024

View reviewed changes

tex3d added 4 commits November 6, 2024 12:19

Add atan2 to accelerate vector library and hasOptimizedCodeGen

146292e

[Analysis] Add atan2 support in ValueTracking and update vectorizatio…

65d6652

…n tests

Add atan2 name check to isLoweredToCall in `llvm/include/llvm/Analy…

e3b92d1

…sis/TargetTransformInfoImpl.h` TBD: Add testing for this change.

tex3d force-pushed the atan2-vectorize branch from 72efe7b to e3b92d1 Compare November 7, 2024 01:28

tex3d requested a review from nikic as a code owner November 7, 2024 01:28

tex3d merged commit 818d715 into llvm:main Nov 9, 2024
8 checks passed

tex3d deleted the atan2-vectorize branch November 9, 2024 00:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637

tex3d commented Oct 25, 2024 •

edited

Loading

llvmbot commented Oct 25, 2024 •

edited

Loading

llvmbot commented Oct 25, 2024

jroelofs left a comment

farzonl commented Nov 6, 2024 •

edited

Loading

tex3d commented Nov 7, 2024

farzonl commented Nov 7, 2024

tex3d commented Nov 8, 2024

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637

Conversation

tex3d commented Oct 25, 2024 • edited Loading

llvmbot commented Oct 25, 2024 • edited Loading

llvmbot commented Oct 25, 2024

jroelofs left a comment

Choose a reason for hiding this comment

farzonl commented Nov 6, 2024 • edited Loading

tex3d commented Nov 7, 2024

farzonl commented Nov 7, 2024

tex3d commented Nov 8, 2024

tex3d commented Oct 25, 2024 •

edited

Loading

llvmbot commented Oct 25, 2024 •

edited

Loading

farzonl commented Nov 6, 2024 •

edited

Loading