[clang] Reland Add tanf16 builtin and support for tan constrained intrinsic #94559

farzonl · 2024-06-06T02:44:09Z

Relanding this PR now that #90503 has merged. With FTAN landing in TargetLoweringBase.cpp:L1021, there is now an Expand case for the 32-, 64-, and 128-bit llvm tan intrinsic in all LLVM backends.

In LLVM, the llvm.experimental.constrained.cos and llvm.experimental.constrained.sin intrinsics are used for performing cosine and sine calculations with additional constraints on floating-point operations. This behavior is expected for all floating-point math intrinsics. This change adds these constraints for the tan intrinsic.

Builtins.td - replace TanF128 with F16F128MathTemplate
CGBuiltin.cpp - map existing tan builtins to tan and constrained_tan intrinsic
ConstrainedOps.def - map tan and constrained_tan to an ISDOpcode.

resolves #91421

llvmbot · 2024-06-06T02:44:42Z

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Farzon Lotfi (farzonl)

Changes

Relanding this PR now that #90503 has merged. With FTAN landing in TargetLoweringBase.cpp:L1021, there is now an Expand case for the 32-, 64-, and 128-bit llvm tan intrinsic in all LLVM backends.

In LLVM, the llvm.experimental.constrained.cos and llvm.experimental.constrained.sin intrinsics are used for performing cosine and sine calculations with additional constraints on floating-point operations. This behavior is expected for all floating-point math intrinsics. This change adds these constraints for the tan intrinsic.

Builtins.td - replace TanF128 with F16F128MathTemplate
CGBuiltin.cpp - map existing tan builtins to tan and constrained_tan intrinsic
ConstrainedOps.def - map tan and constrained_tan to an ISDOpcode.
ISDOpcodes.h - define tan and strict tan opcodes

resolves #91421

Full diff: https://github.com/llvm/llvm-project/pull/94559.diff

11 Files Affected:

(modified) clang/include/clang/Basic/Builtins.td (+3-3)
(modified) clang/lib/CodeGen/CGBuiltin.cpp (+12)
(modified) clang/test/CodeGen/X86/math-builtins.c (+4-4)
(modified) clang/test/CodeGen/constrained-math-builtins.c (+13)
(modified) clang/test/CodeGen/math-libcalls.c (+6-6)
(modified) clang/test/CodeGenOpenCL/builtins-f16.cl (+3)
(modified) llvm/docs/LangRef.rst (+36)
(modified) llvm/include/llvm/IR/ConstrainedOps.def (+1)
(modified) llvm/include/llvm/IR/Intrinsics.td (+4)
(modified) llvm/test/Assembler/fp-intrinsics-attr.ll (+8)
(modified) llvm/test/Feature/fp-intrinsics.ll (+11)

diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 11982af3fa609..7bef5fd7ad40f 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -482,11 +482,11 @@ def SqrtF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T)";
 }
 
-def TanF128 : Builtin {
-  let Spellings = ["__builtin_tanf128"];
+def TanF16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_tan"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow,
                     ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128)";
+  let Prototype = "T(T)";
 }
 
 def TanhF128 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 37d0c478e0330..a76ce82830d8f 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2923,6 +2923,18 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
       SetSqrtFPAccuracy(Call);
       return RValue::get(Call);
     }
+
+    case Builtin::BItan:
+    case Builtin::BItanf:
+    case Builtin::BItanl:
+    case Builtin::BI__builtin_tan:
+    case Builtin::BI__builtin_tanf:
+    case Builtin::BI__builtin_tanf16:
+    case Builtin::BI__builtin_tanl:
+    case Builtin::BI__builtin_tanf128:
+      return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+          *this, E, Intrinsic::tan, Intrinsic::experimental_constrained_tan));
+
     case Builtin::BItrunc:
     case Builtin::BItruncf:
     case Builtin::BItruncl:
diff --git a/clang/test/CodeGen/X86/math-builtins.c b/clang/test/CodeGen/X86/math-builtins.c
index 093239b448260..1e0f129b98610 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -674,10 +674,10 @@ __builtin_sqrt(f);       __builtin_sqrtf(f);      __builtin_sqrtl(f); __builtin_
 
 __builtin_tan(f);        __builtin_tanf(f);       __builtin_tanl(f); __builtin_tanf128(f);
 
-// NO__ERRNO: declare double @tan(double noundef) [[READNONE]]
-// NO__ERRNO: declare float @tanf(float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @tanl(x86_fp80 noundef) [[READNONE]]
-// NO__ERRNO: declare fp128 @tanf128(fp128 noundef) [[READNONE]]
+// NO__ERRNO: declare double @llvm.tan.f64(double) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare float @llvm.tan.f32(float) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.tan.f80(x86_fp80) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.tan.f128(fp128) [[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @tan(double noundef) [[NOT_READNONE]]
 // HAS_ERRNO: declare float @tanf(float noundef) [[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @tanl(x86_fp80 noundef) [[NOT_READNONE]]
diff --git a/clang/test/CodeGen/constrained-math-builtins.c b/clang/test/CodeGen/constrained-math-builtins.c
index 2de832dd2b6ca..6cc3a10a1e794 100644
--- a/clang/test/CodeGen/constrained-math-builtins.c
+++ b/clang/test/CodeGen/constrained-math-builtins.c
@@ -183,6 +183,14 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c, _
 // CHECK: call x86_fp80 @llvm.experimental.constrained.sqrt.f80(x86_fp80 %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK: call fp128 @llvm.experimental.constrained.sqrt.f128(fp128 %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
 
+  __builtin_tan(f);        __builtin_tanf(f);       __builtin_tanl(f); __builtin_tanf128(f);
+
+// CHECK: call double @llvm.experimental.constrained.tan.f64(double %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+// CHECK: call float @llvm.experimental.constrained.tan.f32(float %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+// CHECK: call x86_fp80 @llvm.experimental.constrained.tan.f80(x86_fp80 %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+// CHECK: call fp128 @llvm.experimental.constrained.tan.f128(fp128 %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+
+
   __builtin_trunc(f);      __builtin_truncf(f);     __builtin_truncl(f); __builtin_truncf128(f);
 
 // CHECK: call double @llvm.experimental.constrained.trunc.f64(double %{{.*}}, metadata !"fpexcept.strict")
@@ -315,6 +323,11 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c, _
 // CHECK: declare x86_fp80 @llvm.experimental.constrained.sqrt.f80(x86_fp80, metadata, metadata)
 // CHECK: declare fp128 @llvm.experimental.constrained.sqrt.f128(fp128, metadata, metadata)
 
+// CHECK: declare double @llvm.experimental.constrained.tan.f64(double, metadata, metadata)
+// CHECK: declare float @llvm.experimental.constrained.tan.f32(float, metadata, metadata)
+// CHECK: declare x86_fp80 @llvm.experimental.constrained.tan.f80(x86_fp80, metadata, metadata)
+// CHECK: declare fp128 @llvm.experimental.constrained.tan.f128(fp128, metadata, metadata)
+
 // CHECK: declare double @llvm.experimental.constrained.trunc.f64(double, metadata)
 // CHECK: declare float @llvm.experimental.constrained.trunc.f32(float, metadata)
 // CHECK: declare x86_fp80 @llvm.experimental.constrained.trunc.f80(x86_fp80, metadata)
diff --git a/clang/test/CodeGen/math-libcalls.c b/clang/test/CodeGen/math-libcalls.c
index 29c312ba0ecac..a249182692762 100644
--- a/clang/test/CodeGen/math-libcalls.c
+++ b/clang/test/CodeGen/math-libcalls.c
@@ -662,15 +662,15 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c) {
 
   tan(f);        tanf(f);       tanl(f);
 
-// NO__ERRNO: declare double @tan(double noundef) [[READNONE]]
-// NO__ERRNO: declare float @tanf(float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @tanl(x86_fp80 noundef) [[READNONE]]
+// NO__ERRNO: declare double @llvm.tan.f64(double) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare float @llvm.tan.f32(float) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.tan.f80(x86_fp80) [[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @tan(double noundef) [[NOT_READNONE]]
 // HAS_ERRNO: declare float @tanf(float noundef) [[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @tanl(x86_fp80 noundef) [[NOT_READNONE]]
-// HAS_MAYTRAP: declare double @tan(double noundef) [[NOT_READNONE]]
-// HAS_MAYTRAP: declare float @tanf(float noundef) [[NOT_READNONE]]
-// HAS_MAYTRAP: declare x86_fp80 @tanl(x86_fp80 noundef) [[NOT_READNONE]]
+// HAS_MAYTRAP: declare double @llvm.experimental.constrained.tan.f64(
+// HAS_MAYTRAP: declare float @llvm.experimental.constrained.tan.f32(
+// HAS_MAYTRAP: declare x86_fp80 @llvm.experimental.constrained.tan.f80(
 
   tanh(f);       tanhf(f);      tanhl(f);
 
diff --git a/clang/test/CodeGenOpenCL/builtins-f16.cl b/clang/test/CodeGenOpenCL/builtins-f16.cl
index adf7cdde154f5..d7bffdad5c548 100644
--- a/clang/test/CodeGenOpenCL/builtins-f16.cl
+++ b/clang/test/CodeGenOpenCL/builtins-f16.cl
@@ -66,6 +66,9 @@ void test_half_builtins(half h0, half h1, half h2, int i0) {
   // CHECK: call half @llvm.sqrt.f16(half %h0)
   res = __builtin_sqrtf16(h0);
 
+  // CHECK: call half @llvm.tan.f16(half %h0)
+  res = __builtin_tanf16(h0);
+
   // CHECK: call half @llvm.trunc.f16(half %h0)
   res = __builtin_truncf16(h0);
 
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 9d7ade8eb523b..08a41170d9b9b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -26231,6 +26231,42 @@ same values as the libm ``cos`` functions would, and handles error
 conditions in the same way.
 
 
+'``llvm.experimental.constrained.tan``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.tan(<type> <op1>,
+                                         metadata <rounding mode>,
+                                         metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.tan``' intrinsic returns the tangent of the
+first operand.
+
+Arguments:
+""""""""""
+
+The first argument and the return type are floating-point numbers of the same
+type.
+
+The second and third arguments specify the rounding mode and exception
+behavior as described above.
+
+Semantics:
+""""""""""
+
+This function returns the tangent of the specified operand, returning the
+same values as the libm ``tan`` functions would, and handles error
+conditions in the same way.
+
+
 '``llvm.experimental.constrained.exp``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/llvm/include/llvm/IR/ConstrainedOps.def b/llvm/include/llvm/IR/ConstrainedOps.def
index 41aa44de957f9..a7b37c5cb204d 100644
--- a/llvm/include/llvm/IR/ConstrainedOps.def
+++ b/llvm/include/llvm/IR/ConstrainedOps.def
@@ -95,6 +95,7 @@ DAG_FUNCTION(round,           1, 0, experimental_constrained_round,      FROUND)
 DAG_FUNCTION(roundeven,       1, 0, experimental_constrained_roundeven,  FROUNDEVEN)
 DAG_FUNCTION(sin,             1, 1, experimental_constrained_sin,        FSIN)
 DAG_FUNCTION(sqrt,            1, 1, experimental_constrained_sqrt,       FSQRT)
+DAG_FUNCTION(tan,             1, 1, experimental_constrained_tan,        FTAN)
 DAG_FUNCTION(trunc,           1, 0, experimental_constrained_trunc,      FTRUNC)
 
 // This is definition for fmuladd intrinsic function, that is converted into
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 107442623ab7b..4c506a6ace23e 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1218,6 +1218,10 @@ let IntrProperties = [IntrInaccessibleMemOnly, IntrWillReturn, IntrStrictFP] in
                                                     [ LLVMMatchType<0>,
                                                       llvm_metadata_ty,
                                                       llvm_metadata_ty ]>;
+  def int_experimental_constrained_tan  : DefaultAttrsIntrinsic<[ llvm_anyfloat_ty ],
+                                                    [ LLVMMatchType<0>,
+                                                      llvm_metadata_ty,
+                                                      llvm_metadata_ty ]>;
   def int_experimental_constrained_pow  : DefaultAttrsIntrinsic<[ llvm_anyfloat_ty ],
                                                     [ LLVMMatchType<0>,
                                                       LLVMMatchType<0>,
diff --git a/llvm/test/Assembler/fp-intrinsics-attr.ll b/llvm/test/Assembler/fp-intrinsics-attr.ll
index 6546d1a275c99..613630e1a2b4d 100644
--- a/llvm/test/Assembler/fp-intrinsics-attr.ll
+++ b/llvm/test/Assembler/fp-intrinsics-attr.ll
@@ -85,6 +85,11 @@ define void @func(double %a, double %b, double %c, i32 %i) strictfp {
                                                metadata !"round.dynamic",
                                                metadata !"fpexcept.strict")
 
+  %tan = call double @llvm.experimental.constrained.tan.f64(
+                                               double %a,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+
   %pow = call double @llvm.experimental.constrained.pow.f64(
                                                double %a, double %b,
                                                metadata !"round.dynamic",
@@ -244,6 +249,9 @@ declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata
 declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
 ; CHECK: @llvm.experimental.constrained.cos.f64({{.*}}) #[[ATTR1]]
 
+declare double @llvm.experimental.constrained.tan.f64(double, metadata, metadata)
+; CHECK: @llvm.experimental.constrained.tan.f64({{.*}}) #[[ATTR1]]
+
 declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
 ; CHECK: @llvm.experimental.constrained.pow.f64({{.*}}) #[[ATTR1]]
 
diff --git a/llvm/test/Feature/fp-intrinsics.ll b/llvm/test/Feature/fp-intrinsics.ll
index b92408a1bf1cd..7759813dc2e11 100644
--- a/llvm/test/Feature/fp-intrinsics.ll
+++ b/llvm/test/Feature/fp-intrinsics.ll
@@ -151,6 +151,17 @@ entry:
   ret double %result
 }
 
+; Verify that tan(42.0) isn't simplified when the rounding mode is unknown.
+; CHECK-LABEL: ftan
+; CHECK: call double @llvm.experimental.constrained.tan
+define double @ftan() #0 {
+entry:
+  %result = call double @llvm.experimental.constrained.tan.f64(double 42.0,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict") #0
+  ret double %result
+}
+
 ; Verify that exp(42.0) isn't simplified when the rounding mode is unknown.
 ; CHECK-LABEL: f10
 ; CHECK: call double @llvm.experimental.constrained.exp

llvmbot · 2024-06-06T02:44:42Z

@llvm/pr-subscribers-llvm-ir

farzonl · 2024-06-06T02:51:21Z

@ilovepi I minimised the failing issue from: https://fuchsia.googlesource.com/fuchsia/+/refs/heads/main/src/graphics/lib/compute/spinel/ext/transform_stack/transform_stack.c
to:

#include <math.h>
#include <stdio.h>

#define SPN_TRANSFORM_STACK_TAN(x_)  tanf(x_)

float
spinel_transform_stack_push_skew_x(float theta)
{
  float const tan_theta = SPN_TRANSFORM_STACK_TAN(theta);
  return tan_theta;
}

int main() {
    printf("%f\n", spinel_transform_stack_push_skew_x(1.05f));
    return 0;
}

run like so:

farzon@devbox: projects/llvm-project$ <build_dir>/bin/clang  <test_path>/test.c -lm -o  <test_out_path>/test.out
farzon@devbox: projects/llvm-project$ ./<test_out_path>/test.out 
1.743315

The intrinsic should be able to be lowered now.

bob80905 · 2024-06-07T20:16:10Z

Should a test be added when the return type is a different float type than the first arg type?

clang/lib/CodeGen/CGBuiltin.cpp

ilovepi · 2024-06-07T20:19:18Z

Thanks. This seems fine to me. I'll defer to the other reviewers on everything else, but I'm satisfied with the new test, so LGTM from my perspective.

farzonl · 2024-06-07T20:24:32Z

> Should a test be added when the return type is a different float type than the first arg type?

No. If the function that calls the builtin has a return type different from the type of the argument passed to the builtin, the conversion is handled by separate casting code that has its own tests.

The only other way this could happen is if tan itself had a return type different from its input type, but for tan the input type dictates the return type. So the answer is again no.
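To make the two cases concrete at the C level, here is a small sketch using `_Generic`, which inspects the type of an expression without evaluating it. The enum, macro, and function names are hypothetical and exist only for this illustration; it shows the C library functions, not the LLVM intrinsic itself.

```c
#include <math.h>

/* Hypothetical tags used only for this illustration. */
enum kind { KIND_FLOAT, KIND_DOUBLE, KIND_LONG_DOUBLE, KIND_OTHER };

/* _Generic selects a branch from the static type of its controlling
   expression; the expression is not evaluated, so tan is never called. */
#define KIND_OF(expr)                      \
  _Generic((expr),                         \
           float: KIND_FLOAT,              \
           double: KIND_DOUBLE,            \
           long double: KIND_LONG_DOUBLE,  \
           default: KIND_OTHER)

/* Each tan variant returns the same type it takes. */
enum kind kind_of_tanf(float x)       { return KIND_OF(tanf(x)); }
enum kind kind_of_tan(double x)       { return KIND_OF(tan(x)); }
enum kind kind_of_tanl(long double x) { return KIND_OF(tanl(x)); }

/* A different result type only appears through a separate conversion
   applied after the call, here an explicit cast to double. */
enum kind kind_of_widened_tanf(float x) { return KIND_OF((double)tanf(x)); }
```

The last function is the first case described above: the tan call still produces a `float`, and the change of type comes from the cast around it.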


farzonl · 2024-06-10T20:09:35Z

@efriedma-quic If you have time, could you take a look at this PR? It is the same as the one you reviewed, just with more tests across more targets. The first merge had problems because the target base changes were tied up in the x86 backend change. Since those have merged, this PR should work fine now.

efriedma-quic

LGTM

farzonl · 2024-06-11T08:35:37Z

This broke building wasi-libc, with a crash:

Stack dump:
0.	Program arguments: /builds/worker/fetches/clang/bin/clang-19 -cc1 -triple wasm32-unknown-wasi -emit-obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name ctanh.c -mrelocation-model static -mthread-model single -mframe-pointer=none -ffp-contract=on -fno-rounding-math -ffp-exception-behavior=ignore -mconstructor-aliases -target-cpu generic -fvisibility=hidden -debugger-tuning=gdb -fdebug-compilation-dir=/builds/worker/fetches/wasi-sdk/src/wasi-libc -fcoverage-compilation-dir=/builds/worker/fetches/wasi-sdk/src/wasi-libc -sys-header-deps -D NDEBUG -O2 -Wall -Wextra -Wno-null-pointer-arithmetic -Wno-unused-parameter -Wno-sign-compare -Wno-unused-variable -Wno-unused-function -Wno-ignored-attributes -Wno-missing-braces -Wno-ignored-pragmas -Wno-unused-but-set-variable -Wno-unknown-warning-option -Wno-parentheses -Wno-shift-op-parentheses -Wno-bitwise-op-parentheses -Wno-logical-op-parentheses -Wno-string-plus-int -Wno-dangling-else -Wno-unknown-pragmas -ferror-limit 19 -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -vectorize-loops -vectorize-slp -x c ctanh-46a6c8.c
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'ctanh-46a6c8.c'.
4.	Running pass 'WebAssembly Assembly Printer' on function '@ctanh'
 #0 0x00007f2205ba35cd llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /tmp/llvm/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x00007f2205ba3abb PrintStackTraceSignalHandler(void*) /tmp/llvm/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x00007f2205ba1b26 llvm::sys::RunSignalHandlers() /tmp/llvm/llvm/lib/Support/Signals.cpp:105:5
 #3 0x00007f2205ba4275 SignalHandler(int) /tmp/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007f2203ff4f90 (/lib/x86_64-linux-gnu/libc.so.6+0x3bf90)
 #5 0x00007f2204043ccc (/lib/x86_64-linux-gnu/libc.so.6+0x8accc)
 #6 0x00007f2203ff4ef2 raise (/lib/x86_64-linux-gnu/libc.so.6+0x3bef2)
 #7 0x00007f2203fdf472 abort (/lib/x86_64-linux-gnu/libc.so.6+0x26472)
 #8 0x00007f2205a7f300 llvm::install_out_of_memory_new_handler() /tmp/llvm/llvm/lib/Support/ErrorHandling.cpp:194:0
 #9 0x00007f220ba5052d llvm::WebAssembly::getLibcallSignature(llvm::WebAssemblySubtarget const&, llvm::StringRef, llvm::SmallVectorImpl<llvm::wasm::ValType>&, llvm::SmallVectorImpl<llvm::wasm::ValType>&) /tmp/llvm/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp:907:30
#10 0x00007f220b9c4e0f llvm::WebAssemblyAsmPrinter::getOrCreateWasmSymbol(llvm::StringRef) /tmp/llvm/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:0:5
#11 0x00007f220b9c4f46 llvm::WebAssemblyAsmPrinter::emitDecls(llvm::Module const&) /tmp/llvm/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:310:40
#12 0x00007f220b9c7179 llvm::WebAssemblyAsmPrinter::emitConstantPool() /tmp/llvm/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:604:3
#13 0x00007f2206ee9526 llvm::AsmPrinter::emitFunctionHeader() /tmp/llvm/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:966:3

This file contains the script and preprocessed source that reproduce the problem: ctanh.zip

This seems to be because the WASM backend does some nonstandard things when setting up its libcall-to-type-signature map. I put up a fix, #95082. I would typically just revert, but the fix is so compact that I think it makes more sense to merge it to unblock. That said, I'm not a regular WASM contributor, so I'm going to go back to bed and wait until I get some guidance from those folks.

…reTable (#95082) The wasm backend fetches the tan runtime lib call in `llvm/include/llvm/IR/RuntimeLibcalls.def` via `StaticLibcallNameMap()`, but ignores the runtime function because a function signature mapping is not specified in RuntimeLibcallSignatureTable(). The fix is to specify the function signatures for float32-128. This is a fix for a build break reported on PR #94559 (comment).
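The failure mode described above can be modelled with a toy lookup table. This is only a sketch: `ValType`, `Signature`, `SignatureTable`, `tableBeforeFix`, `tableAfterFix`, and `findSignature` are hypothetical names, not the WebAssembly backend's data structures, and only the f32 and f64 entries are modelled. The libcall name is known, but no signature is registered for it, so the lookup comes back empty until the entries are added.

```cpp
#include <map>
#include <string>

// Toy value types and signature record (hypothetical, for illustration only).
enum class ValType { F32, F64 };
struct Signature {
  ValType result;
  ValType param;
};
using SignatureTable = std::map<std::string, Signature>;

// State before the fix: sin has signatures registered, tan does not.
SignatureTable tableBeforeFix() {
  return {{"sinf", {ValType::F32, ValType::F32}},
          {"sin", {ValType::F64, ValType::F64}}};
}

// State after the fix: tan entries are registered alongside the others.
SignatureTable tableAfterFix() {
  SignatureTable t = tableBeforeFix();
  t["tanf"] = {ValType::F32, ValType::F32};
  t["tan"] = {ValType::F64, ValType::F64};
  return t;
}

// Returns nullptr when no signature is registered for the libcall name.
// A caller that assumes every known libcall has a signature would fail here.
const Signature *findSignature(const SignatureTable &t,
                               const std::string &name) {
  auto it = t.find(name);
  return it == t.end() ? nullptr : &it->second;
}
```

With the "before" table, `findSignature(table, "tanf")` returns `nullptr`; with the "after" table it returns the registered f32 signature.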


aeubanks · 2024-06-12T20:05:27Z

I believe this has exposed a preexisting issue in isel with tan:

$ cat /tmp/b.ll
target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "thumbv7-unknown-linux-android26"

define <2 x float> @g() {
entry:
  %0 = call <2 x float> @llvm.tan.v2f32(<2 x float> zeroinitializer)
  ret <2 x float> %0
}
$ llc /tmp/b.ll -o /dev/null
LLVM ERROR: Cannot select: t3: v2f32 = ftan t15
  t15: v2f32 = bitcast t14
    t14: v2i32 = ARMISD::VMOVIMM TargetConstant:i32<0>
      t13: i32 = TargetConstant<0>

farzonl · 2024-06-12T22:22:07Z

@aeubanks The issue you are seeing is because only the aarch64 and x86 backends have tan intrinsic lowering support.
Changing one line in your example from "thumbv7-unknown-linux-android26" to "aarch64-unknown-linux-android26" or "x86_64-unknown-linux-gnu" will show this to be the case.

This shouldn't be a regression because there never was an llvm.tan.v2f32 for ARMv7. That still needs to be added. Can you show me how you could have generated llvm.tan.v2f32 for ARMv7 from clang?

By default, TargetLoweringBase.cpp:1016-1022 only supports lowering for f32, f64, and f128 scalars across all backends.

The builtins that were added (Builtin::BItanf, Builtin::BItanl, Builtin::BI__builtin_tan, Builtin::BI__builtin_tanf, Builtin::BI__builtin_tanf16, Builtin::BI__builtin_tanl, and Builtin::BI__builtin_tanf128) should only cover scalar cases.

All other intrinsic types, from f16 to the vector types, need to be explicitly supported.

So I'm not sure how this PR could have generated the intrinsic you reported unless you used __builtin_elementwise_tan, but that builtin is only being used by HLSL.
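A toy model may help show why the vector case fails while the scalar cases work. Everything here is hypothetical (`Action`, `Type`, `ToyLowering`, and `lowerTan` are stand-ins, not LLVM's classes), and it assumes that a type nobody configures is treated as legal, which then needs a target pattern to be selected.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Toy stand-ins for lowering actions and value types (not the LLVM enums).
enum class Action { Legal, Expand };
enum class Type { f16, f32, f64, f128, v2f32 };

class ToyLowering {
  std::map<Type, Action> ftanActions; // a missing entry means Legal
public:
  ToyLowering() {
    // Generic default as described above: only scalar f32/f64/f128 expand.
    for (Type t : {Type::f32, Type::f64, Type::f128})
      ftanActions[t] = Action::Expand;
  }
  // A target (or a changed default) can override the action per type.
  void setAction(Type t, Action a) { ftanActions[t] = a; }
  Action getAction(Type t) const {
    auto it = ftanActions.find(t);
    return it == ftanActions.end() ? Action::Legal : it->second;
  }
};

// Expand always produces something usable; Legal needs a target pattern.
std::string lowerTan(const ToyLowering &tl, Type t, bool targetHasPattern) {
  if (tl.getAction(t) == Action::Expand)
    return "expanded";
  return targetHasPattern ? "selected" : "cannot select";
}
```

In this model, `lowerTan(tl, Type::v2f32, false)` yields "cannot select" until someone calls `tl.setAction(Type::v2f32, Action::Expand)`, which mirrors the per-type opt-in described above.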

aeubanks · 2024-06-13T02:32:48Z

SLPVectorizer can introduce llvm.tan.v2f32. For example, running opt -O3 on the following introduces llvm.tan.v2f32

target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "thumbv7-unknown-linux-android26"

; Function Attrs: null_pointer_is_valid
define void @foo() #0 {
bb:
  %call = call float @pluto(float 1.000000e+00)
  %call1 = call i1 @ham(float %call)
  %call2 = call float @pluto(float 0.000000e+00)
  %call3 = call i1 @ham(float %call2)
  %select = select i1 %call3, float 0.000000e+00, float 1.000000e+00
  %select4 = select i1 %call1, float 0.000000e+00, float 1.000000e+00
  store float %select, ptr null, align 4
  %call5 = call ptr null(ptr null, float %select4, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00)
  ret void
}

define float @pluto(float %arg) {
bb:
  %call = call float @llvm.tan.f32(float %arg)
  ret float %call
}

define i1 @ham(float %arg) {
bb:
  %fcmp = fcmp ole float %arg, 0.000000e+00
  ret i1 %fcmp
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare float @llvm.tan.f32(float) #1

attributes #0 = { null_pointer_is_valid }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

farzonl · 2024-06-13T03:15:31Z

It seems like we have four options here. Option 1: we drop the def Tan : FPMathTemplate, LibBuiltin<"math.h"> builtins, so there is no Builtin::BItanf or Builtin::BItanl in the CGBuiltin switch case. That wouldn't solve the SLPVectorizer case, but it doesn't expose it either unless you use a clang builtin function, which should limit the impact. The tan library functions would behave as before.

Option 2: we land an ARMv7 backend lowering for tan. That could work for you, but the issue you raised with the SLPVectorizer case likely means the same issue exists for PowerPC, RISCV, and potentially other backends.
Current estimates are:
Backends that need f16 support that don't have it:

AMDGPU
ARM
NVPTX
RISCV
Maybe MipsSEI: it has an f16, but there are no tests for it.

Backends that need vector support that don't have it:

RISCV
AMDGPU
ARM/Thumb2
PowerPC
WebAssembly

Option 3: we guard the emitter with a target check for wasm (maybe), x86, and aarch64:

    case Builtin::BItanf:
    case Builtin::BItanl:
    case Builtin::BI__builtin_tan:
    case Builtin::BI__builtin_tanf:
    case Builtin::BI__builtin_tanf16:
    case Builtin::BI__builtin_tanl:
    case Builtin::BI__builtin_tanf128: {
      // Only emit the intrinsic on targets known to lower it.
      switch (getTarget().getTriple().getArch()) {
      case llvm::Triple::aarch64:
      case llvm::Triple::x86:
      case llvm::Triple::x86_64:
      case llvm::Triple::wasm32:
      case llvm::Triple::wasm64:
        return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
            *this, E, Intrinsic::tan, Intrinsic::experimental_constrained_tan));
      default:
        // Other targets keep the existing library-call path.
        break;
      }
      break;
    }

Option 4: we revert this change. The test cases would go stale, and it also means our plans for landing constrained intrinsics need to go in the backlog, because constrained intrinsics don't make sense until tan has full support across all backends.

efriedma-quic · 2024-06-13T04:57:06Z

Can we change the target-independent bits of the tan() implementation in the backend so it doesn't require each target to explicitly request that tan() needs to be expanded? It should be possible to adjust the code in TargetLoweringBase.cpp a bit so FTAN defaults to being expanded for all types.

davemgreen · 2024-06-13T04:58:53Z

If you remove tan from isTriviallyVectorizable it should prevent vectorization in the short term.

It might be better to default FTAN to expand under the if (VT.isVector()) check at line 960 of llvm-project/llvm/lib/CodeGen/TargetLoweringBase.cpp (in 64c9a1e), which seems to only be done for f32/f64/f128 at the moment.

farzonl · 2024-06-13T07:24:45Z

Can we change the target-independent bits of the tan() implementation in the backend so it doesn't require each target to explicitly request that tan() needs to be expanded? It should be possible to adjust the code in TargetLoweringBase.cpp a bit so FTAN defaults to being expanded for all types.

It might be better to default FTAN to expand under the if (VT.isVector()) check at line 960 of llvm-project/llvm/lib/CodeGen/TargetLoweringBase.cpp (in 64c9a1e), which seems to only be done for f32/f64/f128 at the moment.

I don't think we can default vector types to Expand, as that would expand vector types that are not supported for tan. For example, VT in this case comes from MVT::all_valuetypes(), which includes integer vectors (llvm-project/llvm/lib/CodeGen/TargetLoweringBase.cpp, line 894 in 64c9a1e):

    for (MVT VT : MVT::all_valuetypes()) {

Even if it only included floats, it would not necessarily be correct. For example, in ARM (llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp, lines 324 to 325 in 64c9a1e):

    const MVT FloatTypes[] = { MVT::v8f16, MVT::v4f32 };
    for (auto VT : FloatTypes) {

In some cases, like MVE, the types are MVT::v8f16 and MVT::v4f32, but for NEON, MVT::v2f64, MVT::v4f32, and MVT::v2f32 are supported. Doing it per target seems intentional, giving more precision so that only intrinsics that are valid to lower get expanded.

davemgreen · 2024-06-13T09:08:05Z

Usually when new ISD nodes are added they are expanded for all types, so that every backend will get at least working code even if it is not optimal. The targets can then come along and override the defaults for the types they are interested in, to get better results.

For tan I would expect most vector types would want to scalarize, so marking them as expand would make sense. If more types than are necessary get marked as Expand that shouldn't be an issue, it looks like we already do that for a number of other nodes.
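As a rough picture of what "scalarize" means here (a plain C++ sketch, not what the backend literally emits), a vector tan can be computed one lane at a time with the scalar function. `tanScalarized` is a hypothetical helper name.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Scalarized tan over a fixed-width vector: one scalar tan call per lane.
// This is slower than a native vector instruction would be, but it always
// works as long as the scalar operation can be lowered.
template <std::size_t N>
std::array<float, N> tanScalarized(const std::array<float, N> &v) {
  std::array<float, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = std::tan(v[i]); // float overload, i.e. a scalar tanf per lane
  return out;
}
```

For a two-lane input such as `{0.0f, 1.05f}` this performs two scalar calls, which is the kind of "working even if not optimal" fallback being described.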

aeubanks · 2024-06-13T16:11:34Z

@alexey-bataev

farzonl · 2024-06-13T20:02:46Z

Usually when new ISD nodes are added they are expanded for all types, so that every backend will get at least working code even if it is not optimal. The targets can then come along and override the defaults for the types they are interested in, to get better results.

For tan I would expect most vector types would want to scalarize, so marking them as expand would make sense. If more types than are necessary get marked as Expand that shouldn't be an issue, it looks like we already do that for a number of other nodes.

Should this be considered a temporary change until there is parity across all backends? I'm trying to understand why no one else has done this for sin\cos\log\exp\etc.

davemgreen · 2024-06-13T20:21:28Z

I believe they were added so long ago that the default Expanding wasn't done at the time. @efriedma-quic do you have more of an idea than that?

efriedma-quic · 2024-06-13T20:56:16Z

Yes, I think it's just a historical mistake; sin/cos/log/exp were added a very long time ago, and we weren't as careful about that sort of thing. And nobody has taken the time to try to clean up the current defaults.

farzonl · 2024-06-14T14:48:47Z

PR #95518 should address all backends.

Also SLPVectorizer was disabled for tan so I put out PR #95517

#95517) This PR is intended to address the limited SLPVectorizer support of tan raised in the comments of this PR: #94559. Right now, emitting the tan intrinsic allows you to vectorize tan, but emitting the libfunc does not. To address this, the libcall needs to be mapped to the intrinsic, and the libcall and function name need to be marked appropriately so they can be optimized or defined as a call lowering.


…#98755)

## Change:
- WebAssemblyRuntimeLibcallSignatures.cpp: Expose the RTLIB's for use by WASM
- Add trig specific test cases

## History
This change is part of an implementation of #87367 investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.

#70079 #70080 #70081 #70083 #70084 #95966

## Why Web Assembly?
From past changes to try and support constraint intrinsics the changes to the trig builtins to emit intrinsics\constraint intrinsics broke the WASM build. This is an attempt to preempt any such build break.
- #95082
- #94559 (comment)

farzonl requested review from ilovepi and efriedma-quic June 6, 2024 02:44

llvmbot added the clang, backend:X86, clang:frontend, clang:codegen, and llvm:ir labels Jun 6, 2024

farzonl self-assigned this Jun 6, 2024

farzonl mentioned this pull request Jun 6, 2024

[arm64] Add tan intrinsic lowering #94545

Merged

llvmbot added the backend:AArch64 label Jun 6, 2024

[clang] Add tanf16 builtin and support for tan constrained intrinsic

51247e4

bob80905 reviewed Jun 7, 2024

clang/lib/CodeGen/CGBuiltin.cpp

farzonl force-pushed the add-tan-constrained-intrinsic branch from ce4e2a0 to 35bbeed Compare June 7, 2024 20:19

llvmbot added the backend:ARM label Jun 7, 2024

farzonl commented Jun 7, 2024

llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

farzonl commented Jun 7, 2024

llvm/test/Feature/fp-intrinsics.ll

1. fix typos 2. add arm64 and x86 tests. 3. fix missing ISD::STRICT_FTAN in aarch64

af44e78

farzonl force-pushed the add-tan-constrained-intrinsic branch from 35bbeed to af44e78 Compare June 7, 2024 23:11

Add remaining tan constraint intrinsic tests for PowerPC, RISCV, and SystemZ

de01098

llvmbot added the backend:PowerPC and backend:SystemZ labels Jun 8, 2024

efriedma-quic approved these changes Jun 10, 2024


farzonl merged commit 189d471 into llvm:main Jun 11, 2024
10 checks passed

farzonl mentioned this pull request Jun 11, 2024

[WASM] Fix for wasi libc build break add tan to RuntimeLibcallSignatureTable #95082

Merged

HerrCai0907 mentioned this pull request Jun 13, 2024

tidy #95384

Closed

farzonl mentioned this pull request Jun 13, 2024

[ARM] Add tan intrinsic lowering #95439

Merged

farzonl mentioned this pull request Jun 26, 2024

[SLPVectorizer] Support SLPVectorizer cases of tan across all backends #95517

Merged

farzonl mentioned this pull request Jul 13, 2024

[WebAssembly] Add Support for Arc and Hyperbolic trig llvm intrinsics #98755

Merged

farzonl deleted the add-tan-constrained-intrinsic branch July 14, 2024 08:51
