-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[HLSL] Implement elementwise firstbitlow builtin #116858
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-clang-codegen Author: Ashley Coleman (V-FEXrt) ChangesCloses #99116
Patch is 33.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116858.diff 16 Files Affected:
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index e866605ac05c09..afe0a311f236d8 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -4810,6 +4810,12 @@ def HLSLFirstBitHigh : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLFirstBitLow : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_elementwise_firstbitlow"];
+ let Attributes = [NoThrow, Const];
+ let Prototype = "void(...)";
+}
+
def HLSLFrac : LangBuiltin<"HLSL_LANG"> {
let Spellings = ["__builtin_hlsl_elementwise_frac"];
let Attributes = [NoThrow, Const];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 9e0c0bff0125c0..a65b96684f18eb 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18956,7 +18956,6 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
"hlsl.dot4add.u8packed");
}
case Builtin::BI__builtin_hlsl_elementwise_firstbithigh: {
-
Value *X = EmitScalarExpr(E->getArg(0));
return Builder.CreateIntrinsic(
@@ -18964,6 +18963,14 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
getFirstBitHighIntrinsic(CGM.getHLSLRuntime(), E->getArg(0)->getType()),
ArrayRef<Value *>{X}, nullptr, "hlsl.firstbithigh");
}
+ case Builtin::BI__builtin_hlsl_elementwise_firstbitlow: {
+ Value *X = EmitScalarExpr(E->getArg(0));
+
+ return Builder.CreateIntrinsic(
+ /*ReturnType=*/ConvertType(E->getType()),
+ CGM.getHLSLRuntime().getFirstBitLowIntrinsic(), ArrayRef<Value *>{X},
+ nullptr, "hlsl.firstbitlow");
+ }
case Builtin::BI__builtin_hlsl_lerp: {
Value *X = EmitScalarExpr(E->getArg(0));
Value *Y = EmitScalarExpr(E->getArg(1));
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index 381a5959ec098e..c9eb1b08ff6ba6 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -96,6 +96,7 @@ class CGHLSLRuntime {
GENERATE_HLSL_INTRINSIC_FUNCTION(WaveReadLaneAt, wave_readlane)
GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitUHigh, firstbituhigh)
GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitSHigh, firstbitshigh)
+ GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitLow, firstbitlow)
GENERATE_HLSL_INTRINSIC_FUNCTION(NClamp, nclamp)
GENERATE_HLSL_INTRINSIC_FUNCTION(SClamp, sclamp)
GENERATE_HLSL_INTRINSIC_FUNCTION(UClamp, uclamp)
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 2ee3827d720495..610ddcea203d90 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -1086,6 +1086,78 @@ uint3 firstbithigh(uint64_t3);
_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbithigh)
uint4 firstbithigh(uint64_t4);
+//===----------------------------------------------------------------------===//
+// firstbitlow builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn T firstbitlow(T Val)
+/// \brief Returns the location of the first set bit starting from the lowest
+/// order bit and working upward, per component.
+/// \param Val the input value.
+
+#ifdef __HLSL_ENABLE_16_BIT
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int16_t);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int16_t2);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int16_t3);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int16_t4);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint16_t);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint16_t2);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint16_t3);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint16_t4);
+#endif
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int64_t);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int64_t2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int64_t3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int64_t4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint64_t);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint64_t2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint64_t3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint64_t4);
+
//===----------------------------------------------------------------------===//
// floor builtins
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 65b0d9cd65637f..c90ba9ae9e0f84 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -1947,7 +1947,8 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
return true;
break;
}
- case Builtin::BI__builtin_hlsl_elementwise_firstbithigh: {
+ case Builtin::BI__builtin_hlsl_elementwise_firstbithigh:
+ case Builtin::BI__builtin_hlsl_elementwise_firstbitlow: {
if (SemaRef.PrepareBuiltinElementwiseMathOneArgCall(TheCall))
return true;
diff --git a/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl b/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl
new file mode 100644
index 00000000000000..5d490fabc5bc8d
--- /dev/null
+++ b/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl
@@ -0,0 +1,153 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN: dxil-pc-shadermodel6.3-library %s -fnative-half-type \
+// RUN: -emit-llvm -disable-llvm-passes -o - | FileCheck %s -DTARGET=dx
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN: spirv-unknown-vulkan-compute %s -fnative-half-type \
+// RUN: -emit-llvm -disable-llvm-passes \
+// RUN: -o - | FileCheck %s -DTARGET=spv
+
+#ifdef __HLSL_ENABLE_16_BIT
+// CHECK-LABEL: test_firstbitlow_ushort
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i16
+uint test_firstbitlow_ushort(uint16_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i16
+uint2 test_firstbitlow_ushort2(uint16_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i16
+uint3 test_firstbitlow_ushort3(uint16_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i16
+uint4 test_firstbitlow_ushort4(uint16_t4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i16
+uint test_firstbitlow_short(int16_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i16
+uint2 test_firstbitlow_short2(int16_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i16
+uint3 test_firstbitlow_short3(int16_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i16
+uint4 test_firstbitlow_short4(int16_t4 p0) {
+ return firstbitlow(p0);
+}
+#endif // __HLSL_ENABLE_16_BIT
+
+// CHECK-LABEL: test_firstbitlow_uint
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i32
+uint test_firstbitlow_uint(uint p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i32
+uint2 test_firstbitlow_uint2(uint2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i32
+uint3 test_firstbitlow_uint3(uint3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i32
+uint4 test_firstbitlow_uint4(uint4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i64
+uint test_firstbitlow_ulong(uint64_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i64
+uint2 test_firstbitlow_ulong2(uint64_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i64
+uint3 test_firstbitlow_ulong3(uint64_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i64
+uint4 test_firstbitlow_ulong4(uint64_t4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i32
+uint test_firstbitlow_int(int p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i32
+uint2 test_firstbitlow_int2(int2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i32
+uint3 test_firstbitlow_int3(int3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i32
+uint4 test_firstbitlow_int4(int4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i64
+uint test_firstbitlow_long(int64_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i64
+uint2 test_firstbitlow_long2(int64_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i64
+uint3 test_firstbitlow_long3(int64_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i64
+uint4 test_firstbitlow_long4(int64_t4 p0) {
+ return firstbitlow(p0);
+}
diff --git a/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
index 1912ab3ae806b3..b4024418dbba4f 100644
--- a/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
+++ b/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
@@ -17,12 +17,10 @@ double test_int_builtin(double p0) {
double2 test_int_builtin_2(double2 p0) {
return __builtin_hlsl_elementwise_firstbithigh(p0);
- // expected-error@-1 {{1st argument must be a vector of integers
- // (was 'double2' (aka 'vector<double, 2>'))}}
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double2' (aka 'vector<double, 2>'))}}
}
float test_int_builtin_3(float p0) {
return __builtin_hlsl_elementwise_firstbithigh(p0);
- // expected-error@-1 {{1st argument must be a vector of integers
- // (was 'float')}}
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double')}}
}
diff --git a/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl
new file mode 100644
index 00000000000000..95c25e9e2fb60d
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify -verify-ignore-unexpected
+
+int test_too_few_arg() {
+ return firstbitlow();
+ // expected-error@-1 {{no matching function for call to 'firstbitlow'}}
+}
+
+int test_too_many_arg(int p0) {
+ return firstbitlow(p0, p0);
+ // expected-error@-1 {{no matching function for call to 'firstbitlow'}}
+}
+
+double test_int_builtin(double p0) {
+ return firstbitlow(p0);
+ // expected-error@-1 {{call to 'firstbitlow' is ambiguous}}
+}
+
+double2 test_int_builtin_2(double2 p0) {
+ return __builtin_hlsl_elementwise_firstbitlow(p0);
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double2' (aka 'vector<double, 2>'))}}
+}
+
+float test_int_builtin_3(float p0) {
+ return __builtin_hlsl_elementwise_firstbitlow(p0);
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double')}}
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index 6093664c908dc5..6148dcaff64470 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -103,4 +103,5 @@ def int_dx_splitdouble : DefaultAttrsIntrinsic<[llvm_anyint_ty, LLVMMatchType<0>
def int_dx_radians : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
def int_dx_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
def int_dx_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
+def int_dx_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
}
diff --git a/llvm/include/llvm/IR/IntrinsicsSPIRV.td b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
index f29eb7ee22b2d2..63e99524511143 100644
--- a/llvm/include/llvm/IR/IntrinsicsSPIRV.td
+++ b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
@@ -106,6 +106,7 @@ let TargetPrefix = "spv" in {
[IntrNoMem]>;
def int_spv_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
def int_spv_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
+ def int_spv_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
// Read a value from the image buffer. It does not translate directly to a
// single OpImageRead because the result type is not necessarily a 4 element
diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index 078f0591a67515..efdd91c9c27f62 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -564,6 +564,18 @@ def CountBits : DXILOp<31, unaryBits> {
let attributes = [Attributes<DXIL1_0, [ReadNone]>];
}
+def FirstbitLo : DXILOp<32, unaryBits> {
+ let Doc = "Returns the location of the first set bit starting from "
+ "the lowest order bit and working upward.";
+ let LLVMIntrinsic = int_dx_firstbitlow;
+ let arguments = [OverloadTy];
+ let result = Int32Ty;
+ let overloads =
+ [Overloads<DXIL1_0, [Int16Ty, Int32Ty, Int64Ty]>];
+ let stages = [Stages<DXIL1_0, [all_stages]>];
+ let attributes = [Attributes<DXIL1_0, [ReadNone]>];
+}
+
def FirstbitHi : DXILOp<33, unaryBits> {
let Doc = "Returns the location of the first set bit starting from "
"the highest order bit and working downward.";
diff --git a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
index b0436a39423405..c37e0dbb82e5a0 100644
--- a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
+++ b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
@@ -34,6 +34,7 @@ bool DirectXTTIImpl::isTargetIntrinsicTriviallyScalarizable(
case Intrinsic::dx_splitdouble:
case Intrinsic::dx_firstbituhigh:
case Intrinsic::dx_firstbitshigh:
+ case Intrinsic::dx_firstbitlow:
return true;
default:
return false;
diff --git a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
index 8a8835e0269200..c033f04305d0f0 100644
--- a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
@@ -106,6 +106,18 @@ class SPIRVInstructionSelector : public InstructionSelector {
bool selectFirstBitHigh64(Register ResVReg, const SPIRVType *ResType,
MachineInstr &I, bool IsSigned) const;
+ bool selectFirstBitLow(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
+ bool selectFirstBitLow16(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
+ bool selectFirstBitLow32(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I, Register SrcReg) const;
+
+ bool selectFirstBitLow64(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
bool selectGlobalValue(Register ResVReg, MachineInstr &I,
const MachineInstr *Init = nullptr) const;
@@ -2786,6 +2798,8 @@ bool SPIRVInstructionSelector::selectIntrinsic(Register ResVReg,
return selectFirstBitHigh(ResVReg, ResType, I, /*IsSigned=*/false);
case Intrinsic::spv_firstbitshigh: // There is no CL equivalent of FindSMsb
return selectFirstBitHigh(ResVReg, ResType, I, /*IsSigned=*/true);
+ case Intrinsic::spv_firstbitlow: // There is no CL equivlent of FindILsb
+ return selectFirstBitLow(ResVReg, ResType, I);
case Intrinsic::spv_group_memory_barrier_with_group_sync: {
bool Result = true;
auto MemSemConstant =
@@ -3158,6 +3172,166 @@ bool SPIRVInstructionSelector::selectFirstBitHigh(Register ResVReg,
}
}
+bool SPIRVInstructionSelector::selectFirstBitLow16(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I) const {
+ // OpUConvert treats the operand bits as an unsigned i16 and zero extends it
+ // to an unsigned i32. As this leaves all the least significant bits unchanged
+ // the first set bit from the LSB side doesn't change.
+ Register ExtReg = MRI->createVirtualRegister(GR.getRegClass(ResType));
+ bool Result = selectNAryOpWithSrcs(
+ ExtReg, ResType, I, {I.getOperand(2).getReg()}, SPIRV::OpUConvert);
+ return Result && selectFirstBitLow32(ResVReg, ResType, I, ExtReg);
+}
+
+bool SPIRVInstructionSelector::selectFirstBitLow32(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I,
+ Register SrcReg) const {
+ return BuildMI(*I.getParent(), I, I.getDebugLoc(), TII.get(SPIRV::OpExtInst))
+ .addDef(ResVReg)
+ .addUse(GR.getSPIRVTypeID(ResType))
+ .addImm(static_cast<uint32_t>(SPIRV::InstructionSet::GLSL_std_450))
+ .addImm(GL::FindILsb)
+ .addUse(SrcReg)
+ .constrainAllUses(TII, TRI, RBI);
+}
+
+bool SPIRVInstructionSelector::selectFirstBitLow64(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I) const {
+ Register OpReg = I.getOperand(2).getReg();
+
+ // 1. Split int64 into 2 pieces using a bitcast
+ unsigned ComponentCount = GR.getScalarOrVectorComponentCount(ResType);
+ SPIRVType *BaseType = GR.retrieveScalarOrVectorIntType(ResType);
...
[truncated]
|
@llvm/pr-subscribers-clang Author: Ashley Coleman (V-FEXrt) ChangesCloses #99116
Patch is 33.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116858.diff 16 Files Affected:
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index e866605ac05c09..afe0a311f236d8 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -4810,6 +4810,12 @@ def HLSLFirstBitHigh : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLFirstBitLow : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_elementwise_firstbitlow"];
+ let Attributes = [NoThrow, Const];
+ let Prototype = "void(...)";
+}
+
def HLSLFrac : LangBuiltin<"HLSL_LANG"> {
let Spellings = ["__builtin_hlsl_elementwise_frac"];
let Attributes = [NoThrow, Const];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 9e0c0bff0125c0..a65b96684f18eb 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18956,7 +18956,6 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
"hlsl.dot4add.u8packed");
}
case Builtin::BI__builtin_hlsl_elementwise_firstbithigh: {
-
Value *X = EmitScalarExpr(E->getArg(0));
return Builder.CreateIntrinsic(
@@ -18964,6 +18963,14 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
getFirstBitHighIntrinsic(CGM.getHLSLRuntime(), E->getArg(0)->getType()),
ArrayRef<Value *>{X}, nullptr, "hlsl.firstbithigh");
}
+ case Builtin::BI__builtin_hlsl_elementwise_firstbitlow: {
+ Value *X = EmitScalarExpr(E->getArg(0));
+
+ return Builder.CreateIntrinsic(
+ /*ReturnType=*/ConvertType(E->getType()),
+ CGM.getHLSLRuntime().getFirstBitLowIntrinsic(), ArrayRef<Value *>{X},
+ nullptr, "hlsl.firstbitlow");
+ }
case Builtin::BI__builtin_hlsl_lerp: {
Value *X = EmitScalarExpr(E->getArg(0));
Value *Y = EmitScalarExpr(E->getArg(1));
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index 381a5959ec098e..c9eb1b08ff6ba6 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -96,6 +96,7 @@ class CGHLSLRuntime {
GENERATE_HLSL_INTRINSIC_FUNCTION(WaveReadLaneAt, wave_readlane)
GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitUHigh, firstbituhigh)
GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitSHigh, firstbitshigh)
+ GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitLow, firstbitlow)
GENERATE_HLSL_INTRINSIC_FUNCTION(NClamp, nclamp)
GENERATE_HLSL_INTRINSIC_FUNCTION(SClamp, sclamp)
GENERATE_HLSL_INTRINSIC_FUNCTION(UClamp, uclamp)
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 2ee3827d720495..610ddcea203d90 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -1086,6 +1086,78 @@ uint3 firstbithigh(uint64_t3);
_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbithigh)
uint4 firstbithigh(uint64_t4);
+//===----------------------------------------------------------------------===//
+// firstbitlow builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn T firstbitlow(T Val)
+/// \brief Returns the location of the first set bit starting from the lowest
+/// order bit and working upward, per component.
+/// \param Val the input value.
+
+#ifdef __HLSL_ENABLE_16_BIT
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int16_t);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int16_t2);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int16_t3);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int16_t4);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint16_t);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint16_t2);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint16_t3);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint16_t4);
+#endif
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int64_t);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int64_t2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int64_t3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int64_t4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint64_t);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint64_t2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint64_t3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint64_t4);
+
//===----------------------------------------------------------------------===//
// floor builtins
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 65b0d9cd65637f..c90ba9ae9e0f84 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -1947,7 +1947,8 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
return true;
break;
}
- case Builtin::BI__builtin_hlsl_elementwise_firstbithigh: {
+ case Builtin::BI__builtin_hlsl_elementwise_firstbithigh:
+ case Builtin::BI__builtin_hlsl_elementwise_firstbitlow: {
if (SemaRef.PrepareBuiltinElementwiseMathOneArgCall(TheCall))
return true;
diff --git a/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl b/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl
new file mode 100644
index 00000000000000..5d490fabc5bc8d
--- /dev/null
+++ b/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl
@@ -0,0 +1,153 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN: dxil-pc-shadermodel6.3-library %s -fnative-half-type \
+// RUN: -emit-llvm -disable-llvm-passes -o - | FileCheck %s -DTARGET=dx
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN: spirv-unknown-vulkan-compute %s -fnative-half-type \
+// RUN: -emit-llvm -disable-llvm-passes \
+// RUN: -o - | FileCheck %s -DTARGET=spv
+
+#ifdef __HLSL_ENABLE_16_BIT
+// CHECK-LABEL: test_firstbitlow_ushort
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i16
+uint test_firstbitlow_ushort(uint16_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i16
+uint2 test_firstbitlow_ushort2(uint16_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i16
+uint3 test_firstbitlow_ushort3(uint16_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i16
+uint4 test_firstbitlow_ushort4(uint16_t4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i16
+uint test_firstbitlow_short(int16_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i16
+uint2 test_firstbitlow_short2(int16_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i16
+uint3 test_firstbitlow_short3(int16_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i16
+uint4 test_firstbitlow_short4(int16_t4 p0) {
+ return firstbitlow(p0);
+}
+#endif // __HLSL_ENABLE_16_BIT
+
+// CHECK-LABEL: test_firstbitlow_uint
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i32
+uint test_firstbitlow_uint(uint p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i32
+uint2 test_firstbitlow_uint2(uint2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i32
+uint3 test_firstbitlow_uint3(uint3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i32
+uint4 test_firstbitlow_uint4(uint4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i64
+uint test_firstbitlow_ulong(uint64_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i64
+uint2 test_firstbitlow_ulong2(uint64_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i64
+uint3 test_firstbitlow_ulong3(uint64_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i64
+uint4 test_firstbitlow_ulong4(uint64_t4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i32
+uint test_firstbitlow_int(int p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i32
+uint2 test_firstbitlow_int2(int2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i32
+uint3 test_firstbitlow_int3(int3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i32
+uint4 test_firstbitlow_int4(int4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i64
+uint test_firstbitlow_long(int64_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i64
+uint2 test_firstbitlow_long2(int64_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i64
+uint3 test_firstbitlow_long3(int64_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i64
+uint4 test_firstbitlow_long4(int64_t4 p0) {
+ return firstbitlow(p0);
+}
diff --git a/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
index 1912ab3ae806b3..b4024418dbba4f 100644
--- a/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
+++ b/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
@@ -17,12 +17,10 @@ double test_int_builtin(double p0) {
double2 test_int_builtin_2(double2 p0) {
return __builtin_hlsl_elementwise_firstbithigh(p0);
- // expected-error@-1 {{1st argument must be a vector of integers
- // (was 'double2' (aka 'vector<double, 2>'))}}
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double2' (aka 'vector<double, 2>'))}}
}
float test_int_builtin_3(float p0) {
return __builtin_hlsl_elementwise_firstbithigh(p0);
- // expected-error@-1 {{1st argument must be a vector of integers
- // (was 'float')}}
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double')}}
}
diff --git a/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl
new file mode 100644
index 00000000000000..95c25e9e2fb60d
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify -verify-ignore-unexpected
+
+int test_too_few_arg() {
+ return firstbitlow();
+ // expected-error@-1 {{no matching function for call to 'firstbitlow'}}
+}
+
+int test_too_many_arg(int p0) {
+ return firstbitlow(p0, p0);
+ // expected-error@-1 {{no matching function for call to 'firstbitlow'}}
+}
+
+double test_int_builtin(double p0) {
+ return firstbitlow(p0);
+ // expected-error@-1 {{call to 'firstbitlow' is ambiguous}}
+}
+
+double2 test_int_builtin_2(double2 p0) {
+ return __builtin_hlsl_elementwise_firstbitlow(p0);
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double2' (aka 'vector<double, 2>'))}}
+}
+
+float test_int_builtin_3(float p0) {
+ return __builtin_hlsl_elementwise_firstbitlow(p0);
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double')}}
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index 6093664c908dc5..6148dcaff64470 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -103,4 +103,5 @@ def int_dx_splitdouble : DefaultAttrsIntrinsic<[llvm_anyint_ty, LLVMMatchType<0>
def int_dx_radians : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
def int_dx_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
def int_dx_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
+def int_dx_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
}
diff --git a/llvm/include/llvm/IR/IntrinsicsSPIRV.td b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
index f29eb7ee22b2d2..63e99524511143 100644
--- a/llvm/include/llvm/IR/IntrinsicsSPIRV.td
+++ b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
@@ -106,6 +106,7 @@ let TargetPrefix = "spv" in {
[IntrNoMem]>;
def int_spv_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
def int_spv_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
+ def int_spv_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
// Read a value from the image buffer. It does not translate directly to a
// single OpImageRead because the result type is not necessarily a 4 element
diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index 078f0591a67515..efdd91c9c27f62 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -564,6 +564,18 @@ def CountBits : DXILOp<31, unaryBits> {
let attributes = [Attributes<DXIL1_0, [ReadNone]>];
}
+def FirstbitLo : DXILOp<32, unaryBits> {
+ let Doc = "Returns the location of the first set bit starting from "
+ "the lowest order bit and working upward.";
+ let LLVMIntrinsic = int_dx_firstbitlow;
+ let arguments = [OverloadTy];
+ let result = Int32Ty;
+ let overloads =
+ [Overloads<DXIL1_0, [Int16Ty, Int32Ty, Int64Ty]>];
+ let stages = [Stages<DXIL1_0, [all_stages]>];
+ let attributes = [Attributes<DXIL1_0, [ReadNone]>];
+}
+
def FirstbitHi : DXILOp<33, unaryBits> {
let Doc = "Returns the location of the first set bit starting from "
"the highest order bit and working downward.";
diff --git a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
index b0436a39423405..c37e0dbb82e5a0 100644
--- a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
+++ b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
@@ -34,6 +34,7 @@ bool DirectXTTIImpl::isTargetIntrinsicTriviallyScalarizable(
case Intrinsic::dx_splitdouble:
case Intrinsic::dx_firstbituhigh:
case Intrinsic::dx_firstbitshigh:
+ case Intrinsic::dx_firstbitlow:
return true;
default:
return false;
diff --git a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
index 8a8835e0269200..c033f04305d0f0 100644
--- a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
@@ -106,6 +106,18 @@ class SPIRVInstructionSelector : public InstructionSelector {
bool selectFirstBitHigh64(Register ResVReg, const SPIRVType *ResType,
MachineInstr &I, bool IsSigned) const;
+ bool selectFirstBitLow(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
+ bool selectFirstBitLow16(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
+ bool selectFirstBitLow32(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I, Register SrcReg) const;
+
+ bool selectFirstBitLow64(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
bool selectGlobalValue(Register ResVReg, MachineInstr &I,
const MachineInstr *Init = nullptr) const;
@@ -2786,6 +2798,8 @@ bool SPIRVInstructionSelector::selectIntrinsic(Register ResVReg,
return selectFirstBitHigh(ResVReg, ResType, I, /*IsSigned=*/false);
case Intrinsic::spv_firstbitshigh: // There is no CL equivalent of FindSMsb
return selectFirstBitHigh(ResVReg, ResType, I, /*IsSigned=*/true);
+ case Intrinsic::spv_firstbitlow: // There is no CL equivlent of FindILsb
+ return selectFirstBitLow(ResVReg, ResType, I);
case Intrinsic::spv_group_memory_barrier_with_group_sync: {
bool Result = true;
auto MemSemConstant =
@@ -3158,6 +3172,166 @@ bool SPIRVInstructionSelector::selectFirstBitHigh(Register ResVReg,
}
}
+bool SPIRVInstructionSelector::selectFirstBitLow16(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I) const {
+ // OpUConvert treats the operand bits as an unsigned i16 and zero extends it
+ // to an unsigned i32. As this leaves all the least significant bits unchanged
+ // the first set bit from the LSB side doesn't change.
+ Register ExtReg = MRI->createVirtualRegister(GR.getRegClass(ResType));
+ bool Result = selectNAryOpWithSrcs(
+ ExtReg, ResType, I, {I.getOperand(2).getReg()}, SPIRV::OpUConvert);
+ return Result && selectFirstBitLow32(ResVReg, ResType, I, ExtReg);
+}
+
+bool SPIRVInstructionSelector::selectFirstBitLow32(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I,
+ Register SrcReg) const {
+ return BuildMI(*I.getParent(), I, I.getDebugLoc(), TII.get(SPIRV::OpExtInst))
+ .addDef(ResVReg)
+ .addUse(GR.getSPIRVTypeID(ResType))
+ .addImm(static_cast<uint32_t>(SPIRV::InstructionSet::GLSL_std_450))
+ .addImm(GL::FindILsb)
+ .addUse(SrcReg)
+ .constrainAllUses(TII, TRI, RBI);
+}
+
+bool SPIRVInstructionSelector::selectFirstBitLow64(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I) const {
+ Register OpReg = I.getOperand(2).getReg();
+
+ // 1. Split int64 into 2 pieces using a bitcast
+ unsigned ComponentCount = GR.getScalarOrVectorComponentCount(ResType);
+ SPIRVType *BaseType = GR.retrieveScalarOrVectorIntType(ResType);
...
[truncated]
|
@llvm/pr-subscribers-hlsl Author: Ashley Coleman (V-FEXrt) ChangesCloses #99116
Patch is 33.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116858.diff 16 Files Affected:
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index e866605ac05c09..afe0a311f236d8 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -4810,6 +4810,12 @@ def HLSLFirstBitHigh : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLFirstBitLow : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_elementwise_firstbitlow"];
+ let Attributes = [NoThrow, Const];
+ let Prototype = "void(...)";
+}
+
def HLSLFrac : LangBuiltin<"HLSL_LANG"> {
let Spellings = ["__builtin_hlsl_elementwise_frac"];
let Attributes = [NoThrow, Const];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 9e0c0bff0125c0..a65b96684f18eb 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18956,7 +18956,6 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
"hlsl.dot4add.u8packed");
}
case Builtin::BI__builtin_hlsl_elementwise_firstbithigh: {
-
Value *X = EmitScalarExpr(E->getArg(0));
return Builder.CreateIntrinsic(
@@ -18964,6 +18963,14 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
getFirstBitHighIntrinsic(CGM.getHLSLRuntime(), E->getArg(0)->getType()),
ArrayRef<Value *>{X}, nullptr, "hlsl.firstbithigh");
}
+ case Builtin::BI__builtin_hlsl_elementwise_firstbitlow: {
+ Value *X = EmitScalarExpr(E->getArg(0));
+
+ return Builder.CreateIntrinsic(
+ /*ReturnType=*/ConvertType(E->getType()),
+ CGM.getHLSLRuntime().getFirstBitLowIntrinsic(), ArrayRef<Value *>{X},
+ nullptr, "hlsl.firstbitlow");
+ }
case Builtin::BI__builtin_hlsl_lerp: {
Value *X = EmitScalarExpr(E->getArg(0));
Value *Y = EmitScalarExpr(E->getArg(1));
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index 381a5959ec098e..c9eb1b08ff6ba6 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -96,6 +96,7 @@ class CGHLSLRuntime {
GENERATE_HLSL_INTRINSIC_FUNCTION(WaveReadLaneAt, wave_readlane)
GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitUHigh, firstbituhigh)
GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitSHigh, firstbitshigh)
+ GENERATE_HLSL_INTRINSIC_FUNCTION(FirstBitLow, firstbitlow)
GENERATE_HLSL_INTRINSIC_FUNCTION(NClamp, nclamp)
GENERATE_HLSL_INTRINSIC_FUNCTION(SClamp, sclamp)
GENERATE_HLSL_INTRINSIC_FUNCTION(UClamp, uclamp)
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 2ee3827d720495..610ddcea203d90 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -1086,6 +1086,78 @@ uint3 firstbithigh(uint64_t3);
_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbithigh)
uint4 firstbithigh(uint64_t4);
+//===----------------------------------------------------------------------===//
+// firstbitlow builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn T firstbitlow(T Val)
+/// \brief Returns the location of the first set bit starting from the lowest
+/// order bit and working upward, per component.
+/// \param Val the input value.
+
+#ifdef __HLSL_ENABLE_16_BIT
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int16_t);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int16_t2);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int16_t3);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int16_t4);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint16_t);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint16_t2);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint16_t3);
+_HLSL_AVAILABILITY(shadermodel, 6.2)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint16_t4);
+#endif
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(int64_t);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(int64_t2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(int64_t3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(int64_t4);
+
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint firstbitlow(uint64_t);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint2 firstbitlow(uint64_t2);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint3 firstbitlow(uint64_t3);
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_elementwise_firstbitlow)
+uint4 firstbitlow(uint64_t4);
+
//===----------------------------------------------------------------------===//
// floor builtins
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 65b0d9cd65637f..c90ba9ae9e0f84 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -1947,7 +1947,8 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
return true;
break;
}
- case Builtin::BI__builtin_hlsl_elementwise_firstbithigh: {
+ case Builtin::BI__builtin_hlsl_elementwise_firstbithigh:
+ case Builtin::BI__builtin_hlsl_elementwise_firstbitlow: {
if (SemaRef.PrepareBuiltinElementwiseMathOneArgCall(TheCall))
return true;
diff --git a/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl b/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl
new file mode 100644
index 00000000000000..5d490fabc5bc8d
--- /dev/null
+++ b/clang/test/CodeGenHLSL/builtins/firstbitlow.hlsl
@@ -0,0 +1,153 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN: dxil-pc-shadermodel6.3-library %s -fnative-half-type \
+// RUN: -emit-llvm -disable-llvm-passes -o - | FileCheck %s -DTARGET=dx
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN: spirv-unknown-vulkan-compute %s -fnative-half-type \
+// RUN: -emit-llvm -disable-llvm-passes \
+// RUN: -o - | FileCheck %s -DTARGET=spv
+
+#ifdef __HLSL_ENABLE_16_BIT
+// CHECK-LABEL: test_firstbitlow_ushort
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i16
+uint test_firstbitlow_ushort(uint16_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i16
+uint2 test_firstbitlow_ushort2(uint16_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i16
+uint3 test_firstbitlow_ushort3(uint16_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ushort4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i16
+uint4 test_firstbitlow_ushort4(uint16_t4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i16
+uint test_firstbitlow_short(int16_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i16
+uint2 test_firstbitlow_short2(int16_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i16
+uint3 test_firstbitlow_short3(int16_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_short4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i16
+uint4 test_firstbitlow_short4(int16_t4 p0) {
+ return firstbitlow(p0);
+}
+#endif // __HLSL_ENABLE_16_BIT
+
+// CHECK-LABEL: test_firstbitlow_uint
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i32
+uint test_firstbitlow_uint(uint p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i32
+uint2 test_firstbitlow_uint2(uint2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i32
+uint3 test_firstbitlow_uint3(uint3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_uint4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i32
+uint4 test_firstbitlow_uint4(uint4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i64
+uint test_firstbitlow_ulong(uint64_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i64
+uint2 test_firstbitlow_ulong2(uint64_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i64
+uint3 test_firstbitlow_ulong3(uint64_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_ulong4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i64
+uint4 test_firstbitlow_ulong4(uint64_t4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i32
+uint test_firstbitlow_int(int p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i32
+uint2 test_firstbitlow_int2(int2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i32
+uint3 test_firstbitlow_int3(int3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_int4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i32
+uint4 test_firstbitlow_int4(int4 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long
+// CHECK: call i32 @llvm.[[TARGET]].firstbitlow.i64
+uint test_firstbitlow_long(int64_t p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long2
+// CHECK: call <2 x i32> @llvm.[[TARGET]].firstbitlow.v2i64
+uint2 test_firstbitlow_long2(int64_t2 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long3
+// CHECK: call <3 x i32> @llvm.[[TARGET]].firstbitlow.v3i64
+uint3 test_firstbitlow_long3(int64_t3 p0) {
+ return firstbitlow(p0);
+}
+
+// CHECK-LABEL: test_firstbitlow_long4
+// CHECK: call <4 x i32> @llvm.[[TARGET]].firstbitlow.v4i64
+uint4 test_firstbitlow_long4(int64_t4 p0) {
+ return firstbitlow(p0);
+}
diff --git a/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
index 1912ab3ae806b3..b4024418dbba4f 100644
--- a/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
+++ b/clang/test/SemaHLSL/BuiltIns/firstbithigh-errors.hlsl
@@ -17,12 +17,10 @@ double test_int_builtin(double p0) {
double2 test_int_builtin_2(double2 p0) {
return __builtin_hlsl_elementwise_firstbithigh(p0);
- // expected-error@-1 {{1st argument must be a vector of integers
- // (was 'double2' (aka 'vector<double, 2>'))}}
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double2' (aka 'vector<double, 2>'))}}
}
float test_int_builtin_3(float p0) {
return __builtin_hlsl_elementwise_firstbithigh(p0);
- // expected-error@-1 {{1st argument must be a vector of integers
- // (was 'float')}}
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double')}}
}
diff --git a/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl
new file mode 100644
index 00000000000000..95c25e9e2fb60d
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/firstbitlow-errors.hlsl
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify -verify-ignore-unexpected
+
+int test_too_few_arg() {
+ return firstbitlow();
+ // expected-error@-1 {{no matching function for call to 'firstbitlow'}}
+}
+
+int test_too_many_arg(int p0) {
+ return firstbitlow(p0, p0);
+ // expected-error@-1 {{no matching function for call to 'firstbitlow'}}
+}
+
+double test_int_builtin(double p0) {
+ return firstbitlow(p0);
+ // expected-error@-1 {{call to 'firstbitlow' is ambiguous}}
+}
+
+double2 test_int_builtin_2(double2 p0) {
+ return __builtin_hlsl_elementwise_firstbitlow(p0);
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double2' (aka 'vector<double, 2>'))}}
+}
+
+float test_int_builtin_3(float p0) {
+ return __builtin_hlsl_elementwise_firstbitlow(p0);
+ // expected-error@-1 {{1st argument must be a vector of integers (was 'double')}}
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index 6093664c908dc5..6148dcaff64470 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -103,4 +103,5 @@ def int_dx_splitdouble : DefaultAttrsIntrinsic<[llvm_anyint_ty, LLVMMatchType<0>
def int_dx_radians : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
def int_dx_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
def int_dx_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
+def int_dx_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
}
diff --git a/llvm/include/llvm/IR/IntrinsicsSPIRV.td b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
index f29eb7ee22b2d2..63e99524511143 100644
--- a/llvm/include/llvm/IR/IntrinsicsSPIRV.td
+++ b/llvm/include/llvm/IR/IntrinsicsSPIRV.td
@@ -106,6 +106,7 @@ let TargetPrefix = "spv" in {
[IntrNoMem]>;
def int_spv_firstbituhigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
def int_spv_firstbitshigh : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
+ def int_spv_firstbitlow : DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i32_ty>], [llvm_anyint_ty], [IntrNoMem]>;
// Read a value from the image buffer. It does not translate directly to a
// single OpImageRead because the result type is not necessarily a 4 element
diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index 078f0591a67515..efdd91c9c27f62 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -564,6 +564,18 @@ def CountBits : DXILOp<31, unaryBits> {
let attributes = [Attributes<DXIL1_0, [ReadNone]>];
}
+def FirstbitLo : DXILOp<32, unaryBits> {
+ let Doc = "Returns the location of the first set bit starting from "
+ "the lowest order bit and working upward.";
+ let LLVMIntrinsic = int_dx_firstbitlow;
+ let arguments = [OverloadTy];
+ let result = Int32Ty;
+ let overloads =
+ [Overloads<DXIL1_0, [Int16Ty, Int32Ty, Int64Ty]>];
+ let stages = [Stages<DXIL1_0, [all_stages]>];
+ let attributes = [Attributes<DXIL1_0, [ReadNone]>];
+}
+
def FirstbitHi : DXILOp<33, unaryBits> {
let Doc = "Returns the location of the first set bit starting from "
"the highest order bit and working downward.";
diff --git a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
index b0436a39423405..c37e0dbb82e5a0 100644
--- a/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
+++ b/llvm/lib/Target/DirectX/DirectXTargetTransformInfo.cpp
@@ -34,6 +34,7 @@ bool DirectXTTIImpl::isTargetIntrinsicTriviallyScalarizable(
case Intrinsic::dx_splitdouble:
case Intrinsic::dx_firstbituhigh:
case Intrinsic::dx_firstbitshigh:
+ case Intrinsic::dx_firstbitlow:
return true;
default:
return false;
diff --git a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
index 8a8835e0269200..c033f04305d0f0 100644
--- a/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
@@ -106,6 +106,18 @@ class SPIRVInstructionSelector : public InstructionSelector {
bool selectFirstBitHigh64(Register ResVReg, const SPIRVType *ResType,
MachineInstr &I, bool IsSigned) const;
+ bool selectFirstBitLow(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
+ bool selectFirstBitLow16(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
+ bool selectFirstBitLow32(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I, Register SrcReg) const;
+
+ bool selectFirstBitLow64(Register ResVReg, const SPIRVType *ResType,
+ MachineInstr &I) const;
+
bool selectGlobalValue(Register ResVReg, MachineInstr &I,
const MachineInstr *Init = nullptr) const;
@@ -2786,6 +2798,8 @@ bool SPIRVInstructionSelector::selectIntrinsic(Register ResVReg,
return selectFirstBitHigh(ResVReg, ResType, I, /*IsSigned=*/false);
case Intrinsic::spv_firstbitshigh: // There is no CL equivalent of FindSMsb
return selectFirstBitHigh(ResVReg, ResType, I, /*IsSigned=*/true);
+ case Intrinsic::spv_firstbitlow: // There is no CL equivlent of FindILsb
+ return selectFirstBitLow(ResVReg, ResType, I);
case Intrinsic::spv_group_memory_barrier_with_group_sync: {
bool Result = true;
auto MemSemConstant =
@@ -3158,6 +3172,166 @@ bool SPIRVInstructionSelector::selectFirstBitHigh(Register ResVReg,
}
}
+bool SPIRVInstructionSelector::selectFirstBitLow16(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I) const {
+ // OpUConvert treats the operand bits as an unsigned i16 and zero extends it
+ // to an unsigned i32. As this leaves all the least significant bits unchanged
+ // the first set bit from the LSB side doesn't change.
+ Register ExtReg = MRI->createVirtualRegister(GR.getRegClass(ResType));
+ bool Result = selectNAryOpWithSrcs(
+ ExtReg, ResType, I, {I.getOperand(2).getReg()}, SPIRV::OpUConvert);
+ return Result && selectFirstBitLow32(ResVReg, ResType, I, ExtReg);
+}
+
+bool SPIRVInstructionSelector::selectFirstBitLow32(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I,
+ Register SrcReg) const {
+ return BuildMI(*I.getParent(), I, I.getDebugLoc(), TII.get(SPIRV::OpExtInst))
+ .addDef(ResVReg)
+ .addUse(GR.getSPIRVTypeID(ResType))
+ .addImm(static_cast<uint32_t>(SPIRV::InstructionSet::GLSL_std_450))
+ .addImm(GL::FindILsb)
+ .addUse(SrcReg)
+ .constrainAllUses(TII, TRI, RBI);
+}
+
+bool SPIRVInstructionSelector::selectFirstBitLow64(Register ResVReg,
+ const SPIRVType *ResType,
+ MachineInstr &I) const {
+ Register OpReg = I.getOperand(2).getReg();
+
+ // 1. Split int64 into 2 pieces using a bitcast
+ unsigned ComponentCount = GR.getScalarOrVectorComponentCount(ResType);
+ SPIRVType *BaseType = GR.retrieveScalarOrVectorIntType(ResType);
...
[truncated]
|
} | ||
|
||
float test_int_builtin_3(float p0) { | ||
return __builtin_hlsl_elementwise_firstbithigh(p0); | ||
// expected-error@-1 {{1st argument must be a vector of integers | ||
// (was 'float')}} | ||
// expected-error@-1 {{1st argument must be a vector of integers (was 'double')}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test framework doesn't actually assert these lines when split. I had to unsplit them which raised a test failure that was then fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you can do a \
continuation for multiple expected directives, but I don't know if you can do that inside the match string. For example I know this works:
<some_bad_code> // expected-error {{ first error }} \
// expected-error {{ second error }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hoping we can make this simpler than the Msb builtins. I'll double check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only potential problem with the vector sizes. Sorry I did not catch that on the previous PR. I would also like to see the code refactored a little to reduce the amount of repeated code. I think each step except for 2 in selectFirstBit*64
could factored into a function call. The difference between the High and Low is a small change in the parameters to those function calls.
417133d
to
27707e3
Compare
1b37328
to
a90026c
Compare
; CHECK: [[right_res:%.+]] = OpIAdd [[u32x2_t]] [[right_ans_offset]] [[right_ans_bits]] | ||
|
||
; Merge the resulting i32, i32x2 into the final i32x3 and return it | ||
; CHECK: [[ret:%.+]] = OpCompositeConstruct [[u32x3_t]] [[left_res]] [[right_res]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm allowed to use OpCompositeConstruct
this way but would be good if someone could verify
Usages
u32x3 res = OpCompositeConstruct (left: u32) (right: u32x2)
u32x4 res = OpCompositeConstruct (left: u32x2) (right: u32x2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good to me. Thanks for fixing up the vector size issue. I have a couple minor issue, but I'm fine with this.
The spir-v test failures are unrelated to your change. It is caused by a problem in spirv-val, and it has been fixed.Don't let that stop you from merging. |
Yeah I figured they weren't broken by me but I had assumed I needed all green before merging so was waiting for the fix to make it into main but happy to merge as is if that's acceptable |
@V-FEXrt Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/11787 Here is the relevant piece of the build log for the reference
|
Closes #99116
This PR extracts common functionality from
firstbithigh
into a shared function while also fixing a bug for an edge case whereu64x3
and larger vectors will attempt to create vectors larger than the SPRIV max of 4. Because of the shared codefirstbithigh
also picks up this fix.