-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[HLSL] Implement the dst HLSL Function #133828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-hlsl @llvm/pr-subscribers-backend-x86 Author: None (metkarpoonam) ChangesImplement dst algorithm in the hlsl_intrinsics.h and added test cases for HLSL codegen and sema Full diff: https://github.com/llvm/llvm-project/pull/133828.diff 4 Files Affected:
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
index 8cdd63d7e07bb..5ea8faf169380 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
@@ -35,6 +35,12 @@ length_vec_impl(vector<T, N> X) {
#endif
}
+template <typename T>
+constexpr vector<T, 4> dst_impl(vector<T, 4> src0, vector<T, 4> src1) {
+ vector<T, 4> dest = {1, src0[1] * src1[1], src0[2], src1[3]};
+ return dest;
+}
+
template <typename T> constexpr T distance_impl(T X, T Y) {
return length_impl(X - Y);
}
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index fd799b8d874ae..7ae94731234f9 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -174,6 +174,31 @@ const inline float distance(__detail::HLSL_FIXED_VECTOR<float, N> X,
__detail::HLSL_FIXED_VECTOR<float, N> Y) {
return __detail::distance_vec_impl(X, Y);
}
+//===----------------------------------------------------------------------===//
+// dst builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn fvector dst( fvector, fvector)
+/// \brief Returns the length of a vector
+/// \param src0 [in] The first vector contain {_, d*d, d*d, _}
+/// \param src1 [in] The second vector contain {_, 1/d, _, 1/d}
+///
+/// Return the computed distance vector contain {1, d, d*d, 1/d}
+
+_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
+const inline vector<half, 4> dst(vector<half, 4> src0, vector<half, 4> src1) {
+ return __detail::dst_impl(src0, src1);
+}
+
+const inline vector<float, 4> dst(vector<float, 4> src0,
+ vector<float, 4> src1) {
+ return __detail::dst_impl(src0, src1);
+}
+
+const inline vector<double, 4> dst(vector<double, 4> src0,
+ vector<double, 4> src1) {
+ return __detail::dst_impl(src0, src1);
+}
//===----------------------------------------------------------------------===//
// fmod builtins
diff --git a/clang/test/CodeGenHLSL/builtins/dst.hlsl b/clang/test/CodeGenHLSL/builtins/dst.hlsl
new file mode 100644
index 0000000000000..c62c9be5b0c1d
--- /dev/null
+++ b/clang/test/CodeGenHLSL/builtins/dst.hlsl
@@ -0,0 +1,48 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.2-library %s -fnative-half-type -emit-llvm -O1 -o - | FileCheck %s
+
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x float> @_Z12dstWithFloatDv4_fS_(
+// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x float> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x float> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x float> <float 1.000000e+00, float poison, float poison, float poison>, float [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x float> [[VECINIT]], <4 x float> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x float> [[VECINIT3]], <4 x float> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x float> [[VECINIT5]]
+
+float4 dstWithFloat(float4 p1, float4 p2)
+{
+ return dst(p1, p2);
+}
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x half> @_Z11dstwithHalfDv4_DhS_(
+// CHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P:%.*]], <4 x half> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x half> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x half> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x half> <half 0xH3C00, half poison, half poison, half poison>, half [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x half> [[VECINIT]], <4 x half> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x half> [[VECINIT3]], <4 x half> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x half> [[VECINIT5]]
+half4 dstwithHalf(half4 p1, half4 p2)
+{
+ return dst(p1, p2);
+}
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x double> @_Z13dstWithDoubleDv4_dS_(
+// CHECK-SAME: <4 x double> noundef nofpclass(nan inf) [[P:%.*]], <4 x double> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x double> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x double> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x double> <double 1.000000e+00, double poison, double poison, double poison>, double [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x double> [[VECINIT]], <4 x double> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x double> [[VECINIT3]], <4 x double> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x double> [[VECINIT5]]
+double4 dstWithDouble(double4 p1, double4 p2)
+{
+ return dst(p1, p2);
+}
diff --git a/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl b/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl
new file mode 100644
index 0000000000000..6bff46ffc223b
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl
@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify
+
+float4 test_too_many_arg(float4 p0)
+{
+ dst(p0, p0, p0);
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+}
+
+float4 test_no_second_arg(float4 p0)
+{
+ return dst(p0);
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+}
+
+float4 test_no_args()
+{
+ return dst();
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+}
+
+float4 test_3_components(float3 p0, float3 p1)
+{
+ return dst(p0, p1);
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<[...], 3>' to 'vector<[...], 4>' for 1st argument}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<float, 3>' to 'vector<half, 4>' for 1st argument}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<float, 3>' to 'vector<double, 4>' for 1st argument}}
+}
|
@llvm/pr-subscribers-clang Author: None (metkarpoonam) ChangesImplement dst algorithm in the hlsl_intrinsics.h and added test cases for HLSL codegen and sema Full diff: https://github.com/llvm/llvm-project/pull/133828.diff 4 Files Affected:
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
index 8cdd63d7e07bb..5ea8faf169380 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
@@ -35,6 +35,12 @@ length_vec_impl(vector<T, N> X) {
#endif
}
+template <typename T>
+constexpr vector<T, 4> dst_impl(vector<T, 4> src0, vector<T, 4> src1) {
+ vector<T, 4> dest = {1, src0[1] * src1[1], src0[2], src1[3]};
+ return dest;
+}
+
template <typename T> constexpr T distance_impl(T X, T Y) {
return length_impl(X - Y);
}
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index fd799b8d874ae..7ae94731234f9 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -174,6 +174,31 @@ const inline float distance(__detail::HLSL_FIXED_VECTOR<float, N> X,
__detail::HLSL_FIXED_VECTOR<float, N> Y) {
return __detail::distance_vec_impl(X, Y);
}
+//===----------------------------------------------------------------------===//
+// dst builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn fvector dst( fvector, fvector)
+/// \brief Returns the length of a vector
+/// \param src0 [in] The first vector contain {_, d*d, d*d, _}
+/// \param src1 [in] The second vector contain {_, 1/d, _, 1/d}
+///
+/// Return the computed distance vector contain {1, d, d*d, 1/d}
+
+_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
+const inline vector<half, 4> dst(vector<half, 4> src0, vector<half, 4> src1) {
+ return __detail::dst_impl(src0, src1);
+}
+
+const inline vector<float, 4> dst(vector<float, 4> src0,
+ vector<float, 4> src1) {
+ return __detail::dst_impl(src0, src1);
+}
+
+const inline vector<double, 4> dst(vector<double, 4> src0,
+ vector<double, 4> src1) {
+ return __detail::dst_impl(src0, src1);
+}
//===----------------------------------------------------------------------===//
// fmod builtins
diff --git a/clang/test/CodeGenHLSL/builtins/dst.hlsl b/clang/test/CodeGenHLSL/builtins/dst.hlsl
new file mode 100644
index 0000000000000..c62c9be5b0c1d
--- /dev/null
+++ b/clang/test/CodeGenHLSL/builtins/dst.hlsl
@@ -0,0 +1,48 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.2-library %s -fnative-half-type -emit-llvm -O1 -o - | FileCheck %s
+
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x float> @_Z12dstWithFloatDv4_fS_(
+// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x float> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x float> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x float> <float 1.000000e+00, float poison, float poison, float poison>, float [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x float> [[VECINIT]], <4 x float> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x float> [[VECINIT3]], <4 x float> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x float> [[VECINIT5]]
+
+float4 dstWithFloat(float4 p1, float4 p2)
+{
+ return dst(p1, p2);
+}
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x half> @_Z11dstwithHalfDv4_DhS_(
+// CHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P:%.*]], <4 x half> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x half> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x half> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x half> <half 0xH3C00, half poison, half poison, half poison>, half [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x half> [[VECINIT]], <4 x half> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x half> [[VECINIT3]], <4 x half> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x half> [[VECINIT5]]
+half4 dstwithHalf(half4 p1, half4 p2)
+{
+ return dst(p1, p2);
+}
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x double> @_Z13dstWithDoubleDv4_dS_(
+// CHECK-SAME: <4 x double> noundef nofpclass(nan inf) [[P:%.*]], <4 x double> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x double> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x double> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x double> <double 1.000000e+00, double poison, double poison, double poison>, double [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x double> [[VECINIT]], <4 x double> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x double> [[VECINIT3]], <4 x double> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x double> [[VECINIT5]]
+double4 dstWithDouble(double4 p1, double4 p2)
+{
+ return dst(p1, p2);
+}
diff --git a/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl b/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl
new file mode 100644
index 0000000000000..6bff46ffc223b
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl
@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify
+
+float4 test_too_many_arg(float4 p0)
+{
+ dst(p0, p0, p0);
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+}
+
+float4 test_no_second_arg(float4 p0)
+{
+ return dst(p0);
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+}
+
+float4 test_no_args()
+{
+ return dst();
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+}
+
+float4 test_3_components(float3 p0, float3 p1)
+{
+ return dst(p0, p1);
+ // expected-error@-1 {{no matching function for call to 'dst'}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<[...], 3>' to 'vector<[...], 4>' for 1st argument}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<float, 3>' to 'vector<half, 4>' for 1st argument}}
+ // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<float, 3>' to 'vector<double, 4>' for 1st argument}}
+}
|
… case with arg1 as float and arg2 as float4
… the test from dst.hlsl
//===----------------------------------------------------------------------===// | ||
|
||
/// \fn fvector dst( fvector, fvector) | ||
/// \brief Returns the length of a vector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description sounds off to me. Is this quoted from official documentation anywhere? To me the "length" of a float4
is 4
.
https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dst says Returns the computed distance vector
but any official documentation is probably fine
|
||
/// \fn fvector dst( fvector, fvector) | ||
/// \brief Returns the length of a vector | ||
/// \param Src0 [in] The first vector contain {_, d*d, d*d, _} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar idea here. I'm not sure that contain {_, d*d, d*d, _}
isn't very helpful without context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The src0 provides the squared distance components (d * d) and is assumed to be the vector (ignored, d*d, d*d, ignored)
The Src1 provides the reciprocal distance components(1/d) and is assumed to be the vector (ignored, 1/d, ignored, 1/d)
/// \param Src0 [in] The first vector contain {_, d*d, d*d, _} | ||
/// \param Src1 [in] The second vector contain {_, 1/d, _, 1/d} | ||
/// | ||
/// Return the computed distance vector contain {1, d, d*d, 1/d} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contain {1, d, d*d, 1/d}
here is also confusing to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The combination of squared distance (src0) and reciprocal distance (src1) enables the dst function to efficiently compute a distance vector.
dest = (1, d*d, d*d,1) * (1, 1/d, 1, 1/d)
dest = (1, d, d*d, 1/d)
https://stackoverflow.com/questions/8525803/what-is-the-hlsl-dst-instruction-for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the agorithm correctness is the problem. Its confusing because no one knows what d is. your inputs are Src0 and Src1. You need to put your documentation in those terms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that was the point I was making as well. Think about the comment header from the perspective of a random user who is being shown the comment as an editor popup. It needs to be useful in that context
// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.2-library %s -fnative-half-type -emit-llvm -disable-llvm-passes -o - | FileCheck %s | ||
|
||
|
||
// CHECK-LABEL: linkonce_odr noundef nofpclass(nan inf) <4 x float> @_ZN4hlsl8__detail8dst_implIfEEDv4_T_S3_S3_( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are far too specific and will probably break in the future. Please simplify to the minimum thing that you need to check. In this case that is the following:
- the
define
keyword - the expected type (<4 x float>)
- the demangled function name (dst_impl)
Something like
// CHECK-LABEL: define {{alpha/space/paren regex}} <4 x float> @{{alpha/num/under regex}}dst_impl{{regex}}(
All the CHECK-LABEL
s should be updated
|
||
|
||
// CHECK-LABEL: linkonce_odr noundef nofpclass(nan inf) <4 x float> @_ZN4hlsl8__detail8dst_implIfEEDv4_T_S3_S3_( | ||
// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) #[[ATTR0:[0-9]+]] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same idea here. We don't care about noundef nofpclass(nan inf)
stuff so the check shouldn't require them
// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) #[[ATTR0:[0-9]+]] { | ||
// CHECK: [[VECEXT:%.*]] = extractelement <4 x float> [[PADDR:%.*]], i32 1 | ||
// CHECK: [[VECEXT1:%.*]] = extractelement <4 x float> [[QADDR:%.*]], i32 1 | ||
// CHECK: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[VECEXT]], [[VECEXT1]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not as confident here, but I think it also applies.
We care about fmul
, float
, vecext
, and vecext1
so we should only check those things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have been very inconsistent here. I'm fine if this stays but have seen others wild card it out.
/// Return the computed distance vector contain {1, d, d*d, 1/d} | ||
|
||
_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2) | ||
const inline vector<half, 4> dst(vector<half, 4> Src0, vector<half, 4> Src1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to use the more common typedefs unless we're using a template-dependent vector.
const inline vector<half, 4> dst(vector<half, 4> Src0, vector<half, 4> Src1) { | |
const inline half4 dst(half4 Src0, half4 Src1) { |
const inline vector<float, 4> dst(vector<float, 4> Src0, | ||
vector<float, 4> Src1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const inline vector<float, 4> dst(vector<float, 4> Src0, | |
vector<float, 4> Src1) { | |
const inline float4 dst(float4 Src0, float4 Src1) { |
const inline vector<double, 4> dst(vector<double, 4> Src0, | ||
vector<double, 4> Src1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const inline vector<double, 4> dst(vector<double, 4> Src0, | |
vector<double, 4> Src1) { | |
const inline double4 dst(double4 Src0, double4 Src1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion. I have updated the code to use half4, float4, and double4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no code changes to Sema in this PR, so this test is only testing existing functionality that is well tested within clang. We should remove it so as to not increase testing time without adding meaningful test coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the dst-error.hlsl file as suggested.
…n, eliminating the redundant variable, and updating the dst.hlsl file accordingly
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/15721 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/17984 Here is the relevant piece of the build log for the reference
|
fixes: llvm#99108 Implement dst algorithm in the hlsl_intrinsics.h and added test cases for HLSL codegen and sema - [x] implement dst algorithm in the hlsl_intrinsics.h - [x] Add HLSL codegen tests to clang/test/CodeGenHLSL/builtins/dst.hlsl - [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dst-errors.hlsl
fixes: llvm#99108 Implement dst algorithm in the hlsl_intrinsics.h and added test cases for HLSL codegen and sema - [x] implement dst algorithm in the hlsl_intrinsics.h - [x] Add HLSL codegen tests to clang/test/CodeGenHLSL/builtins/dst.hlsl - [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dst-errors.hlsl
fixes: #99108
Implement dst algorithm in the hlsl_intrinsics.h and added test cases for HLSL codegen and sema
implement dst algorithm in the hlsl_intrinsics.h
Add HLSL codegen tests to clang/test/CodeGenHLSL/builtins/dst.hlsl
Add sema tests to clang/test/SemaHLSL/BuiltIns/dst-errors.hlsl