[HLSL] Implement the dst HLSL Function #133828

Merged 18 commits on Apr 9, 2025


metkarpoonam
Contributor

@metkarpoonam metkarpoonam commented Apr 1, 2025

Fixes: #99108
Implements the dst algorithm in hlsl_intrinsics.h and adds test cases for HLSL codegen and Sema.

  • Implement the dst algorithm in hlsl_intrinsics.h

  • Add HLSL codegen tests to clang/test/CodeGenHLSL/builtins/dst.hlsl

  • Add Sema tests to clang/test/SemaHLSL/BuiltIns/dst-error.hlsl

@llvmbot llvmbot added the clang, backend:X86, clang:headers, and HLSL labels Apr 1, 2025
@llvmbot
Member

llvmbot commented Apr 1, 2025

@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-backend-x86

Author: None (metkarpoonam)

Changes

Implements the dst algorithm in hlsl_intrinsics.h and adds test cases for HLSL codegen and Sema
fixes: #99108


Full diff: https://github.com/llvm/llvm-project/pull/133828.diff

4 Files Affected:

  • (modified) clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h (+6)
  • (modified) clang/lib/Headers/hlsl/hlsl_intrinsics.h (+25)
  • (added) clang/test/CodeGenHLSL/builtins/dst.hlsl (+48)
  • (added) clang/test/SemaHLSL/BuiltIns/dst-error.hlsl (+37)
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
index 8cdd63d7e07bb..5ea8faf169380 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
@@ -35,6 +35,12 @@ length_vec_impl(vector<T, N> X) {
 #endif
 }
 
+template <typename T>
+constexpr vector<T, 4> dst_impl(vector<T, 4> src0, vector<T, 4> src1) {
+  vector<T, 4> dest = {1, src0[1] * src1[1], src0[2], src1[3]};
+  return dest;
+}
+
 template <typename T> constexpr T distance_impl(T X, T Y) {
   return length_impl(X - Y);
 }
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index fd799b8d874ae..7ae94731234f9 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -174,6 +174,31 @@ const inline float distance(__detail::HLSL_FIXED_VECTOR<float, N> X,
                             __detail::HLSL_FIXED_VECTOR<float, N> Y) {
   return __detail::distance_vec_impl(X, Y);
 }
+//===----------------------------------------------------------------------===//
+// dst builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn fvector dst( fvector, fvector)
+/// \brief Returns the length of a vector
+/// \param src0 [in] The first vector contain {_, d*d, d*d, _}
+/// \param src1 [in] The second vector contain {_, 1/d, _, 1/d}
+///
+/// Return the computed distance vector contain {1, d, d*d, 1/d}
+
+_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
+const inline vector<half, 4> dst(vector<half, 4> src0, vector<half, 4> src1) {
+  return __detail::dst_impl(src0, src1);
+}
+
+const inline vector<float, 4> dst(vector<float, 4> src0,
+                                  vector<float, 4> src1) {
+  return __detail::dst_impl(src0, src1);
+}
+
+const inline vector<double, 4> dst(vector<double, 4> src0,
+                                   vector<double, 4> src1) {
+  return __detail::dst_impl(src0, src1);
+}
 
 //===----------------------------------------------------------------------===//
 // fmod builtins
diff --git a/clang/test/CodeGenHLSL/builtins/dst.hlsl b/clang/test/CodeGenHLSL/builtins/dst.hlsl
new file mode 100644
index 0000000000000..c62c9be5b0c1d
--- /dev/null
+++ b/clang/test/CodeGenHLSL/builtins/dst.hlsl
@@ -0,0 +1,48 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.2-library %s -fnative-half-type -emit-llvm -O1 -o - | FileCheck %s
+
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x float> @_Z12dstWithFloatDv4_fS_(
+// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { 
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x float> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x float> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x float> <float 1.000000e+00, float poison, float poison, float poison>, float [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x float> [[VECINIT]], <4 x float> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x float> [[VECINIT3]], <4 x float> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x float> [[VECINIT5]]
+
+float4 dstWithFloat(float4 p1, float4 p2)
+{
+    return dst(p1, p2);
+}
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x half> @_Z11dstwithHalfDv4_DhS_(
+// CHECK-SAME: <4 x half> noundef nofpclass(nan inf) [[P:%.*]], <4 x half> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x half> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x half> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn half [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x half> <half 0xH3C00, half poison, half poison, half poison>, half [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x half> [[VECINIT]], <4 x half> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x half> [[VECINIT3]], <4 x half> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x half> [[VECINIT5]]
+half4 dstwithHalf(half4 p1, half4 p2)
+{
+    return dst(p1, p2);
+}
+
+// CHECK-LABEL: define noundef nofpclass(nan inf) <4 x double> @_Z13dstWithDoubleDv4_dS_(
+// CHECK-SAME: <4 x double> noundef nofpclass(nan inf) [[P:%.*]], <4 x double> noundef nofpclass(nan inf) [[Q:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { 
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <4 x double> [[P]], i64 1
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <4 x double> [[Q]], i64 1
+// CHECK-NEXT: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn double [[VECEXT1]], [[VECEXT]]
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x double> <double 1.000000e+00, double poison, double poison, double poison>, double [[MULRES]], i64 1
+// CHECK-NEXT: [[VECINIT3:%.*]] = shufflevector <4 x double> [[VECINIT]], <4 x double> [[P]], <4 x i32> <i32 0, i32 1, i32 6, i32 poison>
+// CHECK-NEXT: [[VECINIT5:%.*]] = shufflevector <4 x double> [[VECINIT3]], <4 x double> [[Q]], <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+// CHECK-NEXT: ret <4 x double> [[VECINIT5]]
+double4 dstWithDouble(double4 p1, double4 p2)
+{
+    return dst(p1, p2);
+}
diff --git a/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl b/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl
new file mode 100644
index 0000000000000..6bff46ffc223b
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/dst-error.hlsl
@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify
+
+float4 test_too_many_arg(float4 p0)
+{
+    dst(p0, p0, p0);
+  // expected-error@-1 {{no matching function for call to 'dst'}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+}
+
+float4 test_no_second_arg(float4 p0)
+{
+    return dst(p0);
+  // expected-error@-1 {{no matching function for call to 'dst'}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+}
+
+float4 test_no_args()
+{
+    return dst();
+  // expected-error@-1 {{no matching function for call to 'dst'}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 0 were provided}}
+}
+
+float4 test_3_components(float3 p0, float3 p1)
+{
+    return dst(p0, p1);
+  // expected-error@-1 {{no matching function for call to 'dst'}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<[...], 3>' to 'vector<[...], 4>' for 1st argument}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<float, 3>' to 'vector<half, 4>' for 1st argument}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: no known conversion from 'vector<float, 3>' to 'vector<double, 4>' for 1st argument}}
+}

@llvmbot
Member

llvmbot commented Apr 1, 2025

@llvm/pr-subscribers-clang


@metkarpoonam metkarpoonam requested a review from farzonl April 4, 2025 20:08
@metkarpoonam metkarpoonam changed the title Hlsl dst function [Hlsl] Implement the dst HLSL Function Apr 4, 2025
@metkarpoonam metkarpoonam changed the title [Hlsl] Implement the dst HLSL Function [HLSL] Implement the dst HLSL Function Apr 4, 2025
//===----------------------------------------------------------------------===//

/// \fn fvector dst( fvector, fvector)
/// \brief Returns the length of a vector
Contributor

This description sounds off to me. Is this quoted from official documentation anywhere? To me the "length" of a float4 is 4.

https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dst says "Returns the computed distance vector", but any official documentation is probably fine.


/// \fn fvector dst( fvector, fvector)
/// \brief Returns the length of a vector
/// \param Src0 [in] The first vector contain {_, d*d, d*d, _}
Contributor

Similar idea here. I'm not sure that "contain {_, d*d, d*d, _}" is very helpful without context.

Contributor Author

@metkarpoonam metkarpoonam Apr 7, 2025

Src0 provides the squared distance components (d*d) and is assumed to be the vector (ignored, d*d, d*d, ignored).

Src1 provides the reciprocal distance components (1/d) and is assumed to be the vector (ignored, 1/d, ignored, 1/d).

https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src//direct3dhlsl/dst---vs.md

/// \param Src0 [in] The first vector contain {_, d*d, d*d, _}
/// \param Src1 [in] The second vector contain {_, 1/d, _, 1/d}
///
/// Return the computed distance vector contain {1, d, d*d, 1/d}
Contributor

"contain {1, d, d*d, 1/d}" here is also confusing to me.

Contributor Author

@metkarpoonam metkarpoonam Apr 7, 2025

The combination of squared distance (Src0) and reciprocal distance (Src1) lets the dst function compute a distance vector with a single component-wise multiply:
dest = (1, d*d, d*d, 1) * (1, 1/d, 1, 1/d)

dest = (1, d, d*d, 1/d)

https://stackoverflow.com/questions/8525803/what-is-the-hlsl-dst-instruction-for
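
To make the identity above concrete, here is a minimal C++ sketch that applies the same element-wise rule as `dst_impl` to inputs built from a sample distance `d`. It is only an illustration: `Vec4`, `dst_ref`, `dst_identity_holds`, and the tolerance are hypothetical names and choices, and `std::array` stands in for an HLSL `float4`.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an HLSL float4 (not part of the patch).
using Vec4 = std::array<float, 4>;

// Same element-wise rule as dst_impl: {1, src0.y * src1.y, src0.z, src1.w}.
inline Vec4 dst_ref(const Vec4 &src0, const Vec4 &src1) {
  return Vec4{{1.0f, src0[1] * src1[1], src0[2], src1[3]}};
}

// Builds src0 = (_, d*d, d*d, _) and src1 = (_, 1/d, _, 1/d) for a given d
// and checks that the result is (1, d, d*d, 1/d) up to float rounding.
inline bool dst_identity_holds(float d) {
  const Vec4 src0 = {{0.0f, d * d, d * d, 0.0f}};       // x and w are ignored
  const Vec4 src1 = {{0.0f, 1.0f / d, 0.0f, 1.0f / d}}; // x and z are ignored
  const Vec4 r = dst_ref(src0, src1);
  const float tol = 1e-5f * std::fmax(1.0f, d * d);     // assumed tolerance
  return std::fabs(r[0] - 1.0f) <= tol && std::fabs(r[1] - d) <= tol &&
         std::fabs(r[2] - d * d) <= tol && std::fabs(r[3] - 1.0f / d) <= tol;
}

// Example usage: the identity holds for a couple of sample distances.
inline void dst_identity_example() {
  assert(dst_identity_holds(2.0f));
  assert(dst_identity_holds(3.0f));
}
```

Note that the x and w components of the first input and the x and z components of the second never reach the result, which is why the thread writes them as `_`.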

Member

I don't think the algorithm's correctness is the problem. It's confusing because no one knows what d is. Your inputs are Src0 and Src1. You need to put your documentation in those terms.

Contributor

Yep, that was the point I was making as well. Think about the comment header from the perspective of a random user who is being shown the comment as an editor popup. It needs to be useful in that context
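
One possible shape for a comment header written purely in terms of the inputs is sketched below in C++. This is an assumption about wording, not the text that was merged: the function name `dst_documented` is hypothetical and `std::array` stands in for the HLSL vector type. The per-component description is taken directly from `dst_impl` in the diff.

```cpp
#include <array>
#include <cassert>

/// \brief Builds a distance vector from two four-component inputs.
/// \param Src0 [in] Components 1 and 2 are read (typically the squared
///        distance); components 0 and 3 are ignored.
/// \param Src1 [in] Components 1 and 3 are read (typically the reciprocal
///        distance); components 0 and 2 are ignored.
/// \return {1, Src0[1] * Src1[1], Src0[2], Src1[3]}
template <typename T>
std::array<T, 4> dst_documented(const std::array<T, 4> &Src0,
                                const std::array<T, 4> &Src1) {
  return std::array<T, 4>{{T(1), Src0[1] * Src1[1], Src0[2], Src1[3]}};
}

// Example usage with exactly representable values.
inline void dst_documented_example() {
  const std::array<double, 4> a = {{0.0, 16.0, 16.0, 0.0}};
  const std::array<double, 4> b = {{0.0, 0.25, 0.0, 0.25}};
  const std::array<double, 4> r = dst_documented(a, b);
  assert(r[0] == 1.0 && r[1] == 4.0 && r[2] == 16.0 && r[3] == 0.25);
}
```

A reader who sees only the popup can tell which components matter and what comes back without needing to know what d means.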

// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.2-library %s -fnative-half-type -emit-llvm -disable-llvm-passes -o - | FileCheck %s


// CHECK-LABEL: linkonce_odr noundef nofpclass(nan inf) <4 x float> @_ZN4hlsl8__detail8dst_implIfEEDv4_T_S3_S3_(
Contributor

These are far too specific and will probably break in the future. Please simplify to the minimum thing that you need to check. In this case that is the following:

  • the define keyword
  • the expected type (<4 x float>)
  • the demangled function name (dst_impl)

Something like

// CHECK-LABEL: define {{alpha/space/paren regex}} <4 x float> @{{alpha/num/under regex}}dst_impl{{regex}}(

All the CHECK-LABELs should be updated
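
To illustrate why the relaxed form is less brittle, the following C++ sketch uses `std::regex` as a stand-in for FileCheck's `{{...}}` regex blocks. That substitution is an assumption: FileCheck has its own matcher and regex dialect, and the concrete character classes below are one hypothetical reading of the placeholders above. The pattern requires only the `define` keyword, the `<4 x float>` return type, and `dst_impl` inside the symbol name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical relaxed pattern: attributes between `define` and the return
// type are allowed but not required, and the mangling around dst_impl is
// not spelled out.
inline bool matches_relaxed_label(const std::string &ir_line) {
  static const std::regex pattern(
      R"re(define [A-Za-z_() ]*<4 x float> @[A-Za-z0-9_]*dst_impl[A-Za-z0-9_]*\()re");
  return std::regex_search(ir_line, pattern);
}

// Example usage: the same pattern accepts the line with and without
// linkage and parameter attributes, and rejects a different element type.
inline void relaxed_label_example() {
  assert(matches_relaxed_label(
      "define linkonce_odr noundef nofpclass(nan inf) <4 x float> "
      "@_ZN4hlsl8__detail8dst_implIfEEDv4_T_S3_S3_("));
  assert(matches_relaxed_label(
      "define <4 x float> @_ZN4hlsl8__detail8dst_implIfEEDv4_T_S3_S3_("));
  assert(!matches_relaxed_label(
      "define <4 x half> @_ZN4hlsl8__detail8dst_implIDhEEDv4_T_S3_S3_("));
}
```

If attributes such as `noundef` or `nofpclass(nan inf)` are later added, dropped, or reordered, a check of this shape keeps passing, whereas the fully spelled-out label would need updating.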



// CHECK-LABEL: linkonce_odr noundef nofpclass(nan inf) <4 x float> @_ZN4hlsl8__detail8dst_implIfEEDv4_T_S3_S3_(
// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) #[[ATTR0:[0-9]+]] {
Contributor

Same idea here. We don't care about noundef nofpclass(nan inf) stuff so the check shouldn't require them

// CHECK-SAME: <4 x float> noundef nofpclass(nan inf) [[P:%.*]], <4 x float> noundef nofpclass(nan inf) [[Q:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK: [[VECEXT:%.*]] = extractelement <4 x float> [[PADDR:%.*]], i32 1
// CHECK: [[VECEXT1:%.*]] = extractelement <4 x float> [[QADDR:%.*]], i32 1
// CHECK: [[MULRES:%.*]] = fmul reassoc nnan ninf nsz arcp afn float [[VECEXT]], [[VECEXT1]]
Contributor

Not as confident here, but I think it also applies.

We care about fmul, float, vecext, and vecext1 so we should only check those things

Member

We have been very inconsistent here. I'm fine if this stays, but I have seen others wildcard it out.

@metkarpoonam metkarpoonam requested a review from V-FEXrt April 7, 2025 22:25
/// Return the computed distance vector contain {1, d, d*d, 1/d}

_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
const inline vector<half, 4> dst(vector<half, 4> Src0, vector<half, 4> Src1) {
Collaborator

We tend to use the more common typedefs unless we're using a template-dependent vector.

Suggested change
const inline vector<half, 4> dst(vector<half, 4> Src0, vector<half, 4> Src1) {
const inline half4 dst(half4 Src0, half4 Src1) {

Comment on lines 196 to 197
const inline vector<float, 4> dst(vector<float, 4> Src0,
vector<float, 4> Src1) {
Collaborator

Suggested change
const inline vector<float, 4> dst(vector<float, 4> Src0,
vector<float, 4> Src1) {
const inline float4 dst(float4 Src0, float4 Src1) {

Comment on lines 201 to 202
const inline vector<double, 4> dst(vector<double, 4> Src0,
vector<double, 4> Src1) {
Collaborator

Suggested change
const inline vector<double, 4> dst(vector<double, 4> Src0,
vector<double, 4> Src1) {
const inline double4 dst(double4 Src0, double4 Src1) {

Contributor Author

Thank you for the suggestion. I have updated the code to use half4, float4, and double4.

Collaborator

There are no code changes to Sema in this PR, so this test is only testing existing functionality that is well tested within clang. We should remove it so as to not increase testing time without adding meaningful test coverage.

Contributor Author

@metkarpoonam metkarpoonam Apr 8, 2025

I have removed the dst-error.hlsl file as suggested.

…n, eliminating the redundant variable, and updating the dst.hlsl file accordingly
@V-FEXrt V-FEXrt merged commit 3ef2bb6 into llvm:main Apr 9, 2025
11 checks passed
@llvm-ci
Collaborator

llvm-ci commented Apr 9, 2025

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building clang at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/15721

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-unit :: ValueObject/./LLDBValueObjectTests/8/11 (2112 of 2121)
PASS: lldb-unit :: ValueObject/./LLDBValueObjectTests/9/11 (2113 of 2121)
PASS: lldb-unit :: tools/lldb-server/tests/./LLDBServerTests/0/2 (2114 of 2121)
PASS: lldb-unit :: tools/lldb-server/tests/./LLDBServerTests/1/2 (2115 of 2121)
PASS: lldb-unit :: Utility/./UtilityTests/4/9 (2116 of 2121)
PASS: lldb-unit :: Target/./TargetTests/11/14 (2117 of 2121)
PASS: lldb-unit :: Host/./HostTests/9/12 (2118 of 2121)
PASS: lldb-unit :: Host/./HostTests/3/12 (2119 of 2121)
PASS: lldb-unit :: Process/gdb-remote/./ProcessGdbRemoteTests/8/9 (2120 of 2121)
UNRESOLVED: lldb-api :: tools/lldb-server/TestLldbGdbServer.py (2121 of 2121)
******************** TEST 'lldb-api :: tools/lldb-server/TestLldbGdbServer.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/tools/lldb-server -p TestLldbGdbServer.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 3ef2bb6d0bcfcdcfde72d8aa3354e72c2366e22b)
  clang revision 3ef2bb6d0bcfcdcfde72d8aa3354e72c2366e22b
  llvm revision 3ef2bb6d0bcfcdcfde72d8aa3354e72c2366e22b
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hc_then_Csignal_signals_correct_thread_launch_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hc_then_Csignal_signals_correct_thread_launch_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hg_fails_on_another_pid_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hg_fails_on_minus_one_pid_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hg_fails_on_zero_pid_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hg_switches_to_3_threads_launch_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_Hg_switches_to_3_threads_launch_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_P_and_p_thread_suffix_work_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_P_and_p_thread_suffix_work_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_P_writes_all_gpr_registers_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_P_writes_all_gpr_registers_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_attach_commandline_continue_app_exits_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
lldb-server exiting...
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_attach_commandline_continue_app_exits_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_c_packet_works_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
lldb-server exiting...
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_c_packet_works_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_first_launch_stop_reply_thread_matches_first_qC_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_first_launch_stop_reply_thread_matches_first_qC_llgs (TestLldbGdbServer.LldbGdbServerTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_hardware_breakpoint_set_and_remove_work_debugserver (TestLldbGdbServer.LldbGdbServerTestCase) (test case does not fall in any category of interest for this run) 
lldb-server exiting...
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_hardware_breakpoint_set_and_remove_work_llgs (TestLldbGdbServer.LldbGdbServerTestCase)

@llvm-ci
Collaborator

llvm-ci commented Apr 9, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-5 while building clang at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/17984

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli -jit-kind=orc-lazy -compile-threads=2 -thread-entry hello /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll # RUN: at line 1
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli -jit-kind=orc-lazy -compile-threads=2 -thread-entry hello /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
 #0 0x0000000103d75bf8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100ed5bf8)
 #1 0x0000000103d73c7c llvm::sys::RunSignalHandlers() (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100ed3c7c)
 #2 0x0000000103d762b4 SignalHandler(int, __siginfo*, void*) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100ed62b4)
 #3 0x00000001853ef584 (/usr/lib/system/libsystem_platform.dylib+0x18047b584)
 #4 0x00000001853be21c (/usr/lib/system/libsystem_pthread.dylib+0x18044a21c)
 #5 0x00000001852e4ad0 (/usr/lib/libc++.1.dylib+0x180370ad0)
 #6 0x0000000103917cd0 void llvm::detail::UniqueFunctionBase<void, llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef>>>>::CallImpl<llvm::orc::Platform::lookupInitSymbols(llvm::orc::ExecutionSession&, llvm::DenseMap<llvm::orc::JITDylib*, llvm::orc::SymbolLookupSet, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::orc::SymbolLookupSet>> const&)::$_45>(void*, llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef>>>&) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100a77cd0)
 #7 0x0000000103913904 llvm::orc::AsynchronousSymbolQuery::handleComplete(llvm::orc::ExecutionSession&)::RunQueryCompleteTask::run() (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100a73904)
 #8 0x00000001039cfbec void* std::__1::__thread_proxy[abi:un170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, llvm::orc::DynamicThreadPoolTaskDispatcher::dispatch(std::__1::unique_ptr<llvm::orc::Task, std::__1::default_delete<llvm::orc::Task>>)::$_0>>(void*) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100b2fbec)
 #9 0x00000001853bef94 (/usr/lib/system/libsystem_pthread.dylib+0x18044af94)
#10 0x00000001853b9d34 (/usr/lib/system/libsystem_pthread.dylib+0x180445d34)
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll

--

********************


AllinLeeYL pushed a commit to AllinLeeYL/llvm-project that referenced this pull request Apr 10, 2025
fixes: llvm#99108
Implement dst algorithm in the hlsl_intrinsics.h and added test cases
for HLSL codegen and sema


- [x] implement dst algorithm in the hlsl_intrinsics.h

- [x] Add HLSL codegen tests to clang/test/CodeGenHLSL/builtins/dst.hlsl

- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dst-errors.hlsl
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
@damyanp damyanp moved this to Closed in HLSL Support Apr 25, 2025