Implement the WaveActiveBallot HLSL Function #99163

Open
12 tasks
Tracked by #99235
farzonl opened this issue Jul 16, 2024 · 0 comments
Labels: backend:DirectX, backend:SPIR-V, bot:HLSL, HLSL (HLSL Language Support), metabug (Issue to collect references to a group of similar or related issues)

farzonl commented Jul 16, 2024

  • Implement WaveActiveBallot clang builtin
  • Link WaveActiveBallot clang builtin with hlsl_intrinsics.h
  • Add sema checks for WaveActiveBallot to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
  • Add codegen for WaveActiveBallot to EmitHLSLBuiltinExpr in CGBuiltin.cpp
  • Add codegen tests to clang/test/CodeGenHLSL/builtins/WaveActiveBallot.hlsl
  • Add sema tests to clang/test/SemaHLSL/BuiltIns/WaveActiveBallot-errors.hlsl
  • Create the int_dx_WaveActiveBallot intrinsic in IntrinsicsDirectX.td
  • Create the DXILOpMapping of int_dx_WaveActiveBallot to 116 in DXIL.td
  • Create the WaveActiveBallot.ll and WaveActiveBallot_errors.ll tests in llvm/test/CodeGen/DirectX/
  • Create the int_spv_WaveActiveBallot intrinsic in IntrinsicsSPIRV.td
  • In SPIRVInstructionSelector.cpp create the WaveActiveBallot lowering and map it to int_spv_WaveActiveBallot in SPIRVInstructionSelector::selectIntrinsic.
  • Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveBallot.ll

DirectX

DXIL Opcode | DXIL OpName | Shader Model | Shader Stages
116 | WaveActiveBallot | 6.0 | library, compute, amplification, mesh, pixel, vertex, hull, domain, geometry, raygeneration, intersection, anyhit, closesthit, miss, callable, node

SPIR-V

OpGroupNonUniformBallot:

Description:

Result is a bitfield value combining the Predicate value from all
invocations in the group that execute the same dynamic
instance of this instruction. The bit is set to one if the corresponding
invocation is active and the Predicate for that invocation evaluated
to true; otherwise, it is set to zero.

Result Type must be a vector of four components of integer type
scalar, whose Width operand is 32 and whose Signedness operand is 0.

Result is a set of bitfields where the first invocation is represented
in the lowest bit of the first vector component and the last (up to the
size of the group) is the higher bit number of the last bitmask needed
to represent all bits of the group invocations.

Execution is a Scope that identifies the group of
invocations affected by this command.

Predicate must be a Boolean type.
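
As a rough illustration of the result layout described above, the following C++ sketch maps an invocation index to a vector component and bit. The names `Ballot`, `BitPosition`, `ballotBitPosition`, `setBallotBit`, and `testBallotBit` are hypothetical helpers for this illustration, not part of SPIR-V or LLVM, and the sketch assumes invocation i occupies bit i counting up from the lowest bit of the first component.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration; not part of any SPIR-V API.
using Ballot = std::array<std::uint32_t, 4>;

struct BitPosition {
    unsigned component; // which of the four 32-bit vector components
    unsigned bit;       // bit index within that component (0 = lowest)
};

// Map an invocation index to its position in the 4 x 32-bit result.
// Assumption: invocation i is stored at bit i, lowest bit first.
inline BitPosition ballotBitPosition(unsigned invocation) {
    assert(invocation < 128 && "a 4 x 32-bit ballot holds at most 128 bits");
    return {invocation / 32, invocation % 32};
}

// Set the bit that represents the given invocation.
inline void setBallotBit(Ballot &ballot, unsigned invocation) {
    BitPosition pos = ballotBitPosition(invocation);
    ballot[pos.component] |= (std::uint32_t{1} << pos.bit);
}

// Query the bit that represents the given invocation.
inline bool testBallotBit(const Ballot &ballot, unsigned invocation) {
    BitPosition pos = ballotBitPosition(invocation);
    return ((ballot[pos.component] >> pos.bit) & 1u) != 0;
}
```

Under this assumption, invocation 0 lands in the lowest bit of the first component and invocation 33 lands in bit 1 of the second component.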

Capability:
GroupNonUniformBallot

Missing before version 1.3.

Word Count | Opcode | Results | Operands
5 | 339 | <id> Result Type, Result <id> | Scope <id> Execution, <id> Predicate
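
To make the encoding above concrete, here is a minimal C++ sketch of how the five words of this instruction could be assembled. It relies on the general SPIR-V rule that the first word of an instruction holds the word count in its high 16 bits and the opcode in its low 16 bits. The function `encodeBallot` and all four id parameters are hypothetical; a real module assigns its own ids.

```cpp
#include <array>
#include <cstdint>

constexpr std::uint32_t kOpGroupNonUniformBallot = 339; // opcode from the table above
constexpr std::uint32_t kWordCount = 5;                 // word count from the table above

// Hypothetical encoder: the id arguments are placeholders for ids that a
// real SPIR-V module would have assigned elsewhere.
inline std::array<std::uint32_t, 5> encodeBallot(std::uint32_t resultTypeId,
                                                 std::uint32_t resultId,
                                                 std::uint32_t executionScopeId,
                                                 std::uint32_t predicateId) {
    return {(kWordCount << 16) | kOpGroupNonUniformBallot, // word count | opcode
            resultTypeId,      // <id> Result Type
            resultId,          // Result <id>
            executionScopeId,  // Scope <id> Execution
            predicateId};      // <id> Predicate
}
```

Note that Execution is passed as the id of a constant holding the scope value, not as the scope value itself.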

Test Case(s)

Example 1

//dxc WaveActiveBallot_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint4 fn(bool p1) {
    return WaveActiveBallot(p1);
}

HLSL:

Returns a uint4 containing a bitmask of the evaluation of the Boolean expression for all active lanes in the current wave.

Syntax

uint4 WaveActiveBallot(
   bool expr
);

Parameters

expr

The boolean expression to evaluate.

Return value

A uint4 containing a bitmask of the evaluation of the Boolean expression for all active lanes in the current wave. The least-significant bit corresponds to the lane with index zero. The bits corresponding to inactive lanes will be zero. The bits that are greater than or equal to WaveGetLaneCount will be zero.
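
The return-value rules above can be modeled on the CPU with a short C++ sketch. This is only an emulation for reasoning about expected results, not the clang or DXC implementation. The `Lane` struct and `emulateWaveActiveBallot` are hypothetical names, and the wave is assumed to hold at most 128 lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-lane state for the emulation.
struct Lane {
    bool active; // whether the lane is executing this call
    bool expr;   // the Boolean expression as evaluated by this lane
};

// Emulate the described semantics: bit i is set only when lane i is active
// and its expression is true. The vector size plays the role of
// WaveGetLaneCount, so bits at or above it are never set.
inline std::array<std::uint32_t, 4>
emulateWaveActiveBallot(const std::vector<Lane> &wave) {
    assert(wave.size() <= 128 && "a uint4 holds at most 128 lane bits");
    std::array<std::uint32_t, 4> mask{}; // all bits start at zero
    for (std::size_t i = 0; i < wave.size(); ++i) {
        if (wave[i].active && wave[i].expr)
            mask[i / 32] |= std::uint32_t{1} << (i % 32);
    }
    return mask;
}
```

For a four-lane wave where lane 1 is inactive and lane 2 evaluates the expression to false, only bits 0 and 3 of the first component are set.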

Remarks

Different GPUs have different SIMD processor widths (lane counts). Most of these WaveXXX functions operate at a level of abstraction where the SIMD machine width is concealed. To maximize portability of code across GPUs, use the intrinsics that don’t rely on machine width. For example, use:

uint result = WaveActiveCountBits( bBit );

Instead of:

uint4 bits = countbits( WaveActiveBallot( bBit ) ); // countbits returns one count per component
uint result = bits.x + bits.y + bits.z + bits.w;
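
One reason the ballot form is less portable is that the mask spans four 32-bit components, so a correct count must cover all of them regardless of how wide the wave is. The C++ sketch below illustrates that on the CPU. The names `countBits32` and `countBallotBits` are hypothetical helpers, and the code is not meant to describe how any compiler lowers `WaveActiveCountBits`.

```cpp
#include <array>
#include <cstdint>

// Count the set bits in one 32-bit word by clearing the lowest set bit
// until the word is zero.
inline unsigned countBits32(std::uint32_t v) {
    unsigned n = 0;
    while (v != 0) {
        v &= v - 1; // clears the lowest set bit
        ++n;
    }
    return n;
}

// Sum the set bits across all four components of a ballot mask.
inline unsigned countBallotBits(const std::array<std::uint32_t, 4> &ballot) {
    unsigned total = 0;
    for (std::uint32_t component : ballot)
        total += countBits32(component);
    return total;
}
```

Counting only the first component would under-report on any wave wider than 32 lanes, which is the kind of width dependence the remark above warns about.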

This function is supported from shader model 6.0 in all shader stages.

Examples

// Get a bitmask with one bit set for each currently active lane:
uint4 waveBits = WaveActiveBallot( true );

See also

Overview of Shader Model 6

Shader Model 6

farzonl added the backend:DirectX, backend:SPIR-V, bot:HLSL, HLSL, and metabug labels on Jul 16, 2024
damyanp moved this to Ready in HLSL Support on Oct 30, 2024
damyanp moved this from Ready to Planning in HLSL Support on Oct 30, 2024
damyanp moved this from Planning to Active in HLSL Support on Nov 26, 2024
damyanp moved this from Active to Ready in HLSL Support on Apr 25, 2025
Projects
Status: Ready