Implement the WavePrefixCountBits HLSL Function #99171

Open
12 tasks
Tracked by #99235
farzonl opened this issue Jul 16, 2024 · 0 comments
Labels
backend:DirectX backend:SPIR-V bot:HLSL HLSL HLSL Language Support metabug Issue to collect references to a group of similar or related issues.

farzonl commented Jul 16, 2024

  • Implement WavePrefixCountBits clang builtin
  • Link WavePrefixCountBits clang builtin with hlsl_intrinsics.h
  • Add sema checks for WavePrefixCountBits to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
  • Add codegen for WavePrefixCountBits to EmitHLSLBuiltinExpr in CGBuiltin.cpp
  • Add codegen tests to clang/test/CodeGenHLSL/builtins/WavePrefixCountBits.hlsl
  • Add sema tests to clang/test/SemaHLSL/BuiltIns/WavePrefixCountBits-errors.hlsl
  • Create the int_dx_WavePrefixCountBits intrinsic in IntrinsicsDirectX.td
  • Create the DXILOpMapping of int_dx_WavePrefixCountBits to 136 in DXIL.td
  • Create the WavePrefixCountBits.ll and WavePrefixCountBits_errors.ll tests in llvm/test/CodeGen/DirectX/
  • Create the int_spv_WavePrefixCountBits intrinsic in IntrinsicsSPIRV.td
  • In SPIRVInstructionSelector.cpp create the WavePrefixCountBits lowering and map it to int_spv_WavePrefixCountBits in SPIRVInstructionSelector::selectIntrinsic.
  • Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WavePrefixCountBits.ll

DirectX

DXIL Opcode: 136
DXIL OpName: WavePrefixBitCount
Shader Model: 6.0
Shader Stages: library, compute, amplification, mesh, pixel, vertex, hull, domain, geometry, raygeneration, intersection, anyhit, closesthit, miss, callable, node

SPIR-V

OpGroupNonUniformBallotBitCount:

Description:

Result is the number of bits that are set to 1 in Value, considering
only the bits in Value required to represent all bits of the
group's invocations.

Result Type must be a scalar of integer type, whose
Signedness operand is 0.

Execution is a Scope that identifies the group of
invocations affected by this command. It must be Subgroup.

The identity I for Operation is 0.

Value must be a vector of four components of integer type scalar,
whose Width operand is 32 and whose Signedness operand is 0.

Value is a set of bitfields where the first invocation is represented
in the lowest bit of the first vector component and the last (up to the
size of the group) is the higher bit number of the last bitmask needed
to represent all bits of the group invocations.

Capability:
GroupNonUniformBallot

Missing before version 1.3.

Word Count: 6
Opcode: 342
Results: <id> Result Type, Result <id>
Operands: Scope <id> Execution, Group Operation Operation, <id> Value
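
To make the ballot layout described above concrete, here is a minimal CPU-side sketch in C++ of counting the bits that belong to invocations below the current one. It assumes the Group Operation operand is ExclusiveScan (the issue does not say which operation the lowering should use) and a group of at most 128 invocations. The names `popcount32` and `ballotExclusiveBitCount` are hypothetical and are not part of LLVM or SPIR-V.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: number of set bits in a 32-bit word.
static uint32_t popcount32(uint32_t v) {
    uint32_t n = 0;
    while (v) {
        v &= v - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical model of a ballot bit count with an exclusive scan:
// counts the set bits of a 4 x 32-bit ballot that belong to invocations
// with an index strictly below invocationId (assumed to be < 128).
// Invocation 0 is the lowest bit of component 0.
uint32_t ballotExclusiveBitCount(const std::array<uint32_t, 4>& ballot,
                                 uint32_t invocationId) {
    uint32_t count = 0;
    for (uint32_t word = 0; word < 4; ++word) {
        const uint32_t base = word * 32;
        if (invocationId >= base + 32) {
            // Every bit of this component is below the current invocation.
            count += popcount32(ballot[word]);
        } else if (invocationId > base) {
            // Only the low (invocationId - base) bits are below it.
            const uint32_t mask = (1u << (invocationId - base)) - 1u;
            count += popcount32(ballot[word] & mask);
        }
    }
    return count;
}
```

In this model, the ballot itself would be built from the boolean argument of each invocation, which is a separate step not shown here.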

Test Case(s)

Example 1

//dxc WavePrefixCountBits_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint fn(bool p1) {
    return WavePrefixCountBits(p1);
}

HLSL:

Returns the sum of all the specified boolean variables set to true across all active lanes with indices smaller than the current lane.

Syntax

uint WavePrefixCountBits(
   bool bBit
);

Parameters

bBit

The boolean value that the current lane contributes to the count.

Return value

The sum of all the specified boolean variables set to true across all active lanes with indices smaller than the current lane.
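
As a way to check expected values when writing the codegen tests, the return value described above can be modelled on the CPU. The following C++ sketch is only a reference model under stated assumptions: a wave is represented as two equally sized vectors, and inactive lanes get 0 in the result, which is an arbitrary choice for this sketch. The function name `wavePrefixCountBitsRef` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of one wave: lane i is described by active[i]
// and bit[i] (both vectors are assumed to have the same size).
// For each active lane, returns the number of lower-indexed active lanes
// whose bit is true. Entries for inactive lanes are left at 0.
std::vector<uint32_t> wavePrefixCountBitsRef(const std::vector<bool>& active,
                                             const std::vector<bool>& bit) {
    std::vector<uint32_t> result(active.size(), 0);
    uint32_t running = 0;
    for (std::size_t lane = 0; lane < active.size(); ++lane) {
        if (!active[lane]) {
            continue;  // inactive lanes neither contribute nor receive a value
        }
        result[lane] = running;  // exclusive: the lane's own bit is not counted
        if (bit[lane]) {
            ++running;
        }
    }
    return result;
}
```

Note that the count is exclusive: a lane whose own bit is true still sees only the lanes before it.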

Remarks

This function is supported from shader model 6.0 in all shader stages.

Examples

The following code describes how to implement a compacted write to an ordered stream where the number of elements written per lane is either 1 or 0.

bool bDoesThisLaneHaveAnAppendItem = <expr>;
// offset of this lane within the wave's block: the number of
// lower-indexed active lanes that have an item to append
uint laneAppendOffset = WavePrefixCountBits( bDoesThisLaneHaveAnAppendItem );
// compute number of items to append for the whole wave
uint appendCount = WaveActiveCountBits( bDoesThisLaneHaveAnAppendItem );
// update the output location for this whole wave
uint appendOffset;
if ( WaveIsFirstLane() )
{
    // this way, we only issue one atomic for the entire wave, which reduces contention
    // and keeps the output data for each lane in this wave together in the output buffer
    InterlockedAdd(bufferSize, appendCount, appendOffset);
}
appendOffset = WaveReadLaneFirst( appendOffset ); // broadcast value
appendOffset += laneAppendOffset; // and add in the offset for this lane
buffer[appendOffset] = myData; // write to the offset location for this lane
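
The compaction pattern in the HLSL example can be simulated on the CPU to see where each lane's item ends up. This C++ sketch is illustrative only: the `Lane` struct and `appendWave` function are hypothetical, the atomic add is replaced by a plain addition because the simulation is single-threaded, and only lanes that actually have an item write to the buffer (an assumption of this sketch; the HLSL snippet above writes unconditionally).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-lane state for a simulated wave.
struct Lane {
    bool active;   // whether the lane is executing
    bool hasItem;  // plays the role of bDoesThisLaneHaveAnAppendItem
    int data;      // plays the role of myData
};

// Appends the items of one simulated wave to `buffer`, keeping lane order.
// `bufferSize` plays the role of the counter updated by InterlockedAdd.
void appendWave(const std::vector<Lane>& wave, std::vector<int>& buffer,
                uint32_t& bufferSize) {
    // Model of WaveActiveCountBits: items appended by the whole wave.
    uint32_t appendCount = 0;
    for (const Lane& lane : wave) {
        if (lane.active && lane.hasItem) {
            ++appendCount;
        }
    }

    // Model of the single InterlockedAdd issued by the first lane.
    const uint32_t appendOffset = bufferSize;
    bufferSize += appendCount;
    if (buffer.size() < bufferSize) {
        buffer.resize(bufferSize);
    }

    // Model of WavePrefixCountBits: `prefix` is the per-lane offset.
    uint32_t prefix = 0;
    for (const Lane& lane : wave) {
        if (!lane.active) {
            continue;
        }
        if (lane.hasItem) {
            buffer[appendOffset + prefix] = lane.data;
            ++prefix;
        }
    }
}
```

Calling `appendWave` once per simulated wave keeps each wave's items contiguous in the output, which mirrors the comment in the HLSL example about issuing one atomic per wave.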

See also

Overview of Shader Model 6

Shader Model 6

@farzonl farzonl added backend:DirectX backend:SPIR-V bot:HLSL HLSL HLSL Language Support metabug Issue to collect references to a group of similar or related issues. labels Jul 16, 2024
@damyanp damyanp moved this to Ready in HLSL Support Oct 30, 2024
@damyanp damyanp moved this from Ready to Planning in HLSL Support Oct 30, 2024
@damyanp damyanp moved this from Planning to Ready in HLSL Support Nov 26, 2024
Projects
Status: Ready