Description
- Implement `WaveActiveSum` clang builtin,
- Link `WaveActiveSum` clang builtin with `hlsl_intrinsics.h`
- Add sema checks for `WaveActiveSum` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- Add codegen for `WaveActiveSum` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/WaveActiveSum.hlsl`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/WaveActiveSum-errors.hlsl`
- Create the `int_dx_WaveActiveSum` intrinsic in `IntrinsicsDirectX.td`
- Create the `DXILOpMapping` of `int_dx_WaveActiveSum` to `119` in `DXIL.td`
- Create the `WaveActiveSum.ll` and `WaveActiveSum_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- Create the `int_spv_WaveActiveSum` intrinsic in `IntrinsicsSPIRV.td`
- In `SPIRVInstructionSelector.cpp` create the `WaveActiveSum` lowering and map it to `int_spv_WaveActiveSum` in `SPIRVInstructionSelector::selectIntrinsic`.
- Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveSum.ll`
DirectX
DXIL Opcode | DXIL OpName | Shader Model | Shader Stages |
---|---|---|---|
119 | WaveActiveOp | 6.0 | ('library', 'compute', 'amplification', 'mesh', 'pixel', 'vertex', 'hull', 'domain', 'geometry', 'raygeneration', 'intersection', 'anyhit', 'closesthit', 'miss', 'callable', 'node') |
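The table lists opcode 119 under the generic name WaveActiveOp rather than a sum-specific name, which suggests that the kind of reduction is passed as an operand. The sketch below illustrates that idea only. It assumes the operand encoding DXC appears to use (an operation kind where Sum = 0, and a sign flag where Signed = 0 and Unsigned = 1); those values, and every type and function name here, are hypothetical and should be checked against the DXIL documentation before being relied on.

```cpp
#include <cstdint>

// Hypothetical mirrors of the DXIL operand encodings (assumed values, verify
// against the DXIL documentation).
enum class WaveOpKind : std::uint8_t { Sum = 0, Product = 1, Min = 2, Max = 3 };
enum class SignedOpKind : std::uint8_t { Signed = 0, Unsigned = 1 };

// Hypothetical classification of the HLSL element type being summed.
enum class ElemKind { Float, SignedInt, UnsignedInt };

// The operands a lowering of WaveActiveSum might pass to WaveActiveOp.
struct WaveActiveOpOperands {
    std::uint32_t opcode;  // 119, from the table above
    WaveOpKind op;         // which reduction to perform
    SignedOpKind sign;     // only distinguishes unsigned integers in this sketch
};

constexpr WaveActiveOpOperands operandsForWaveActiveSum(ElemKind kind) {
    return {119, WaveOpKind::Sum,
            kind == ElemKind::UnsignedInt ? SignedOpKind::Unsigned
                                          : SignedOpKind::Signed};
}
```

Under these assumptions, the `uint4` test case would select the unsigned flag, while the `float4` and `int4` cases would use the signed one.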
SPIR-V
OpGroupNonUniformFAdd:
Description:
A floating point add group operation of all Value
operands contributed by active invocations in the
group.
Result Type must be a scalar or vector of floating-point
type.
Execution is a Scope that identifies the group of
invocations affected by this command. It must be Subgroup.
The identity I for Operation is 0. If Operation is
ClusteredReduce, ClusterSize must be present.
The type of Value must be the same as Result Type. The method used
to perform the group operation on the contributed Value(s) from active
invocations is implementation defined.
ClusterSize is the size of cluster to use. ClusterSize must be a
scalar of integer type, whose Signedness operand is 0.
ClusterSize must come from a constant
instruction. Behavior is undefined unless
ClusterSize is at least 1 and a power of 2. If ClusterSize is
greater than the size of the group, executing this instruction
results in undefined behavior.
Capability:
GroupNonUniformArithmetic, GroupNonUniformClustered,
GroupNonUniformPartitionedNV
Missing before version 1.3.
Word Count | Opcode | Results | Operands | ||||
---|---|---|---|---|---|---|---|
6 + variable | 350 | <id> | Scope <id> | Group Operation | <id> | Optional |
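The ClusterSize constraints quoted above (at least 1, a power of 2, and not greater than the size of the group) can be expressed as a small predicate. This is a minimal sketch for illustration; the function name `isValidClusterSize` and the idea of passing the group size as a plain integer are assumptions, not part of the SPIR-V specification or the LLVM backend.

```cpp
#include <cstdint>

// Returns true when clusterSize meets the constraints quoted above:
// at least 1, a power of two, and not greater than the group size.
// (Hypothetical helper; the name and signature are illustrative only.)
constexpr bool isValidClusterSize(std::uint32_t clusterSize,
                                  std::uint32_t groupSize) {
    // A non-zero value is a power of two when it has exactly one bit set.
    const bool isPowerOfTwo =
        clusterSize != 0 && (clusterSize & (clusterSize - 1)) == 0;
    return isPowerOfTwo && clusterSize <= groupSize;
}
```

A plain `WaveActiveSum` reduces over the whole subgroup, so ClusterSize only matters when Operation is ClusteredReduce.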
Test Case(s)
Example 1
//dxc WaveActiveSum_test.hlsl -T lib_6_8 -E fn -enable-16bit-types -spirv -fspv-target-env=universal1.5 -fcgl -O0
export float4 fn(float4 p1) {
    return WaveActiveSum(p1);
}
Example 2
//dxc WaveActiveSum_1_test.hlsl -T lib_6_8 -enable-16bit-types -O0
export uint4 fn(uint4 p1) {
    return WaveActiveSum(p1);
}
Example 3
//dxc WaveActiveSum_2_test.hlsl -T lib_6_8 -enable-16bit-types -O0
export int4 fn(int4 p1) {
    return WaveActiveSum(p1);
}
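All three test cases pass four-component vectors, for which the sum is taken per component. When checking expected values for such tests, a CPU reference model can help. The following is a rough sketch, assuming every lane is active; `Vec4` and `waveSum4` are hypothetical names introduced here, not part of clang or DXC.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HLSL float4 / uint4 / int4.
template <typename T>
using Vec4 = std::array<T, 4>;

// Component-wise sum over all lanes of a simulated wave.
// Assumes every lane in `lanes` is active.
template <typename T>
Vec4<T> waveSum4(const std::vector<Vec4<T>>& lanes) {
    Vec4<T> total{};  // value-initialised to zero in every component
    for (const Vec4<T>& v : lanes)
        for (std::size_t c = 0; c < 4; ++c)
            total[c] += v[c];
    return total;
}
```

Instantiating the template with `float`, `unsigned`, and `int` corresponds to Examples 1, 2, and 3 respectively.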
HLSL:
Sums up the value of the expression across all active lanes in the current wave and replicates it to all lanes in the current wave.
Syntax
<type> WaveActiveSum(
<type> expr
);
Parameters
- expr: The expression to evaluate.
Return value
The sum value.
Remarks
The order of operations is undefined.
This function is supported from shader model 6.0 in all shader stages.
Examples
float3 total = WaveActiveSum( position ); // sum positions in wave
float3 center = total/count; // compute average of these positions
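The behaviour described in this section (sum over the active lanes, with the result replicated to them) can be modelled on the CPU as follows. This is a minimal sketch, not how any driver or backend implements it: `waveActiveSumModel` is a hypothetical name, and the choice to leave a default value in inactive lanes is an assumption of the model, since the documentation above does not say what inactive lanes observe.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU model of one wave. lanes[i] is the value of `expr` in
// lane i; active[i] says whether lane i participates.
template <typename T>
std::vector<T> waveActiveSumModel(const std::vector<T>& lanes,
                                  const std::vector<bool>& active) {
    // Accumulate contributions from active lanes only.
    T total{};
    for (std::size_t i = 0; i < lanes.size(); ++i)
        if (active[i])
            total += lanes[i];

    // Replicate the sum to every active lane. Inactive lanes keep T{} here,
    // which is a modelling choice rather than specified behaviour.
    std::vector<T> result(lanes.size(), T{});
    for (std::size_t i = 0; i < lanes.size(); ++i)
        if (active[i])
            result[i] = total;
    return result;
}
```

The model always adds in lane order. Because the order of operations is undefined for the real intrinsic, floating-point results from hardware may differ from this model in rounding.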
See also