[HLSL] Propose and discuss a holistic approach to math operations in HLSL #87367
Comments
I took a look through the HLSL Intrinsics, HLSL 6.0 Intrinsics, HLSL 6.4 Intrinsics, and DXIL Ops and manually filtered them down to the simple math operations. From there we see a few things. There are 26 operations that map trivially to existing LLVM intrinsics. Note that three of these don't have direct HLSL intrinsics: IsNormal seems to be unused, and UAddc/USubb are only used by expansions of 64-bit math in dxc. Nothing needs to be done for these.
There are 17 HLSL operations that map directly to C/C++ standard math functions, though only 7 of these have DXIL operations. We should definitely push for adding the 7 DXIL ops as LLVM intrinsics, and we should probably just get all of these in.
We also have 4 common math operations that don't map to the C standard library, but can be awkward to pattern match on if we lower early. We should consider asking for these as well.
Then there are mad and sincos. I personally don't think these are worth having intrinsics for, as separating them into two operations feels like it won't really lose anything. Note that these are MAD not FMA, so they aren't fused at all.
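To make the MAD versus FMA distinction concrete, here is a minimal C++ sketch. The function names `mad_unfused` and `demo_mad_vs_fma` are made up for illustration, and the `volatile` store is only an assumption-free way to stop the compiler from contracting the expression; it is not how any HLSL backend lowers `mad`.

```cpp
#include <cassert>
#include <cmath>

// Unfused multiply-add: the product is rounded to double before the add.
// The volatile store only prevents the compiler from contracting the
// expression into a fused operation.
double mad_unfused(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Picks an input where the fused and unfused results differ.
void demo_mad_vs_fma() {
    const double a = 1.0 + std::ldexp(1.0, -27);
    const double b = 1.0 - std::ldexp(1.0, -27);
    const double c = -1.0;
    // The exact product is 1 - 2^-54, which rounds to 1.0 as a double,
    // so the unfused result is exactly zero.
    assert(mad_unfused(a, b, c) == 0.0);
    // std::fma rounds once, after the add, so the small term survives.
    assert(std::fma(a, b, c) == -std::ldexp(1.0, -54));
}
```

Because a MAD is just a multiply followed by an add, emitting the two operations separately preserves its result, which is the point made above.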
We also have a few bitfield instructions. We should probably pursue getting generic intrinsics for these, as a lot of architectures have this kind of instruction. However, I think it falls outside of the "this is just normal math" umbrella and we should do that separately.
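As a rough idea of what a generic bitfield intrinsic would have to express, here is a sketch of an unsigned bitfield extract in C++. The name `ubfe`, the parameter order, and the masking of `width` and `offset` to five bits are assumptions of this sketch, not a quoted specification.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned bitfield extract: returns `width` bits of `value` starting at
// bit `offset`, zero-extended. Masking width/offset to five bits is an
// assumption made here to keep every shift well defined.
std::uint32_t ubfe(std::uint32_t width, std::uint32_t offset, std::uint32_t value) {
    width &= 31u;
    offset &= 31u;
    if (width == 0) return 0;
    const std::uint32_t mask = (1u << width) - 1u;  // width is in [1, 31] here
    return (value >> offset) & mask;
}
```

For example, `ubfe(8, 4, 0xABCD1234u)` extracts the eight bits starting at bit 4.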
Next we have operations that seem like they should map simply to intrinsics, but they're defined to return 0xffffffff instead of poison in edge cases. I don't really know what to do with these, but we clearly can't define them as generic intrinsics.
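A small sketch of the mismatch, assuming the operations in question are of the find-first-set-bit kind (the name `first_bit_low` is hypothetical). LLVM's `llvm.cttz` returns either the bit width or poison for a zero input, depending on its `is_zero_poison` flag, so neither form yields 0xffffffff directly.

```cpp
#include <cassert>
#include <cstdint>

// Index of the lowest set bit, or 0xffffffff when no bit is set.
std::uint32_t first_bit_low(std::uint32_t x) {
    if (x == 0) return 0xffffffffu;  // the edge case a generic intrinsic leaves as poison
    std::uint32_t index = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++index;
    }
    return index;
}
```

A lowering built on a generic count-trailing-zeros intrinsic would therefore need an explicit select on the zero case.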
After that we're left with some highly specific operations. I can't see implementing any of these generically, though saturate could pretty trivially be implemented in terms of clamp if we add that one.
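The saturate-in-terms-of-clamp idea can be sketched as follows. The names `clamp_f` and `saturate_f` are illustrative, and NaN handling simply falls out of `std::fmax`/`std::fmin` here; it is not claimed to match HLSL's rules.

```cpp
#include <cassert>
#include <cmath>

// Generic clamp; assumes lo <= hi.
float clamp_f(float x, float lo, float hi) {
    return std::fmin(std::fmax(x, lo), hi);
}

// saturate is clamp with the bounds fixed to [0, 1].
float saturate_f(float x) {
    return clamp_f(x, 0.0f, 1.0f);
}
```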
Finally, there are some utility functions that dxc lowers early. I suspect we should simply do the same and there isn't any value in kicking them around as intrinsics.
Conclusions
First, thanks @bogner for putting this list together. I think it's a good starting point for discussion. There are a few corrections we need to make in the backend that this list has made clear. There are others that this list isn't correct on.

First, we don't need to propose any of the below; they already exist:

Second, I don't agree with your

Third, DXIL isn't the only backend we would want to consider for our proposal. Part of why we want to make this proposal is that the more math intrinsics exist in LLVM, the easier it will be to support the SPIRV backend.

```
%12 = OpConstantComposite %v4float %float_0 %float_0 %float_0 %float_0
%34 = OpFOrdNotEqual %v4bool %32 %12
%35 = OpAny %bool %34
%38 = OpFOrdNotEqual %v4bool %37 %12
%39 = OpAll %bool %38
```

degrees and radians map to GLSL extensions

```
%13 = OpExtInst %v4float %1 Degrees %12
%14 = OpExtInst %v4float %1 Radians %12
%15 = OpExtInst %v4float %1 Step %12 %12
```

In your conclusions you had

```
%12 = OpCompositeExtract %int %11 0
%13 = OpIMul %int %12 %12
%14 = OpCompositeExtract %int %11 1
%15 = OpIMul %int %14 %14
%16 = OpCompositeExtract %int %11 2
%17 = OpIMul %int %16 %16
%18 = OpCompositeExtract %int %11 3
%19 = OpIMul %int %18 %18
%20 = OpIAdd %int %13 %15
%21 = OpIAdd %int %20 %17
%22 = OpIAdd %int %21 %19
%23 = OpCompositeConstruct %v4int %22 %22 %22 %22
```

So my conclusion is you only need 13 of your math intrinsics (-4). Does
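For readers less used to SPIR-V, the sequences quoted in this comment can be mirrored in plain C++. This is only a sketch: the type aliases and function names (`any_nonzero`, `all_nonzero`, `dot4`, `splat4`) are made up for illustration. The comparison is written as `x < 0 || x > 0` so that, like `OpFOrdNotEqual`, it is false for NaN.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using float4 = std::array<float, 4>;
using int4 = std::array<int, 4>;

// Ordered "not equal to zero": false when x is NaN.
inline bool ord_not_zero(float x) { return x < 0.0f || x > 0.0f; }

// Mirrors OpFOrdNotEqual followed by OpAny.
bool any_nonzero(const float4& v) {
    for (float x : v) if (ord_not_zero(x)) return true;
    return false;
}

// Mirrors OpFOrdNotEqual followed by OpAll.
bool all_nonzero(const float4& v) {
    for (float x : v) if (!ord_not_zero(x)) return false;
    return true;
}

// Mirrors the extract / OpIMul / OpIAdd chain: an integer dot product.
int dot4(const int4& a, const int4& b) {
    int sum = 0;
    for (std::size_t i = 0; i < 4; ++i) sum += a[i] * b[i];
    return sum;
}

// Mirrors OpCompositeConstruct with the same scalar in every lane.
int4 splat4(int s) { return {s, s, s, s}; }
```

In the quoted SPIR-V the vector is dotted with itself, which corresponds to `splat4(dot4(v, v))`.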
Started an RFC here: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

If you want an overarching view of how this will all connect, see: #90088

Changes:
- `clang/docs/LanguageExtensions.rst` - Document the new elementwise tan builtin.
- `clang/include/clang/Basic/Builtins.td` - Implement the tan builtin.
- `clang/lib/CodeGen/CGBuiltin.cpp` - Invoke the tan intrinsic on uses of the builtin.
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the tan builtin with the equivalent HLSL APIs.
- `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks as well as HLSL-specific sema checks to the tan builtin.
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic.
- `llvm/docs/LangRef.rst` - Document the tan intrinsic.
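As a portable illustration of what "elementwise tan" means, the sketch below applies `std::tan` to each lane of a fixed-size array. It is not the clang builtin itself, which operates on vector types; the `elementwise_tan` helper is a hypothetical stand-in.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Applies tan to every element independently and returns a same-sized result.
template <std::size_t N>
std::array<float, N> elementwise_tan(const std::array<float, N>& v) {
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = std::tan(v[i]);
    return out;
}
```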
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

If you want an overarching view of how this will all connect, see: #90088

Changes:
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic.
- `llvm/lib/Target/DirectX/DXIL.td` - Map `int_tan` (the tan intrinsic) to the equivalent DXIL Op.
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

If you want an overarching view of how this will all connect, see: #90088

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode.
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic.
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` opcode handler.
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` opcode.
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic to the `G_FTAN` opcode.
- `llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp` - Map the `G_FTAN` opcode to the GLSL 4.5 and OpenCL tan instructions.
- `llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp` - Define `G_FTAN` as a legal SPIR-V target opcode.
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

Much of this change follows how `G_FSIN` and `G_FCOS` were used.

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode.
- `llvm/docs/LangRef.rst` - Document the tan intrinsic.
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to `ISD::FTAN`.
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for `FTAN` and `STRICT_FTAN`.
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic.
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall mappings.
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` opcode.
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` opcode handler.
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map `G_FTAN` to `ftan`.
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`, `strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN` and `STRICT_FTAN`.
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a vector intrinsic.
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic to the `G_FTAN` opcode.
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to the list of floating point math operations; also associate `G_FTAN` with the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math operation common behaviors.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function expansion operations for `FTAN` and `STRICT_FTAN`. Also define both opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN` and `STRICT_FTAN` handling in the legalizer.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define `SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define `FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - Define tan as an intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` - Map `LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map `Intrinsic::tan` to `ISD::FTAN` and add selection DAG handling for `Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan` and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and `ISD::FTAN` as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for the tan intrinsic.

resolves #70082
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This PR is just for tan. Now that the x86 tan backend has landed (#90503), we can add other backends since the shared pieces are in tree now.

Changes:
- `llvm/include/llvm/Analysis/VecFuncs.def` - Vectorization of tan for arm64 backends.
- `llvm/lib/Target/AArch64/AArch64FastISel.cpp` - Add tan to the libcall table.
- `llvm/lib/Target/AArch64/AArch64ISelLowering.cpp` - Add tan expansion for f128, f16, and vector/neon operations.
- `llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp` - Define `G_FTAN` as a legal arm64 instruction.

resolves #94755
This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds constrained intrinsics and some lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`. The only x86-specific change was for f80.

#70079 #70080 #70081 #70083 #70084 #95966

The x86 lowering is going to be done in three PRs, with this being the first. A second PR will be put up for loop vectorizing, and then a third for the SLPVectorizer.

The constrained intrinsics work is also going to be split, but into just two parts. This part covers just the LLVM-specific changes; part 2 will cover the clang-specific changes and legalization for backends that have special legalization requirements, like aarch64 and wasm.
## The change(s)
- `VecFuncs.def`: Define intrinsic to sleef/armpl mapping.
- `LegalizerHelper.cpp`: Add missing `fewerElementsVector` handling for the new trig intrinsics.
- `AArch64ISelLowering.cpp`: Add aarch64 specializations for lowering, like neon instructions.
- `AArch64LegalizerInfo.cpp`: Legalize the new trig intrinsics. aarch64 has special legalization requirements in `AArch64LegalizerInfo.cpp`. If we redirect the clang builtin without handling this, we will break the aarch64 compiler.

## History
This change is part of an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.

#70079 #70080 #70081 #70083 #70084 #95966

## Why is aarch64 needed
The last step is to redirect `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh` to emit the intrinsic. We can't emit the intrinsic until the intrinsics become legal for aarch64 in `AArch64LegalizerInfo.cpp`.
…#98755)

## Change:
- `WebAssemblyRuntimeLibcallSignatures.cpp`: Expose the RTLIBs for use by WASM.
- Add trig-specific test cases.

## History
This change is part of an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.

#70079 #70080 #70081 #70083 #70084 #95966

## Why Web Assembly?
In past attempts to support constrained intrinsics, changing the trig builtins to emit intrinsics/constrained intrinsics broke the WASM build. This is an attempt to preempt any such build break.
- #95082
- #94559 (comment)
…builtins (#98949)

## Change(s)
- `Builtins.td` - Add f16 support for libm arc and hyperbolic trig functions.
- `CGBuiltin.cpp` - Emit constrained intrinsics for trig clang builtins.

## History
This change is part of an implementation of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.

#70079 #70080 #70081 #70083 #70084 #95966

## Precursor PR(s)
Note this PR needs to merge after:
- #98937
- #98755
#99383)

This change is part 2, x86 loop vectorization, of #96222.

It also has veclib call loop vectorization, hence the test cases in `llvm/test/Transforms/LoopVectorize/X86/veclib-calls.ll`. Finally, the last PR missed tests for `llvm/test/CodeGen/X86/fp-strict-libcalls-msvc32.ll` and `llvm/test/CodeGen/X86/vec-libcalls.ll`, so those were added as well.

No evidence was found for arc and hyperbolic trig glibc vector math functions in https://github.com/lattera/glibc/blob/master/sysdeps/x86/fpu/bits/math-vector.h, so there are no new `_ZGVbN2v_*` and `_ZGVdN4v_*` mappings and no new tests in `llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-VF2-VF8.ll`.

There are also no new svml mappings and no new tests in `llvm/test/Transforms/LoopVectorize/X86/svml-calls.ll`. There was not enough evidence that svml has arc and hyperbolic trig vector implementations. Documentation was scarce, so I looked at test cases in [numpy](https://github.com/numpy/SVML/blob/32bf2a98420762a63ab418aaa0a7d6e17eb9627a/linux/avx512/svml_z0_acos_d_la.s#L8). Someone with more experience with svml should investigate.

## Note
amd libm doesn't have a vector hyperbolic sine API, which is why you might notice there are no tests for `sinh`.

## History
This change is part of #87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds loop vectorization for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.

resolves #70079
resolves #70080
resolves #70081
resolves #70083
resolves #70084
resolves #95966
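For context on the `_ZGVbN2v_*` and `_ZGVdN4v_*` patterns mentioned in this commit message, the sketch below composes such a name from its parts under the x86 vector function ABI: `_ZGV`, an ISA letter (`b` for SSE, `d` for AVX2), `N` for unmasked, the vector factor, one `v` per vector parameter, then the scalar name. The helper `vector_variant_name` is hypothetical and only handles the single-argument, unmasked case.

```cpp
#include <cassert>
#include <string>

// Builds an unmasked, single-vector-argument variant name such as _ZGVbN2v_sin.
// `isa` is the ABI's ISA letter and `vf` is the number of lanes.
std::string vector_variant_name(char isa, unsigned vf, const std::string& scalar_name) {
    return "_ZGV" + std::string(1, isa) + "N" + std::to_string(vf) + "v_" + scalar_name;
}
```

For example, `vector_variant_name('b', 2, "sin")` yields the name of glibc's two-lane SSE `sin`. Composing a name this way says nothing about whether a library actually exports it, which is exactly the gap the commit message reports for the arc and hyperbolic functions.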
From @farzonl's investigation of #83882 it's become clear that splitting up HLSL math operations between generic LLVM intrinsics and special cased HLSL intrinsics is pretty unfortunate. We would prefer to avoid having to implement a large chunk of generic math operations in our own playground, and instead handle them generically in LLVM.
To do this, we should propose implementing the set of math intrinsics we're interested in as generic LLVM intrinsics, and work with the community to get to a solution that makes sense across the board.
A couple of things to note:
AC: This work item tracks sending the initial RFC and driving the conversation to an actionable conclusion.