-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[DirectX][DXIL] Design document for TableGen Spec of DXIL Operations #85170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-directx Author: S. Bharadwaj Yadavalli (bharadwajy) ChangesAdd a design document that outlines the choices considered for TableGen specification of DXIL Operations and current implementation. Marking as Draft PR to solicit feedback. Full diff: https://github.com/llvm/llvm-project/pull/85170.diff 1 Files Affected:
diff --git a/llvm/docs/DirectX/DXILOpTableGenDesign.rst b/llvm/docs/DirectX/DXILOpTableGenDesign.rst
new file mode 100644
index 00000000000000..55f4b8904518f1
--- /dev/null
+++ b/llvm/docs/DirectX/DXILOpTableGenDesign.rst
@@ -0,0 +1,207 @@
+==============================================================
+Specification of DXIL Operations using TableGen Representation
+==============================================================
+.. contents::
+ :local:
+
+.. toctree
+ :hidden
+
+Introduction
+============
+
+`DirectXShaderCompiler <https://github.com/microsoft/DirectXShaderCompiler>`_
+encapsulates, among other information, various DXIL Operations in
+`hctdb.py <https://github.com/microsoft/DirectXShaderCompiler/blob/main/utils/hct/hctdb.py>`_.
+DXIL Operations are represented in one of the following `two ways
+<https://github.com/microsoft/DirectXShaderCompiler/blob/130877392c263888ef06bab768856d3dab1f1c9a/docs/DXIL.rst#L1978>`_:
+
+#. Using LLVM instructions
+#. Using LLVM External functions. These are represented in LLVM IR as follows:
+
+ * "Standard" LLVM intrinsics (e.g., ``llvm.sin.*``) and
+ * HLSL intrinsics (defined as LLVM intrinsics in ``llvm/include/llvm/IR/IntrinsicsDirectX.td``, e.g., ``llvm.dx.*``)
+
+ These are collectively referred to as `LLVM Intrinsics` in this note.
+
+DXIL Ops, as currently represented in ``hctdb.py`` have the following attributes
+
+#. ``name`` - A short, unique name
+#. ``llvm_id`` - ID of LLVM instruction. This is just an arbitrary, yet fixed, number that indicates LLVM's ``CallInst`` for all LLVM intrinsics
+#. ``llvm_name`` - String name of LLVM instruction type
+#. ``is_dxil_op`` - A bool indicating whether this is a call into a built-in DXIL function
+#. ``dxil_op`` - String name of DXIL operation
+#. ``dxil_opid`` - ID of DXIL operation
+#. ``dxil_class`` - String name of the opcode class
+#. ``category`` - String classification for this instruction
+#. ``doc`` - String documentation description of this instruction
+#. ``remarks`` - String long-form remarks on this instruction
+#. ``ops`` - List of operands that this instruction takes
+#. ``is_allowed`` - Bool indicating whether this instruction is allowed in a DXIL program
+#. ``oload_types`` - String denoting overload types if applicable (e.g., "hf", "iwl")
+#. ``fn_attr`` - Attribute shorthand strings: rn=does not access memory,ro=only reads from memory,
+#. ``is_deriv`` - Bool indicating whether this is some kind of derivative
+#. ``is_gradient`` - Bool indicating whether this requires a gradient calculation
+#. ``is_feedback`` - Bool indicating whether this is a sampler feedback op
+#. ``is_wave`` - Bool indicating whether this requires in-wave, cross-lane functionality
+#. ``requires_uniform_inputs`` - Bool indicating whether this operation requires that all of its inputs are uniform across the wave
+#. ``is_barrier`` - Bool indicating whether this is a barrier operation
+#. ``shader_stages`` - shader stages to which this applies, empty for all.
+#. ``shader_model`` - minimum shader model required (e.g., 6, 0)
+#. ``inst_helper_prefix`` - None
+#. ``fully_qualified_name_prefix`` - Constant string ``"hlsl::OP::OpCode"``
+#. ``is_dxil_op`` - Bool that evaluates (dxil_op != "") indicating whether this is a DXIL operation
+#. ``is_reserved`` - Bool that evaluates (dxil_class == "Reserved")
+#. ``shader_model_translated`` - minimum shader model required with translation by linker
+#. ``props`` - extra properties
+
+Motivation
+==========
+
+``DXILLowering`` pass needs to lower the LLVM intrinsics. TableGen file -
+``llvm/lib/Target/DirectX/DXIL.td`` - is used to specify the properties of DXIL
+Ops including the mapping of each of them to LLVM intrinsics they correspond to, if any.
+``utils/hct/hctdb.py`` serves this purpose in ``DirectXShaderCompier`` repo.
+Anologously, ``DXIL.td`` is planned to be the single source of reference
+for the properties and LLVM intrinsic mapping of DXIL Ops for DXIL backend
+implemetation in ``llvm-project`` repo. It needs to have a rich representation
+abilities that TableGen backends (such as ``DXILEmitter``) can rely on. Additionally,
+the DXIL Op specification should be easy to read and comprehend.
+
+Design
+======
+
+Distilling the essential attributes of DXIL Op from the above (as a start), following
+attributes form the core of its specification.
+
+#. ``dxil_opid`` or ``OpCode``
+#. ``dxil_class`` or ``OpClass`` - this string is an integral part of the DXIL Op function name and is constructed in the format ``dx.op.<class-name>.<overload-type>``. The DXIL validator checks for any deviation from this for each of the DXIL Op call.
+#. ``ops`` - list of operands encapsulating the index and valid (fixed or overload) types
+#. ``oload_types`` - Valid overload types of the DXIL op
+
+Each of the LLVM intrinsics maps to an external function represented by a call to an
+external function of the form ``dx.op.<class-name>.<overload-type>`` as noted above.
+
+Following is a basic TableGen class structure to encapsulate the mapping of LLVM Intrinsics to DXIL Ops.
+
+.. code-block::
+
+ // Abstraction of DXIL Operation to LLVM Intrinsic mapping
+ class DXILOpMappingBase {
+ int OpCode = 0; // Opcode of DXIL Operation
+ DXILOpClass OpClass = UnknownOpClass;// Class of DXIL Operation.
+ Intrinsic LLVMIntrinsic = ?; // LLVM Intrinsic DXIL Operation maps to
+ string Doc = ""; // A short description of the operation
+ list<LLVMType> OpTypes = ?; // Valid types of DXIL Operation in the
+ // format [returnTy, param1ty, ...]
+ }
+
+Various options considered to represent this mapping - keeping the goals of rich
+representation and readability stated above - are discussed in the remainder
+of the note. The basic difference between these options is the way return and
+parameter types are represented for DXIL Ops with valid overload types.
+Valid overload types for several DXIL Ops would be over-specified using LLVM's
+``llvm_any*_ty`` types. For example, ``half`` and ``float`` are only valid for
+DXIL ``Sin`` and would be overspecified using ``llvm_anyfloat_ty``. The options
+listed below address the need to model such overload types specific types
+precisely for correct code generation. They each provide specifications with
+varying levels in (a) ease of readablility and maintainability and
+(b) of compactness / richness.
+
+Option 1 : Specify ``OpType`` as a list of valid fixed types.
+-------------------------------------------------------------
+
+``OpTypes`` for ``Sin`` may be specified as
+``[[llvm_i16, llvm_i32], [llvm_i16, llvm_i32]]``. Repeating such lists for each
+of the DXIL Ops - not all of which are unary - reduces readability and increases
+the proclivity for errors in specification and maintenance. Even if one can
+consider usage of TableGen definitions to create shorthand concrete record
+defs for these, above stated problems are barely mitigated.
+
+Option 2 : Specify a function to validate accepted overload types
+-----------------------------------------------------------------
+
+Specify a validation function to verify/generate the accepted set of overload
+types for DXIL Ops as a field of ``class DXILOpMappingBase``. This function is
+expected to be invoked by the TableGen backends to generate relevant ``*.inc``
+files with accurate content. Such a specification can provide relief from the
+need to specify and maintain long lists of OpTypes. However, having such set
+of functions fits rather awkwardly with the TableGen API usage of being able
+to directly work with the content of a record. Further, these validation
+functions add to the maintenace overlead while not not necessarily making
+the specification more readable.
+
+Option 3a : Specify ``OpTypes`` as an override of list valid fixed types
+------------------------------------------------------------------------
+[**Current strawman implementation**]
+
+Inherit the valid types of the LLVM Intrinsic being lowered as valid for the
+DXIL Op, by default. This will reduce the need to specify a ``OpTypes`` list
+for those DXIL Ops with the same valid types as the LLVM Intrinsic. In cases
+where it is not the case (such as ``Sin``), an optional list that overrides
+the default inheritence should be specified. This improves the readability
+by eliminating specification of ``OpType`` lists, when not needed. A
+relatively small set of precise overload types that are specific to DXIL Ops
+are defined to further improve readability. Such types
+(e.g., ``llvm_halforfloat_ty``) are defined using standard LLVM MVT
+kinds (viz., ``MVT::Other``).
+
+For example, following is the specification of ``Sin`` where the default
+type inheritence is overridden via explicit specification of valid overload
+types that are more precise.
+
+.. code-block::
+
+ def Sin : DXILOpMapping<13, unary, int_sin,
+ "Returns sine(theta) for theta in radians.",
+ [llvm_halforfloat_ty, LLVMMatchType<0>]>;
+
+Following is the specification of ``ThreadId`` where the types of the LLVM
+intrinsic (``int_dx_thread_id``) are valid for ``dx.op.threadId.*`` and
+need not be overridden.
+
+.. code-block::
+
+ def ThreadId : DXILOpMapping<93, threadId, int_dx_thread_id,
+ "Reads the thread ID">;
+
+The backends can consume such specification without needing to execute
+additional validation function or code. Such specification provides better
+readability and the necessary type information. It does not completely
+eliminate the mechanism of using lists as ``OpTypes``, but DXIL Ops that
+need ``OpTypes`` lists could be lesser.
+
+Option 3b : Specify ``OpTypes`` as an exclusion list of valid fixed types
+-------------------------------------------------------------------------
+
+Another variant of the Option 3a is to specify an exclusion list. An
+exclusion list instead of an override list provides a list of fixed types
+not valid for an DXIL Op and thus need to be excluded from a valid overload
+type lsit of LLVM Intrinsic. The benefits and downsides of this are the same
+as those of specifying an override list as in Option 3a.
+
+Option 4 : Specify accepted overload types as ``Attr`` records
+---------------------------------------------------------------
+
+LLVM's TableGen infrastructure defines a base ``class Attr``
+(``llvm/include/llvm/IR/Attributes.td``) with an associated
+``AttrProperty``. Valid overload types of a DXIL Op can be represented as
+``Attr`` records, distinguished via ``AttrProperty`` to model precise types
+as needed. This can provide the necessary readability and the richness of
+specification. Additionally, the other properties of a DXIL Op (such as the
+``bool is_*``) can also be uniformly represented as ``Attr`` records.
+
+Summary
+=======
+
+This note discusses various design options that have been explored to implement
+a representation of DXIL Ops in ``DXIL.td``. ``DXIL.td`` is intended to serve as
+a single source of reference for TableGen backends (such as ``DXILEmitter`` -
+specific to DXIL backend), have an accurate and rich specification, be readable
+and maintainable. The current implementation employs Option 3a. It is in place,
+primarily, to facilitate lowering of new LLVM intrinsics for HLSL functions
+being added in the front end. It serves to expose any additional considerations
+necessary for an improved design of ``DXIL.td``. The current plan is to design
+and implement **Opton 4** to improve readability and maintainablity while
+leveraging constructs in LLVM TableGen infrastructure for a potentially rich
+specification.
|
1df0d9b
to
8876fd6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My $0.02 on reading the content here is that options 2 and 3b don't really accomplish our key goal of making the tablegen a source of truth for DXIL specification.
With option 2 we need to have an external tool that validates the tablegen, which means the tablegen files themselves can't be treated as the source of truth.
For option 3b, again, the tablegen isn't really a clear source of truth. I think having an exclusion list is fragile (what happens if we add a new type), and much less intuitive because nothing about the op declaration clearly conveys that these are the disallowed types rather than the allowed types you just have to know it.
Option 4 seems confusing to me. LLVM's Attr class is for defining IR attributes. I'm not sure how or why we would use attributes for defining the valid overloads.
IIUC, the only real difference between option 1 and option 3a, is that option 3a uses types that can map to one or more type and type matching rather than an array of explicit overloads.
If option 3a is expressive enough to cover all our use cases, that seems fine to me. It also would be trivial to allow a hybrid solution where OpTypes
can either be an array of type specifiers or an array of arrays of type specifiers. That would actually allow solutions 1 and 3a to be used interchangeably or even in mixed capacity. Like:
def SillyNewDXOp : DXILOpMapping<13, unary, int_dx_some_new_intrinsic,
"Returns magic and rainbows!",
[[llvm_halforfloat_ty, LLVMMatchType<0>], [llvm_float_ty, llvm_double_ty]]>;
That would allow us the flexibility to simplify where reasonable but still have explicit control where we need it.
I have a strong preference for either option 1, 3a, or a hybrid approach. I think that is the most straight forward and simple. Curious if @bogner has other thoughts.
Is the return type list as shown in the above sample intended to demonstrate the list of all valid overload return types of The intent behind defining the more precise overload types It appears that very few additional precise overload types need to be defined. For example, a common overload type that abstracts an overload of
I think the specification methodology of option 3a provides the required flexibility and readability. |
* "Standard" LLVM intrinsics (e.g., ``llvm.sin.*``) and | ||
* HLSL intrinsics (defined as LLVM intrinsics in ``llvm/include/llvm/IR/IntrinsicsDirectX.td``, e.g., ``llvm.dx.*``) | ||
|
||
These are collectively referred to as `LLVM Intrinsics` in this note. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't render correctly - I think you need to indent 3 spaces to match the start of "Using LLVM External..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rendering still seems off. Have you considered merging the two lists together?
#. Using LLVM instructions (standard LLVM intrinsics, e.g., llvm.sin.*
)
#. Using LLVM External functions, also called HLSL intrinsics. These are defined in llvm/include/llvm/IR/IntrinsicsDirectX.td
(e.g., llvm.dx.*
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rendering still seems off...
Thanks! Fixed.
8cb658d
to
aa9b978
Compare
Can you include in the examples for each of the options how we would represent the requirements for specific overloads? For example the |
Added a section with the details. Thanks! |
9421eca
to
ee13241
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the latest comments and updates. It seems like we need some variation of option 1 or 3 with a way to represent an overload as not just parameters but also some validation.
@bharadwajy, can you do a cleanup pass on the document to reflect either the single planned direction or the narrower list of options we're still considering?
ee13241
to
56849cd
Compare
Updated the document. Implementation of the design described in the current version of this document is in PR #87803. |
def Sin : DXILOpMapping<13, int_sin, | ||
[DXILOpOverload<SM6_3, [llvm_half_ty, llvm_float_ty]>, | ||
DXILOpOverload<SM6_0, [llvm_float_ty]>], | ||
"Returns sine(theta) for theta in radians.">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one more odd availability case that this notation doesn't handle (although I suspect it can be extended to do so). The ddx
intrinsic is a good example:
https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-ddx
ddx
was added for pixel shaders in SM 2.x, but was not added to compute shaders until SM 6.6:
https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_Derivatives.html
The overload is the same, but the availability is by-stage and changes in later versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one more odd availability case that this notation doesn't handle (although I suspect it can be extended to do so). The
ddx
intrinsic is a good example:
...
ddx
was added for pixel shaders in SM 2.x, but was not added to compute shaders until SM 6.6:
...
The overload is the same, but the availability is by-stage and changes in later versions.
Thanks! I plan to extend the specification ability to include Shader Kind as another refinement criterion in a (near-term) future PR.
56849cd
to
9de3926
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a design document that outlines the choices considered and current implementation.
Delete text describing options not pursued.
Also incorporate PR suggestion to add descriptions of all properties of DXIL Operations.
9de3926
to
531550e
Compare
class OpAttributes; | ||
|
||
- memory access - ``ReadNone``, ``ReadNone`` | ||
- ``IsDerivative``, ``IsGradient``, ``IsFeedback``, ``IsWave``, ``NeedsUniformInputs``, ``IsBarrier`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed this morning I'm unsure these properties are required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed this morning I'm unsure these properties are required.
Updated per discussion.
def Cos : DXILOpProperties<12, int_cos, [llvm_half_ty, llvm_float_ty], | ||
"Returns cosine(theta) for theta in radians.">; | ||
def Sin : DXILOpProperties<13, int_sin, [llvm_half_ty, llvm_float_ty], | ||
"Returns sine(theta) for theta in radians.">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious what the representation looks like for a more exotic DXIL Op. Like maybe ddx
? How do we represent that ddx
is available in SM 2.0+ for Pixel shaders and SM 6.6+ for Compute, Mesh and Amplification?
How do we handle RawBufferLoad
where the double
and i64
overloads are only valid for SM 6.3+?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious what the representation looks like for a more exotic DXIL Op. Like maybe
ddx
? How do we represent thatddx
is available in SM 2.0+ for Pixel shaders and SM 6.6+ for Compute, Mesh and Amplification?How do we handle
RawBufferLoad
where thedouble
andi64
overloads are only valid for SM 6.3+?
Thanks for flagging these. I've noted this and plan to update the design spec in subsequent PRs, if needed, as I make progress with implementation and learn about the specific requirements to represent DXIL ops that may not be sufficiently represented using the current specification mechanism.
DXIL backend only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As Chris mentioned in #85170 (comment), there are a few glaring gaps here that I think will need to be resolved in order to generate correct DXIL:
- Derivatives, which became valid in more shader stages in SM6.6
- Atomics, which gained 64 bit overloads in SM6.6
- Sample ops gained integer overloads in SM6.7
- Sample and Texture ops may be affected by SM6.7 programmable offsets
It isn't ideal to move forward with this without at least having an idea of how we'll handle those scenarios. However, it's probably sensible to get this in tree and iterate on it there as the design evolves.
9. A documentation string for the operation. | ||
|
||
|
||
A DXIL Operation is represented by the following TableGne class by encapsulating the various |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A DXIL Operation is represented by the following TableGne class by encapsulating the various | |
A DXIL Operation is represented by the following TableGen class by encapsulating the various |
|
||
1. Required minimum Shader Model (``shader_model``). | ||
2. Minimum shader model required with translation by linker (``shader_model_translated``) | ||
3. List of shader stages applicable to (``shader_stages``), empty for all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the valid shader stages is as relevant to the backend as the required DXIL version (ie, for validation that we're generating correct DXIL).
@bharadwajy this seems to break the docs build with:
See for example: https://github.com/llvm/llvm-project/actions/runs/9306373243/job/25615439087?pr=93801 |
@jayfoad Thanks for letting me know of this. I have a change that I believe would fix this failure. However, I am unable to verify it on my dev machines as I see the following failure.
I have not found a way to address this issue after a quick internet search. Would you have any suggestions? |
Never mind. I temporarily added Created PR #93864 |
Add an initial design document for TableGen specification of DXIL Operations.
Issue #84932