Integrated matrix extension #28

Open
joseemoreira wants to merge 24 commits into riscv:integrated-matrix-extension from joseemoreira:integrated-matrix-extension

@joseemoreira
Collaborator

Philipp, Erich, and Bing: Please take a look at the changes I am proposing. They are meant to allow (but not mandate) compatibility of floating-point results with Zvt (https://github.com/aswaterman/riscv-misc/blob/main/isa/zvt/zvt.adoc#matrix-arithmetic-instructions).

I am just looking for feedback at this point. I suspect we will need some discussions before we are all OK with any of these changes. When we get there, I can update the SAIL.

I have not found a good way to have G as a WARL field that one can discover. It is not something that can be set in vtype, because it depends on non-vtype parameters, such as W.

Signed-off-by: Jose Moreira <jmoreira@us.ibm.com>
Collaborator

@ptomsich ptomsich left a comment

Please split into individual PRs for individual topics/semantic changes:

  • the editorial changes can be reviewed separately — and are mostly non-controversial
  • profiles belong in a different document
  • G, W, and psm are controversial (at least G and W cross red lines for me) and I'd prefer to discuss those separately.

|Zvvfp64mm ^| Zve64d | IEEE binary64, IEEE binary64 | IEEE binary64
|===

=== Recommended Profiles
Collaborator

As part of the standards-development and ratification process, we submit a PR for the entire chapter to the ISA manual. As a direct consequence, we can't include text that is outside of our scope (the Profiles SIG and the TSC are in charge of profiles).

We need to split this section out of the .adoc — how about a separate file that we can send to Profiles SIG as a proposal?

@joseemoreira If it helps, I can have you put on the agenda at TSC to give a presentation on "The need to define profiles for IME — with suggestions."?

|===

[#integrated-matrix-psm]
==== Partial-sum mode (`psm`)
Collaborator

This feels odd as a vtype-field, as only one of the modes will typically be supported.
I wonder if this is rather an ABI issue: tag ELF objects as THIS, THAT, DONT_CARE ... and leave it to the linker and runtime linker to figure out?

My recommendation: don't have this as a vtype-field. Given that the vtype is "a split-out part of the opcode", having a vtype-field is equivalent to having two different opcodes/mnemonics. I don't think this is the intent?
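
The ELF-tagging idea above could be sketched roughly as follows. This is only an illustration of a possible link-time merge rule, not a proposal for an actual ABI: the tag names reuse the THIS / THAT / DONT_CARE placeholders from the comment, and `PsmTag`, `merge_psm_tags`, and `link_objects` are hypothetical names introduced here.

```python
from enum import Enum


class PsmTag(Enum):
    # Placeholder tag values taken from the comment; a real ABI would name
    # the actual partial-sum modes.
    THIS = "this"
    THAT = "that"
    DONT_CARE = "dont_care"


def merge_psm_tags(a: PsmTag, b: PsmTag) -> PsmTag:
    """Combine the tags of two objects; fail if they require different modes."""
    if a is PsmTag.DONT_CARE:
        return b
    if b is PsmTag.DONT_CARE or a is b:
        return a
    raise ValueError(f"incompatible partial-sum modes: {a.name} vs {b.name}")


def link_objects(tags) -> PsmTag:
    """Fold the merge rule over all input objects, as a linker might."""
    result = PsmTag.DONT_CARE
    for tag in tags:
        result = merge_psm_tags(result, tag)
    return result
```

Under this assumed rule, objects tagged DONT_CARE link with anything, while mixing THIS and THAT objects is rejected at link time instead of being resolved by a runtime `vtype` setting.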

The `psm` field affects only floating-point Zvvm instructions. It has no effect on integer matrix multiply-accumulate instructions or on tile load/store instructions.

[#integrated-matrix-W]
==== Arithmetic widening factor (`W`)
Collaborator

NAK.

The widening factor is part of what the instruction does, so it's part of the mnemonic.
We have the bits available in the opcode, so there's no need to make this indirect (by forcing extra vsetvl instructions).

| 1 | 1 | 8
|===

The `W` field is located at `vtype[XLEN-9:XLEN-10]`.
Collaborator

We were asked to stay out of the immediate vtype-bits and consequently kept the widening inside the explicit opcode bits.

Unless VME uses the same bits, there is no reason why this could make sense.
Given that VME decided they want their own mtype: NAK.
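
For readers following the bit positions under discussion, the proposed location `vtype[XLEN-9:XLEN-10]` can be illustrated with a small sketch. It only extracts the raw 2-bit value and deliberately does not decode it into a widening factor, since the full encoding table is not shown here. The helper name `vtype_w_field` is hypothetical.

```python
def vtype_w_field(vtype: int, xlen: int) -> int:
    """Return the raw 2-bit field at vtype[XLEN-9:XLEN-10].

    Illustrative only: this mirrors the field position quoted in the diff
    and says nothing about how the value maps to a widening factor.
    """
    low_bit = xlen - 10          # least-significant bit of the field
    return (vtype >> low_bit) & 0b11


# Example: on RV64 the field occupies bits 55:54.
raw = vtype_w_field(0b11 << 54, xlen=64)
```

On RV64 this places the field directly below the high-order `vtype` bits, which is the region the comment above refers to as the "immediate vtype-bits" question.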

[#integrated-matrix-G]
==== Partial-sum grouping factor (`G`)

The `G[2:0]` field is a 3-bit WARL read/write field in the `vtype` CSR that selects the grouping factor used to form floating-point partial sums in Zvvm floating-point matrix multiply-accumulate instructions.
Collaborator

Does this really make sense? Will G be selectable?
Or will it rather be a consequence of the (λ × SEW × W × LMUL) combination for a given hardware implementation?

My guess is that it will be the latter … so it's not something you select, but something you acknowledge.
In that case, instead of a WARL field, we might want to define a standard way (as part of the ABI) to convey this information to user software?

if (2 ^ get_sew_pow()) < 32 then return Illegal_Instruction();

- let g : gemm_geom = decode_gemm_geometry(8, false);
+ let g : gemm_geom = decode_gemm_geometry(read_vtype_W(), false);
Collaborator

NAK. And independently of the NAK: if you change the unrolling factor in one line, you'd have to change it everywhere (e.g., 3 lines below, we call check_microscaling_legality with W).

ptomsich added a commit to ptomsich/integrated-matrix-extension that referenced this pull request Apr 7, 2026
Process all 28 items from the IME TG internal review feedback tracker.

Subextension dependencies (riscv#3):
  Replace blanket Zve64d dependency with the minimum Zve subset per
  subextension: Zve32x for integer accumulators ≤ 32-bit, Zve64x for
  Int64 accumulators, Zve32f for FP accumulators ≤ 32-bit, and Zve64d
  only for FP64 accumulators.
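
The dependency rule in this item can be read as a small lookup, sketched below. The function name `minimum_zve` and its parameters are hypothetical; only the four mappings themselves come from the text above.

```python
def minimum_zve(accumulator_is_fp: bool, accumulator_bits: int) -> str:
    """Return the minimum Zve subset for a subextension's accumulator type."""
    if accumulator_bits <= 32:
        # Accumulators of 32 bits or fewer need only the 32-bit subsets.
        return "Zve32f" if accumulator_is_fp else "Zve32x"
    if accumulator_bits == 64:
        # FP64 accumulators need Zve64d; Int64 accumulators need Zve64x.
        return "Zve64d" if accumulator_is_fp else "Zve64x"
    raise ValueError(f"unsupported accumulator width: {accumulator_bits}")
```

For example, a subextension with an FP32 accumulator would depend on Zve32f under this reading, not on Zve64d as the blanket rule required.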

8× widening instructions (riscv#7, riscv#8, riscv#9, riscv#24):
  Add v8wmmacc.vv (funct6=0x3b, OPIVV), vf8wmmacc.vv (funct6=0x17,
  OPFVV), and vf8wimmacc.vv (integer-input MX variant, vm=0 of
  v8wmmacc) with full instruction definitions, SAIL pseudocode,
  encoding diagrams, and exception tables.  Update encoding maps (FP,
  integer, integer MX) with W=8 entries.  Add Zvvxi4fp32mm and
  Zvvxni4fp32mm to the MX subextension table.  Replace the informative
  NOTE about reserved W=8 encoding space with normative text.  Remove
  the undefined term "octal-widening".

MXINT4 clarification and OCP citation (riscv#14):
  Define MXINT4 as analogous to OCP MX's MXINT8 but with 4-bit signed
  elements.  Add proper citation of the OCP Microscaling Formats (MX)
  v1.0 Specification with URL.  Update microscaling applicability to
  include vf8wmmacc.vv.

vfmmacc.vv vm=0 cleanup (riscv#13, riscv#28):
  Remove contradictory "When vm=0" exception bullets (vm=0 is reserved
  for non-widening FP).  Replace dead microscaling SAIL code with a
  straightforward non-widening FP GEMM loop.  Add explicit note that
  microscaling is not supported for non-widening multiply-accumulate.

Terminology fixes (riscv#15, riscv#21):
  Add forward cross-reference at first use of altfmt_A/altfmt_B.
  Correct two occurrences where λ was described as "the K dimension"
  to "tile-layout parameter", clarifying that K_eff = λ × W × LMUL is
  the derived effective K dimension.
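
The distinction between λ and the effective K dimension can be made concrete with a short sketch of K_eff = λ × W × LMUL. The helper name `k_eff` and the sample values are illustrative assumptions; `Fraction` is used only so that a fractional LMUL, if permitted, would stay exact.

```python
from fractions import Fraction


def k_eff(lam: int, w: int, lmul) -> Fraction:
    """Effective K dimension derived from the tile-layout parameter λ.

    lam  : tile-layout parameter λ (not itself the K dimension)
    w    : arithmetic widening factor W
    lmul : register grouping factor LMUL (int or Fraction)
    """
    return Fraction(lam) * Fraction(w) * Fraction(lmul)


# Sample values chosen purely for illustration.
example = k_eff(lam=4, w=2, lmul=1)
```

With these sample values, λ is 4 while the effective K dimension is 8, which is why describing λ as "the K dimension" was misleading.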