Updates to Introduction and edits to bit locations by joseemoreira · Pull Request #15 · riscv/integrated-matrix-extension

joseemoreira · 2026-03-07T12:32:40Z

I know I am late to this, but getting there. Everything I read so far (up to Storage Formats) looks great to me! I just made some minor changes that I hope are non-controversial.

joseemoreira · 2026-03-07T14:07:02Z

I added some more changes to the floating-point semantics. I expect those to be somewhat more contentious. Looking forward to hearing from @ptomsich and @efocht-oct.

I am reviewing the micro-scaling next.

ptomsich · 2026-03-07T15:01:37Z

 where the roundings are performed with the rounding mode from `frm`.
 The rounding of partial sum S _before_ it is accumulated to the running value of C[m,n] is optional.

-* After each group, the accumulated partial sum is rounded to C's precision (SEW) using an _implementation-defined_ rounding mode and _added to the running value of C[m, n]_.


Why specify this intermediate rounding for partial sums, if it's implementation defined?

"Implementation defined" is too vague and cannot be tested for compliance. I have restricted things a bit, so that implementations can be tested more easily and match industry practice. If there are more things we want to license, we can add that. But the way it is written now enables pretty much anything one wants to do.

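To make the compliance concern in the exchange above concrete, here is a minimal sketch (not the specification's semantics) of why optional rounding of the partial sum S matters. It assumes binary32 accumulators, models only round-to-nearest-even rather than an arbitrary `frm` setting, and uses binary64 as a stand-in for a wider internal format. The helper names `round_f32` and `accumulate` are hypothetical.

```python
import struct


def round_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value (ties to even)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(c, a, b, group, round_partial):
    """Accumulate dot(a, b) into c, one group of `group` products at a time."""
    for start in range(0, len(a), group):
        # Partial sum S of one group, held in binary64 (assumed wider format).
        s = sum(x * y for x, y in zip(a[start:start + group],
                                      b[start:start + group]))
        if round_partial:
            # Optional step: round S before it is added to the running value.
            s = round_f32(s)
        # The running value of C is always rounded to the accumulator format.
        c = round_f32(c + s)
    return c


# Inputs chosen so every binary64 operation below is exact.
a = [2.0 ** -12, 2.0 ** -24]
b = [2.0 ** -12, 2.0 ** -24]

unrounded = accumulate(1.0, a, b, group=2, round_partial=False)
rounded = accumulate(1.0, a, b, group=2, round_partial=True)
print(unrounded == 1.0 + 2.0 ** -23, rounded == 1.0)
```

With these inputs the two variants return different binary32 results, so a test suite can only check such a case if the specification states which behaviors are permitted.
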
joseemoreira · 2026-03-07T15:09:27Z

There are additional edits for tile loads/stores. Special case for (rs2) = 0 so that hardware can optimize for the micro-kernel.

ptomsich · 2026-03-07T18:24:26Z

Just a quick note on the bit locations: we inserted the (original) editorial note, as we couldn't make a world-breaking change across our QEMU, testsuite, etc. and keep the timeline for this document drop…

joseemoreira · 2026-03-08T02:31:56Z

I am merging this pull request after a few edits for clarity. In particular, I simplified the guidelines for VLEN-portable code to include the "dynamic code path" form that we previously discussed as the preferred approach.

* Updates to the introduction
* Editorial notes on bit locations
* Revised floating-point rounding rules
* Revised floating-point rounding rules
* Revised floating-point rounding rules
* Special case for tile loads/stores when (rs2) = 0
* Inputs don't have their own SEW, just EEW
* Added arithmetic considerations to mixed-format inputs
* Added arithmetic considerations to mixed-format inputs
* Made semantics of micro-scaling computations clearer
* Used byte addresses in the definitions of tile load/store
* Used byte addresses in the definitions of tile load/store
* Clarify valid values of VL
* Clarify that tile loads must use target SEW
* Clarify guidelines for portable IME code

Process all 28 items from the IME TG internal review feedback tracker.

Subextension dependencies (#3): Replace blanket Zve64d dependency with the minimum Zve subset per subextension: Zve32x for integer accumulators ≤ 32-bit, Zve64x for Int64 accumulators, Zve32f for FP accumulators ≤ 32-bit, and Zve64d only for FP64 accumulators.

8× widening instructions (#7, #8, #9, #24): Add v8wmmacc.vv (funct6=0x3b, OPIVV), vf8wmmacc.vv (funct6=0x17, OPFVV), and vf8wimmacc.vv (integer-input MX variant, vm=0 of v8wmmacc) with full instruction definitions, SAIL pseudocode, encoding diagrams, and exception tables. Update encoding maps (FP, integer, integer MX) with W=8 entries. Add Zvvxi4fp32mm and Zvvxni4fp32mm to the MX subextension table. Replace the informative NOTE about reserved W=8 encoding space with normative text. Remove the undefined term "octal-widening".

MXINT4 clarification and OCP citation (#14): Define MXINT4 as analogous to OCP MX's MXINT8 but with 4-bit signed elements. Add proper citation of the OCP Microscaling Formats (MX) v1.0 Specification with URL. Update microscaling applicability to include vf8wmmacc.vv.

vfmmacc.vv vm=0 cleanup (#13, #28): Remove contradictory "When vm=0" exception bullets (vm=0 is reserved for non-widening FP). Replace dead microscaling SAIL code with a straightforward non-widening FP GEMM loop. Add explicit note that microscaling is not supported for non-widening multiply-accumulate.

Terminology fixes (#15, #21): Add forward cross-reference at first use of altfmt_A/altfmt_B. Correct two occurrences where λ was described as "the K dimension" to "tile-layout parameter", clarifying that K_eff = λ × W × LMUL is the derived effective K dimension.
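Two rules stated in the commit message above lend themselves to a short illustration: the minimum Zve subset per accumulator type, and the derived effective K dimension K_eff = λ × W × LMUL. The sketch below is only a reading of that text, not code from the specification. The function names `min_zve` and `k_eff` and the sample values for λ, W, and LMUL are hypothetical.

```python
from fractions import Fraction


def min_zve(acc_kind: str, acc_bits: int) -> str:
    """Minimum Zve subset for an accumulator, per the commit message.

    acc_kind is "int" or "fp"; acc_bits is the accumulator width in bits.
    """
    if acc_kind not in ("int", "fp"):
        raise ValueError("acc_kind must be 'int' or 'fp'")
    if acc_bits <= 32:
        # Accumulators up to 32 bits need only the 32-bit subsets.
        return "Zve32x" if acc_kind == "int" else "Zve32f"
    if acc_bits == 64:
        # Int64 accumulators need Zve64x; FP64 accumulators need Zve64d.
        return "Zve64x" if acc_kind == "int" else "Zve64d"
    raise ValueError("unsupported accumulator width")


def k_eff(lam: int, w: int, lmul) -> Fraction:
    """Effective K dimension: K_eff = lambda * W * LMUL.

    lam is the tile-layout parameter (not itself the K dimension),
    w is the widening factor, and lmul may be fractional.
    """
    return Fraction(lam) * Fraction(w) * Fraction(lmul)


print(min_zve("int", 32), min_zve("fp", 64))
print(k_eff(4, 8, 1))  # sample values only, chosen for illustration
```

Using `Fraction` for LMUL keeps fractional register grouping exact, which is a design choice of this sketch rather than anything the source prescribes.
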

joseemoreira added 2 commits March 7, 2026 07:17

Updates to the introduction

a83fedd

Editorial notes on bit locations

6896b5b

joseemoreira requested a review from ptomsich March 7, 2026 12:32

joseemoreira added 2 commits March 7, 2026 08:49

Revised floating-point rounding rules

8486949

Revised floating-point rounding rules

9385cdd

joseemoreira requested a review from efocht-oct March 7, 2026 14:04

Revised floating-point rounding rules

f5f1079

ptomsich reviewed Mar 7, 2026

joseemoreira added 2 commits March 7, 2026 10:03

Special case for tile loads/stores when (rs2) = 0

828e9d0

Inputs don't have their own SEW, just EEW

f91b954

joseemoreira added 8 commits March 7, 2026 20:29

Added arithmetic considerations to mixed-format inputs

096ce7c

Added arithmetic considerations to mixed-format inputs

b9e8dbb

Made semantics of micro-scaling computations clearer

85bc8ec

Used byte addresses in the definitions of tile load/store

d0682f9

Used byte addresses in the definitions of tile load/store

c55a274

Clarify valid values of VL

866db98

Clarify that tile loads must use target SEW

fe19c3b

Clarify guidelines for portable IME code

a564fff

joseemoreira merged commit daae96e into riscv:integrated-matrix-extension Mar 8, 2026
3 checks passed

joseemoreira merged 15 commits into riscv:integrated-matrix-extension from joseemoreira:integrated-matrix-extension
