@@ -452,6 +452,208 @@ NOTE: `setjmp`/`longjmp` follow the standard calling convention, which clobbers
452452all vector registers. Hence, the standard vector calling convention variant
453453won't disrupt the `jmp_buf` ABI.
454454
455+ NOTE: Functions that use the standard vector calling convention
456+ variant follow an additional name mangling rule for {Cpp}.
457+ For more details, see <<Name Mangling for Standard Calling Convention Variant>>.
458+
459+ === Standard Fixed-length Vector Calling Convention Variant
460+
461+ This section defines the calling convention variant for fixed-length vectors.
462+ The intention of this variant is to pass fixed-length vectors via the vector
463+ registers. For the definition of a fixed-length vector, see
464+ <<Fixed-Length Vector>>.
465+
466+ This variant is based on the standard vector calling convention variant:
467+ the register convention and the rules for passing arguments and return values
468+ are the same.
469+
470+ NOTE: A separate calling convention variant is defined so that the
471+ convention can flexibly make use of the variable vector register length of
472+ the vector extension, while also covering embedded vector extensions
473+ such as `Zve32x`.
474+
475+ ABI_VLEN refers to the width of a vector register in the calling convention
476+ variant.
477+
478+ ABI_VLEN must be no wider than the ISA's VLEN: the ISA may
479+ support wider vector registers than the ABI assumes, but ABI_VLEN cannot exceed
480+ the ISA's VLEN.
481+
482+ ABI_VLEN represents the width (in bits) of the vector registers available in the
483+ calling convention for fixed-length vectors. ABI_VLEN can vary from 32 bits
484+ (as in `Zve32x`) up to the maximum supported by the ISA. The flexibility of
485+ ABI_VLEN enables the convention to adapt to both low-end embedded systems and
486+ high-performance processors that utilize wider vector registers.
487+
488+ ABI_VLEN is a parameter of this calling convention variant. It can be set
489+ by a compiler command-line option or specified by a function
490+ attribute in the source code.
491+
492+ NOTE: We suggest that toolchain implementations set the default value of ABI_VLEN
493+ to 128, as it is the most common minimum requirement. However, it is not fixed
494+ to 128, since the ISA allows VLEN to be as small as 32 or 64 bits. A
495+ configurable ABI_VLEN also makes the capacity of longer VLEN usable: users can
496+ build against a library optimized for a larger ABI_VLEN to better utilize
497+ cores with longer VLEN.
498+
499+ A fixed-length vector argument is passed in one vector argument register if the
500+ size of the vector is less than or equal to ABI_VLEN bits.
501+
502+ [NOTE]
503+ ====
504+ Even in the absence of specific vector extension support for certain element
505+ types, such as `__bf16`, `_Float16`, `float`, or `double`, the standard
506+ fixed-length vector calling convention rules still apply. For example,
507+ even without the support of extensions like `Zvfbfmin`, `Zve32f`, or `Zve64d`,
508+ these element types will be passed according to the calling convention rules
509+ outlined here.
510+
511+ Additionally, data types such as `__int128_t`, which currently do not
512+ have direct support in any vector extension, will also follow these rules.
513+ This design ensures that the calling convention remains forward-compatible,
514+ minimizing the need for continuous adjustments as new extensions and data types
515+ are introduced in the future.
516+
517+ The consistency in applying these rules to unsupported element types guarantees
518+ a smooth transition when future vector extensions become available, allowing for
519+ seamless integration of new features without requiring significant changes to
520+ the calling convention.
521+ ====
522+
523+ A fixed-length vector argument is passed in two vector argument registers,
524+ similar to vector data arguments with LMUL=2 and following the same register
525+ constraints, if the size of the vector is greater than ABI_VLEN bits and less
526+ than or equal to 2×ABI_VLEN bits.
527+
528+ A fixed-length vector argument is passed in four vector argument registers,
529+ similar to vector data arguments with LMUL=4 and following the same register
530+ constraints, if the size of the vector is greater than 2×ABI_VLEN bits and less
531+ than or equal to 4×ABI_VLEN bits.
532+
533+ A fixed-length vector argument is passed in eight vector argument registers,
534+ similar to vector data arguments with LMUL=8 and following the same register
535+ constraints, if the size of the vector is greater than 4×ABI_VLEN bits and less
536+ than or equal to 8×ABI_VLEN bits.
537+
538+ [NOTE]
539+ ====
540+ Fixed-length vectors that are not a power-of-2 in size will be rounded up to
541+ the next power-of-2 length for the purpose of register allocation and handling.
542+ For instance, a vector type like `int32x3_t` (which contains three 32-bit
543+ integers) will be treated as an `int32x4_t` (a 128-bit vector, which corresponds
544+ to LMUL=1 when ABI_VLEN=128) in the ABI, and passed accordingly. This ensures consistency in
545+ how vectors are handled and simplifies the process of argument passing.
546+
547+ Example: Consider an `int32x3_t` vector (three 32-bit integers):
548+ - The vector's total size is 96 bits, which is not a power of 2.
549+ - The ABI will round up the size to 128 bits (corresponding to `int32x4_t`),
550+ meaning the vector will be passed using one vector argument register when
551+ ABI_VLEN=128.
552+
553+ This rule applies to all non-power-of-2 fixed-length vectors, ensuring they
554+ are treated consistently across different ABI_VLEN settings.
555+ ====
556+
557+ A fixed-length vector argument is passed by reference and is replaced in the
558+ argument list with the address if it is larger than 8×ABI_VLEN bits or if
559+ there is a shortage of vector argument registers.
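
The size thresholds above can be summarized in a short C sketch. The helper names `round_up_pow2` and `vls_arg_regs` are hypothetical and not part of any toolchain; the sketch assumes ABI_VLEN is a power of two and models only the size-based rules, not a shortage of vector argument registers.

```c
#include <stdint.h>

/* Round a positive bit size up to the next power of two
 * (e.g. 96 -> 128), as described for non-power-of-2 vectors. */
static uint64_t round_up_pow2(uint64_t bits)
{
    uint64_t p = 1;
    while (p < bits)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper: number of vector argument registers used by a
 * fixed-length vector of `vector_bits` bits for a given ABI_VLEN.
 * Returns 1, 2, 4, or 8, or 0 when the vector is larger than
 * 8 x ABI_VLEN bits and is therefore passed by reference.
 * Assumption: abi_vlen is a power of two; register shortage is not modeled.
 */
static unsigned vls_arg_regs(uint64_t vector_bits, uint64_t abi_vlen)
{
    uint64_t bits = round_up_pow2(vector_bits);

    if (bits <= abi_vlen)
        return 1;
    if (bits <= 2 * abi_vlen)
        return 2;
    if (bits <= 4 * abi_vlen)
        return 4;
    if (bits <= 8 * abi_vlen)
        return 8;
    return 0; /* passed by reference */
}
```

Under these assumptions, a 96-bit `int32x3_t` with ABI_VLEN=128 yields one register, and a 128-bit `int32x4_t` with ABI_VLEN=32 yields four.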
560+
561+ A struct whose members are all fixed-length vectors is passed in
562+ vector argument registers like a vector tuple type if all members have the
563+ same length, the length is less than or equal to 4×ABI_VLEN bits, and the size of
564+ the whole struct is less than or equal to 8×ABI_VLEN bits.
565+ If there are not enough vector argument registers to pass the entire struct,
566+ it is passed by reference and is replaced in the argument list with the address.
567+ Otherwise, it uses the rule defined in the hardware floating-point calling
568+ convention.
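
As an illustration of the struct condition above, the following C sketch checks whether a struct of fixed-length vectors qualifies for tuple-like passing. The function name `vls_struct_is_tuple_like` is hypothetical; the sketch assumes member sizes are given in bits with no padding between members, and it does not model register shortage or the single-member flattening rule.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: true if a struct whose members are all
 * fixed-length vectors meets the size conditions for being passed
 * like a vector tuple type:
 *   - all members have the same length,
 *   - that length is <= 4 x ABI_VLEN bits,
 *   - the whole struct is <= 8 x ABI_VLEN bits.
 * Assumption: member_bits[i] is the size of member i in bits and the
 * struct has no padding, so its size is the sum of its members.
 */
static bool vls_struct_is_tuple_like(const uint64_t *member_bits,
                                     unsigned count, uint64_t abi_vlen)
{
    uint64_t total = 0;

    if (count == 0)
        return false;

    for (unsigned i = 0; i < count; i++) {
        if (member_bits[i] != member_bits[0])
            return false; /* members differ in length */
        if (member_bits[i] > 4 * abi_vlen)
            return false; /* member too large for a tuple field */
        total += member_bits[i];
    }
    return total <= 8 * abi_vlen;
}
```

With ABI_VLEN=128, two 512-bit members qualify under this check (1024 bits in total), while three 512-bit members do not.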
569+
570+ A struct containing just one fixed-length vector, or a fixed-length vector
571+ array of length one, will be flattened as a single fixed-length vector argument
572+ if the size of the vector is less than or equal to 8×ABI_VLEN bits.
573+
574+ Structs with zero-length fixed-length vector arrays use the rule defined in the hardware
575+ floating-point calling convention, which means they do not consume a vector argument
576+ register in either C or {Cpp}.
577+
578+ A struct containing just one fixed-length vector array is passed as though it
579+ were a vector tuple type if the size of the base element for the array is less than
580+ or equal to 8×ABI_VLEN bits, and the size of the array is less than 8×ABI_VLEN
581+ bits.
582+ If there are not enough vector argument registers to pass the entire struct,
583+ it is passed by reference and is replaced in the argument list with the address.
584+ Otherwise, it uses the rule defined in the hardware floating-point
585+ calling convention.
586+
587+ Unions with fixed-length vectors are always passed according to the integer
588+ calling convention.
589+
590+ The details of vector argument register rules are the same as the standard
591+ vector calling convention variant.
592+
593+ NOTE: Functions that use the standard fixed-length vector calling convention
594+ variant must be marked with STO_RISCV_VARIANT_CC. See <<Dynamic Linking>>
595+ for the meaning of STO_RISCV_VARIANT_CC.
596+
597+ NOTE: Functions that use the standard fixed-length vector calling convention
598+ variant follow an additional name mangling rule for {Cpp}.
599+ For more details, see <<Name Mangling for Standard Calling Convention Variant>>.
600+
601+ [NOTE]
602+ ====
603+ When ABI_VLEN is smaller than the VLEN, the number of vector argument
604+ registers utilized remains unchanged. However, in such cases, values are only
605+ placed in a portion of these vector argument registers, corresponding to the
606+ size of ABI_VLEN. The remaining portion of the vector argument registers, which
607+ extends beyond the ABI_VLEN, will remain idle. This means that while the full
608+ capacity of the vector argument registers may not be used, the allocation of
609+ these registers does not change, ensuring consistency in register usage regardless
610+ of the ABI_VLEN to VLEN ratio.
611+
612+ Example: With ABI_VLEN at 32 bits and VLEN at 128 bits, consider passing an
613+ `int32x4_t` parameter (four 32-bit integers).
614+
615+ Allocation: Four vector argument registers are allocated for
616+ `int32x4_t`, based on LMUL=4.
617+
618+ Utilization: All four integers are placed in the first vector register,
619+ utilizing its full 128-bit capacity (VLEN), despite ABI_VLEN being 32 bits.
620+
621+ Remaining Registers: The other three allocated registers remain unused and idle.
622+
623+ .int32x4_t layout on different VLEN with ABI_VLEN at 32 bits:
624+ [cols="2,3,3,3,3"]
625+ [width=100%]
626+ |===
627+ | VLEN | v8 | v9 | v10 | v11
628+
629+ | 32 | a | b | c | d
630+ | 64 | a, b | c, d | -, - | -, -
631+ | 128 | a, b, c, d | -, -, -, - | -, -, -, - | -, -, -, -
632+ | 256 | a, b, c, d, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, -
633+ |===
634+
635+ .int64x8_t layout on different VLEN with ABI_VLEN at 128 bits:
636+ [cols="2,3,3,3,3"]
637+ [width=100%]
638+ |===
639+ | VLEN | v8 | v9 | v10 | v11
640+
641+ | 128 | a, b | c, d | e, f | g, h
642+ | 256 | a, b, c, d | e, f, g, h | -, -, -, - | -, -, -, -
643+ | 512 | a, b, c, d, e, f, g, h | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, -
644+ |===
645+
646+ `-` means that part is not used, and its value can be anything.
647+
648+ ====
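
The layouts in the tables above follow from packing elements contiguously across the allocated registers at the hardware VLEN. A minimal C sketch of that mapping is given below; the type `vls_slot` and the function `vls_element_slot` are hypothetical names, and the sketch assumes VLEN is at least as wide as one element.

```c
/* Position of one element within the allocated register group. */
typedef struct {
    unsigned reg;  /* 0 = first allocated register (e.g. v8), 1 = next, ... */
    unsigned lane; /* element position inside that register */
} vls_slot;

/*
 * Hypothetical helper: where element `elem_index` of a fixed-length
 * vector lands when elements of `elem_bits` bits are packed
 * contiguously into registers that are `vlen` bits wide.
 * Assumption: vlen >= elem_bits and both are powers of two.
 * Note that ABI_VLEN decides how many registers are allocated, while
 * the hardware VLEN decides how the elements fill them.
 */
static vls_slot vls_element_slot(unsigned elem_index, unsigned elem_bits,
                                 unsigned vlen)
{
    unsigned per_reg = vlen / elem_bits; /* elements per register */
    vls_slot s;

    s.reg = elem_index / per_reg;
    s.lane = elem_index % per_reg;
    return s;
}
```

For an `int32x4_t` with VLEN=64, element `c` (index 2) maps to the second register, matching the `v9` column in the first table; with VLEN=128 all four elements map to the first register.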
649+
650+ NOTE: In a single compilation unit, different functions may use different
651+ ABI_VLEN values. This means that ABI_VLEN is not uniform across the entire unit,
652+ allowing for function-specific optimization. However, this necessitates that
653+ users ensure consistency in ABI_VLEN between calling and called functions. It
654+ is the user's responsibility to verify that the ABI_VLEN matches on both sides
655+ of a function call to ensure correct operation and data handling.
656+
455657=== ILP32E Calling Convention
456658
457659IMPORTANT: RV32E is not a ratified base ISA and so we cannot guarantee the