Standard Fixed-length Vector Calling Convention Variant #418

kito-cheng · 2024-01-04T09:32:09Z

This proposal outlines a new variant of the calling convention specifically designed for fixed-length vectors. The primary aim of this variant is to facilitate the passing of fixed-length vectors through vector registers. This approach is derived from the standard vector calling convention: it uses the same register conventions and the same argument passing and return value rules.

A key aspect of this variant is the introduction of ABI_VLEN, which denotes the width of a vector register within this convention. The ABI_VLEN is constrained to be no wider than the ISA's VLEN (Vector Length), ensuring compatibility while allowing for flexibility in different implementations. This parameter can be configured via compiler command line options or through function attributes in source code.

The document recommends setting the default ABI_VLEN to 128 bits, acknowledging it as a common minimal requirement, while still allowing a lower ABI_VLEN (32 or 64 bits) to match the smaller VLENs permitted by the ISA. This flexibility is crucial for optimizing the utilization of longer VLENs in various cores.

The proposal specifies how fixed-length vector arguments are passed based on their size relative to ABI_VLEN. Vectors smaller than ABI_VLEN are passed in a single vector argument register, while larger vectors are passed in multiple registers, following the LMUL (Length Multiplier) pattern of 2, 4, or 8, depending on their size.
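As a rough illustration of the size rule above, the sketch below computes how many vector argument registers a fixed-length vector would occupy. The helper name `vls_reg_count` is hypothetical, and two details are assumptions not stated in this description: sizes are rounded up to the next power-of-two register count, and a vector that would need more than 8 registers is reported as 0 (not passed in vector registers).

```c
#include <assert.h>

/* Hypothetical helper: number of vector argument registers (the LMUL)
 * used for a fixed-length vector of `vec_bits` bits under `abi_vlen`.
 * Assumptions: round up to the next power of two, and return 0 when
 * more than 8 registers would be needed. */
static unsigned vls_reg_count(unsigned vec_bits, unsigned abi_vlen)
{
    /* ABI_VLEN is taken to be a power of two of at least 32 bits. */
    assert(abi_vlen >= 32 && (abi_vlen & (abi_vlen - 1)) == 0);

    /* Registers needed if each register contributes abi_vlen bits. */
    unsigned needed = vec_bits / abi_vlen + (vec_bits % abi_vlen != 0);

    /* Round up to 1, 2, 4 or 8; stop once the limit is exceeded. */
    unsigned lmul = 1;
    while (lmul < needed && lmul <= 8)
        lmul *= 2;

    return lmul <= 8 ? lmul : 0;
}
```

With ABI_VLEN=128, a 512-bit vector maps to 4 registers under these assumptions, which agrees with the `int64x8_t` example discussed later in this thread.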

Additionally, the proposal addresses the handling of structs and unions containing fixed-length vectors. Structs with members that are all fixed-length vectors follow the vector tuple type rules if they conform to size constraints. In contrast, unions with fixed-length vectors adhere to the integer calling convention.

kito-cheng · 2024-01-04T09:32:32Z

cc. @palmer-dabbelt @JeffreyALaw @preames @topperc @rofirrim @lhtin


sorear · 2024-01-09T12:38:24Z

riscv-cc.adoc

 Floating-point registers fs0-fs11 shall be preserved across procedure calls,
 provided they hold values no more than ABI_FLEN bits wide.

+=== Standard Fixed-length Vector Calling Convention Variant


The variant itself seems fine, modulo nits, but how are we planning to enable it?

If it's automatically used by -march=rva23 -mabi=ilp32d that will create major compatibility issues for binary distributions that use a fixed ABI and allow mixing packages at different architecture levels (either as an explicit user action, or as an implementation detail when rebuilding the distribution to change the architecture requirement).

If a new -mabi= value is required to enable use of the variant, it will be usable on closed systems where all packages are built at once, but not on binary distributions, since there is no expectation that binary code built with different -mabi= options is interoperable at all. This will include Debian and Alpine and might include Android and Fedora if their ABIs are finalized prior to the acceptance of this PR.

If it's enabled on a per-function basis using an attribute, or automatically for functions not visible across DSO boundaries, then it's effectively part of the definition of the attribute or a compiler implementation detail and may belong in riscv-c-api-doc or gccint, not here.

My expectation is that it should be enabled on a per-function basis by an attribute. I think that needs a riscv-c-api-doc PR, which I will send in the next few days.


kito-cheng · 2024-01-26T10:02:46Z

ChangeLog:

- Reorder the rules.
- Pass a struct as a tuple type in registers only when enough vector argument registers are available; otherwise pass it by reference.
- Add a NOTE describing what happens when ABI_VLEN is smaller than VLEN, with an example.
- Add a NOTE describing that different functions may use different ABI_VLEN values.

kito-cheng · 2024-01-29T08:49:02Z

ChangeLog:

- Add a rule for a single fixed-length vector or a fixed-length vector array of size 1.
- Add a rule for zero-length fixed-length arrays.
- Add an explicit rule for a fixed-length vector struct treated as a vector tuple type: pass by reference if there are not enough argument registers.
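The struct rules in these ChangeLogs can be sketched as a three-way classification. This is only an illustration: the names are hypothetical, the count of 16 vector argument registers (v8-v23) and the tuple constraint of `nfields * lmul <= 8` are assumptions taken from the standard vector calling convention and RVV tuple types, and register-group alignment is ignored.

```c
#include <assert.h>

/* Assumption: 16 vector argument registers (v8-v23). */
static const unsigned VLS_NUM_ARG_REGS = 16;

enum vls_struct_class {
    VLS_NOT_TUPLE,      /* does not meet the tuple size constraints */
    VLS_TUPLE_IN_REGS,  /* passed as a vector tuple in registers */
    VLS_BY_REFERENCE    /* tuple-like, but not enough registers left */
};

/* Hypothetical helper: classify a struct of `nfields` identical
 * fixed-length vectors, each occupying `lmul` registers, when
 * `regs_used` vector argument registers are already taken. */
static enum vls_struct_class
vls_classify_struct(unsigned nfields, unsigned lmul, unsigned regs_used)
{
    assert(regs_used <= VLS_NUM_ARG_REGS);

    /* Assumed tuple constraint: 1..8 fields and nfields * lmul <= 8. */
    if (nfields == 0 || nfields > 8 || lmul == 0 || lmul > 8 ||
        nfields * lmul > 8)
        return VLS_NOT_TUPLE;

    if (nfields * lmul <= VLS_NUM_ARG_REGS - regs_used)
        return VLS_TUPLE_IN_REGS;

    return VLS_BY_REFERENCE;
}
```

For example, a struct of two LMUL=4 vectors needs 8 registers, so under these assumptions it goes by reference once 12 of the 16 argument registers are in use.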

…n variant Fixed-length vectors are passed via general-purpose registers or memory in the current ABI design. We propose a standard fixed-length vector calling convention variant for passing fixed-length vectors via vector registers. This is the syntax part of the proposal; for further detail on the calling convention variant, see riscv-non-isa/riscv-elf-psabi-doc#418

kito-cheng · 2024-02-02T09:21:48Z

Proposal for function attribute syntax: riscv-non-isa/riscv-c-api-doc#68


This patch adds a function attribute `riscv_vls_cc` for the RISC-V VLS calling convention. It takes 0 or 1 argument; the argument is the `ABI_VLEN`, which is the `VLEN` used for passing fixed-length vector arguments. The convention wraps the argument as a scalable vector (VLA) using the `ABI_VLEN` and uses the corresponding mechanism to handle it. The range of `ABI_VLEN` is [32, 65536]; if not specified, the default value is 128.

Here is an example of VLS argument passing:

Non-VLS call:
```
void original_call(__attribute__((vector_size(16))) int arg) {}
=>
define void @original_call(i128 noundef %arg) {
entry:
  ...
  ret void
}
```

VLS call:
```
void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {}
=>
define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) {
entry:
  ...
  ret void
}
```

The first, non-VLS call passes the generic 16-byte vector argument as a flattened integer. In contrast, the VLS call uses `ABI_VLEN=256`, which wraps the vector as <vscale x 1 x i32>, where the number of scalable vector elements is calculated by `ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`.

Note: ORIG_ELTS = Vector Size / Type Size = 128 / 32 = 4.

PsABI PR: riscv-non-isa/riscv-elf-psabi-doc#418
C-API PR: riscv-non-isa/riscv-c-api-doc#68
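The element-count formula above can be checked with a few lines of C. This is a sketch rather than the patch's code: the helper name is hypothetical, and `RVV_BITS_PER_BLOCK` is assumed to be 64, as in LLVM's RISC-V backend. Cases where the division is not exact are rejected by an assertion instead of being modelled.

```c
#include <assert.h>

/* Assumption: one scalable-vector block is 64 bits wide. */
static const unsigned RVV_BITS_PER_BLOCK = 64;

/* Hypothetical helper: N in <vscale x N x iK> for a fixed-length vector
 * of `vec_bits` bits with `elt_bits`-bit elements, using the formula
 * ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN. */
static unsigned vls_scalable_elts(unsigned vec_bits, unsigned elt_bits,
                                  unsigned abi_vlen)
{
    assert(elt_bits != 0 && abi_vlen != 0 && vec_bits % elt_bits == 0);

    unsigned orig_elts = vec_bits / elt_bits;   /* e.g. 128 / 32 = 4 */

    /* Only exact divisions are covered by this sketch. */
    assert((orig_elts * RVV_BITS_PER_BLOCK) % abi_vlen == 0);

    return orig_elts * RVV_BITS_PER_BLOCK / abi_vlen;
}
```

For the example in the description, `vls_scalable_elts(128, 32, 256)` gives 4 * 64 / 256 = 1, matching `<vscale x 1 x i32>`.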

markschimmel · 2025-07-18T18:43:11Z

If you change anything regarding structs then how can it be specified per function? Perhaps leave anything related to structs out of the proposal.

topperc · 2025-07-18T23:38:25Z

> If you change anything regarding structs then how can it be specified per function? Perhaps leave anything related to structs out of the proposal.

It only changes how structs are passed and returned for that function. The attribute must be on the prototype so the caller will know the ABI for the call. What's your concern?
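A minimal sketch of what "the attribute must be on the prototype" looks like in C follows. The attribute spelling `riscv_vls_cc` comes from the patches linked in this thread; the `VLS_CC` macro, the `int32x4_t` typedef and `add4` are hypothetical names for illustration. The macro falls back to the default convention on compilers that do not know the attribute, and it is assumed that a RISC-V build would also need the V extension enabled.

```c
#include <assert.h>

/* Apply the attribute only where the compiler recognises it, so the
 * sketch still builds on toolchains without VLS calling convention
 * support. */
#if defined(__has_attribute)
#  if __has_attribute(riscv_vls_cc)
#    define VLS_CC(abi_vlen) __attribute__((riscv_vls_cc(abi_vlen)))
#  endif
#endif
#ifndef VLS_CC
#  define VLS_CC(abi_vlen) /* attribute unavailable: default convention */
#endif

/* GNU C vector extension: four 32-bit lanes, 128 bits in total. */
typedef int int32x4_t __attribute__((vector_size(16)));
static_assert(sizeof(int32x4_t) == 16, "int32x4_t must be 128 bits");

/* Prototype seen by callers: it carries the ABI_VLEN for the call. */
VLS_CC(128) int32x4_t add4(int32x4_t a, int32x4_t b);

/* The definition repeats the attribute so caller and callee agree. */
VLS_CC(128) int32x4_t add4(int32x4_t a, int32x4_t b)
{
    return a + b;   /* lane-wise addition */
}
```

A caller that only sees a prototype without the attribute would use the default convention, which is why the attribute belongs in the shared header.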

kito-cheng · 2025-08-14T13:37:22Z

The GCC PoC is here; I am going to submit it upstream for review in the next few days:
https://github.com/kito-cheng/gcc/tree/kitoc/vls-cc

kito-cheng · 2025-10-09T13:47:05Z

Proposed patchset for upstream GCC:
https://patchwork.sourceware.org/project/gcc/list/?series=52829

xypron · 2025-10-09T14:36:05Z

This suggestion seems to leave a lot of performance on the table for systems that have vlen >= ABI_VLEN.

A vlen=512 system which has a result r1,r2,r3,r4,r5,r6,r7,r8 in register v1 when calling would have to rearrange the data to

v1: r1, r2, 0, 0, 0, 0, 0, 0
v2: r3, r4, 0, 0, 0, 0, 0, 0
v3: r5, r6, 0, 0, 0, 0, 0, 0
v4: r7, r8, 0, 0, 0, 0, 0, 0

before calling a function and in the called function rearrange it to

v1: r1,r2,r3,r4,r5,r6,r7,r8

This rearrangement should be avoided.

Please leave the data in the compact form when calling into other functions.



xypron · 2025-10-13T09:46:20Z

riscv-cc.adoc

+| 64   | a, b                    | c, d                   | -, -                   | -, -
+| 128  | a, b, c, d              | -, -, -, -             | -, -, -, -             | -, -, -, -
+| 256  | a, b, c, d, -, -, -, -  | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, -
+|===


What would the example look like for 512 bits of values and vlen=256?
Adding an example for this would make it clear if you are only using 128 bits per register or the full vlen.

Added 834023c

NOTE: I chose int64x8_t instead of int32x16_t because int32x16_t would need to wrap in the PDF layout, which could cause misreading.

riscv-abi.pdf

xypron · 2025-10-13T13:39:01Z

riscv-cc.adoc

+====
+When ABI_VLEN is smaller than the VLEN, the number of vector argument
+registers utilized remains unchanged. However, in such cases, values are only
+placed in a portion of these vector argument registers, corresponding to the


I don't understand why you would only use a portion of the vector registers.
This will require rearranging data by the caller and the callee.
Instead you could leave vector registers unused which would be much more efficient.

I don't think that we should keep the current design.

The ABI_VLEN should simply provide the maximum of bits that can be exchanged between the caller and the callee via registers.

Arguments smaller than or equal to ABI_VLEN should be passed by up to e.g. 4 registers.
Arguments larger than ABI_VLEN should be passed by stack.

But data in vector registers should always be placed as compact as possible.

I am not sure why that would trigger data rearranging. Would you mind giving an example?

Here is a practical example, so that we can discuss a concrete case:

typedef signed long long __attribute__( ( vector_size( 64 ) ) ) int64x8_t;

__attribute__((riscv_vls_cc(128)))
int64x8_t foo (int64x8_t a, int64x8_t b);
// Return in v8-v11, since 512 bits use LMUL=4 and occupy 4 registers
// Pass a in v8-v11, since 512 bits use LMUL=4 and occupy 4 registers
// Pass b in v12-v15, since 512 bits use LMUL=4 and occupy 4 registers

// Compile with -march=rv64gcv_zvl512b
void bar() {
    // Assume a is assigned to v8
    int64x8_t a = {1, 2, 3, 4, 5, 6, 7, 8};
    // Assume b is assigned to v9
    int64x8_t b = {1, 2, 3, 4, 5, 6, 7, 8};
    // Pass a to foo. Although it occupies 4 registers according to ABI_VLEN,
    // we can still pass it without rearranging, so v9-v11 are left unset.
    // Move b to v12 due to the ABI requirement; in general this can be
    // optimized by the register allocator. v13-v15 are left unset.
    a = foo (a, b);
}

// Compile with -march=rv64gcv (VLEN=128)
void bar() {
    // Assume a is assigned to v8-v11
    int64x8_t a = {1, 2, 3, 4, 5, 6, 7, 8};
    // Assume b is assigned to v12-v15
    int64x8_t b = {1, 2, 3, 4, 5, 6, 7, 8};
    // Pass a to foo in v8-v11
    // Pass b to foo in v12-v15
    a = foo (a, b);
}

In foo, the vector operations will run with VL=4 and LMUL=4, which ensures the same result on machines with different VLEN.

godbolt for that: https://godbolt.org/z/cYhPToz7c
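The register usage in the example above can be sketched as follows, assuming elements are packed compactly starting from the first register of the group and that the hardware VLEN decides how many elements fit in each register. The helper names are hypothetical, and register indices are offsets from the first register of the group (for example v8).

```c
#include <assert.h>

struct vls_slot {
    unsigned reg;   /* offset from the first register of the group */
    unsigned lane;  /* element position inside that register */
};

/* Hypothetical helper: locate element `idx` of a fixed-length vector
 * with `sew`-bit elements on a machine whose registers are `vlen` bits. */
static struct vls_slot vls_locate(unsigned idx, unsigned sew, unsigned vlen)
{
    assert(sew != 0 && vlen >= sew && vlen % sew == 0);

    unsigned lanes_per_reg = vlen / sew;
    struct vls_slot s = { idx / lanes_per_reg, idx % lanes_per_reg };
    return s;
}

/* Registers of the group that actually hold data on this machine; the
 * remaining registers reserved by ABI_VLEN stay unused. */
static unsigned vls_regs_holding_data(unsigned vec_bits, unsigned vlen)
{
    assert(vlen != 0);
    return vec_bits / vlen + (vec_bits % vlen != 0);
}
```

For the 512-bit `int64x8_t` with ABI_VLEN=128, four registers are reserved in both builds, but only one holds data at VLEN=512 while all four do at VLEN=128.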

Looking at the assembler code everything looks fine.

"the number of vector argument registers utilized remains unchanged" is a bit misleading as in the table below you essentially indicate that some registers may remain unused "-,-,-,-," depending on the machine size.

kito-cheng requested review from asb, aswaterman, cmuellner and jrtc27 January 4, 2024 09:32

kito-cheng requested a review from lhtin January 4, 2024 09:57

topperc reviewed Jan 8, 2024


lhtin reviewed Jan 9, 2024


sorear reviewed Jan 9, 2024


kito-cheng force-pushed the fixed-length-vector-cc branch from b9f0bc2 to 2386a73 Compare January 26, 2024 10:02

kito-cheng mentioned this pull request Feb 2, 2024

Function attribute for standard fixed-length vector calling convention variant riscv-non-isa/riscv-c-api-doc#68

Open

4vtomat reviewed May 5, 2024


4vtomat mentioned this pull request Jul 24, 2024

[RISCV][VLS] Support RISCV VLS calling convention llvm/llvm-project#100346

Merged

lukel97 reviewed Jul 31, 2024


kito-cheng force-pushed the fixed-length-vector-cc branch from c84ebb1 to 7e9d68c Compare September 5, 2024 09:42

lukel97 reviewed Sep 5, 2024


kito-cheng mentioned this pull request Oct 30, 2024

Public Review: Need for whole register (unpredicated) load/stores to facilitate compilers load/store elimination riscv-non-isa/rvv-intrinsic-doc#378

Open

topperc reviewed Dec 2, 2024


kito-cheng and others added 18 commits October 13, 2025 14:18

Minor revision

10d6df0

- Add rule for single fixed-length vector or fixed-length vector array with size 1. - Add rule for zero-length fixed-length arrays. - Add explicitly rule for fixed-length vector struct as vector tuple type: pass by ref if no enough arg register.

Apply suggestions from code review

5c04563

Co-authored-by: Brandon Wu <[email protected]> Signed-off-by: Kito Cheng <[email protected]>

Name Mangling for Standard Calling Convention Variant

33e4283

Name mangling for standard fixed-length vector calling convention

d91d377

Tweak wording per Luke's suggesion

2a1488a

Minor tweak

f3b2982

More descprtion for ABI_VLEN

dfb9d27

Add note to vector type with unsupported element type

e898ce8

Add rule for non-power-of-2 vector

023b4ed

Update riscv-cc.adoc

7fb68b6

Signed-off-by: Kito Cheng <[email protected]>

Apply suggestions from Craig's review

47367a9

Co-authored-by: Craig Topper <[email protected]> Signed-off-by: Kito Cheng <[email protected]>

Address Craig's comment

7d62d64

Encoding VLEN to ABI tag name

5d6bd01

Fix NOTE layout

20478e4

Adding example for int32x4_t layout on different VLEN with ABI_VLEN a…

cf1fea5

…t 32 bits

Fix ref of Fixed-Length Vector

664c5d5

kito-cheng force-pushed the fixed-length-vector-cc branch from 7dbeb4b to 664c5d5 Compare October 13, 2025 06:36

fix typo in ABI Tag name for calling convention variants

b670db9

xypron reviewed Oct 13, 2025


kito-cheng added 2 commits October 13, 2025 20:42

Tidy up the table for int32x4_t layout

56bfc37

Adding more example layout for VLS CC

834023c

xypron suggested changes Oct 13, 2025
