Commit 09e34be

Merge pull request #2137 from joshtriplett/variadic
Support defining C-compatible variadic functions in Rust
2 parents aa80b68 + eb9c392 commit 09e34be

1 file changed

+265
-0
lines changed

text/2137-variadic.md

@@ -0,0 +1,265 @@
- Feature Name: variadic
- Start Date: 2017-08-21
- RFC PR: https://github.com/rust-lang/rfcs/pull/2137
- Rust Issue: https://github.com/rust-lang/rust/issues/44930

# Summary
[summary]: #summary

Support defining C-compatible variadic functions in Rust, via new intrinsics.
Rust currently supports declaring external variadic functions and calling them
from unsafe code, but does not support writing such functions directly in Rust.
Adding such support will allow Rust to replace a larger variety of C libraries,
avoid requiring C stubs and error-prone reimplementation of platform-specific
code, improve incremental translation of C codebases to Rust, and allow
implementation of variadic callbacks.

# Motivation
[motivation]: #motivation

Rust can currently call any possible C interface, and export *almost* any
interface for C to call. Variadic functions represent one of the last remaining
gaps in the latter. Currently, providing a variadic function callable from C
requires writing a stub function in C, linking that function into the Rust
program, and arranging for that stub to subsequently call into Rust.
Furthermore, even with the arguments packaged into a `va_list` structure by C
code, extracting arguments from that structure requires exceptionally
error-prone, platform-specific code, for which the crates.io ecosystem provides
only partial solutions for a few target architectures.

This RFC does not propose an interface intended for native Rust code to pass
variable numbers of arguments to a native Rust function, nor an interface that
provides any kind of type safety. This proposal exists primarily to allow Rust
to provide interfaces callable from C code.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

C code allows declaring a function callable with a variable number of
arguments, using an ellipsis (`...`) at the end of the argument list. For
compatibility, unsafe Rust code may export a function compatible with this
mechanism.

Such a declaration looks like this:

```rust
pub unsafe extern "C" fn func(arg: T, arg2: T2, mut args: ...) {
    // implementation
}
```

The use of `...` as the type of `args` at the end of the argument list declares
the function as variadic. This must appear as the last argument of the
function, and the function must have at least one argument before it. The
function must use `extern "C"`, and must use `unsafe`. To expose such a
function as a symbol for C code to call directly, the function may want to use
`#[no_mangle]` as well; however, Rust code may also pass the function to C code
expecting a function pointer to a variadic function.

The `args` named in the function declaration has the type
`core::intrinsics::VaList<'a>`, where the compiler supplies a lifetime `'a`
that prevents the arguments from outliving the variadic function.

To access the arguments, Rust provides the following public interfaces in
`core::intrinsics` (also available via `std::intrinsics`):

```rust
/// The argument list of a C-compatible variadic function, corresponding to the
/// underlying C `va_list`. Opaque.
pub struct VaList<'a> { /* fields omitted */ }

// Note: the lifetime on VaList is invariant
impl<'a> VaList<'a> {
    /// Extract the next argument from the argument list. T must have a type
    /// usable in an FFI interface.
    pub unsafe fn arg<T>(&mut self) -> T;

    /// Copy the argument list. Destroys the copy after the closure returns.
    pub fn copy<'ret, F, T>(&self, f: F) -> T
    where
        F: for<'copy> FnOnce(VaList<'copy>) -> T, T: 'ret;
}
```

The type returned from `VaList::arg` must be usable in an `extern "C"` FFI
interface; the compiler allows `VaList::arg` to return the same types that it
allows in the function signature of an `extern "C"` function.

All of the corresponding C integer and float types defined in the `libc` crate
are aliases for the underlying Rust types, so `VaList::arg` can also extract
those types.

Note that extracting an argument from a `VaList` follows the C rules for
argument passing and promotion. In particular, C code will promote any argument
smaller than a C `int` to an `int`, and promote `float` to `double`. Thus,
Rust's argument extractions for the corresponding types will extract an `int`
or `double` as appropriate, and convert appropriately.

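The promotion rule can be illustrated without any variadic machinery. The
following is a minimal sketch in stable Rust; the helper names (`promote_u8`,
`extract_u8`, `promote_f32`, `extract_f32`) are hypothetical and not part of
this RFC. It only models the widening that a C caller performs and the
narrowing conversion that extraction is described as performing, and it does
not touch a real `va_list`.

```rust
use std::os::raw::{c_double, c_int};

/// Models the C caller: a `uint8_t` variadic argument is widened to `int`.
/// Hypothetical helper, not part of the proposed API.
pub fn promote_u8(x: u8) -> c_int {
    c_int::from(x)
}

/// Models the callee: read the promoted `int`, then narrow to the type the
/// Rust code asked for.
pub fn extract_u8(promoted: c_int) -> u8 {
    promoted as u8
}

/// Models the C caller: a `float` variadic argument is widened to `double`.
pub fn promote_f32(x: f32) -> c_double {
    c_double::from(x)
}

/// Models the callee: read the promoted `double`, then narrow to `f32`.
pub fn extract_f32(promoted: c_double) -> f32 {
    promoted as f32
}
```

Under this model, a value that fits in the narrower type survives the round
trip unchanged, which is why the sample code later in this section can request
`u8` and `u16` even though the C caller passed `int`-sized values.
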

Like the underlying platform `va_list` structure in C, `VaList` has an opaque,
platform-specific representation.

A variadic function may pass the `VaList` to another function. However, the
lifetime attached to the `VaList` will prevent the variadic function from
returning the `VaList` or otherwise allowing it to outlive that call to the
variadic function. Similarly, the closure called by `copy` cannot return the
`VaList` passed to it or otherwise allow it to outlive the closure.

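The shape of `copy` can be tried out on stable Rust with a stand-in type. The
sketch below assumes a hypothetical `MockVaList` that walks a slice of `i64`
values; it is not the proposed `VaList` and performs no FFI. It only models
how a closure bound of the form `for<'copy> FnOnce(..) -> T` lets the closure
read from a copy while leaving the original cursor untouched.

```rust
/// Hypothetical stand-in for `VaList`: a cursor over already-collected values.
/// This is NOT the real type; it only models the borrowing and copying rules.
pub struct MockVaList<'a> {
    values: &'a [i64],
    pos: usize,
}

impl<'a> MockVaList<'a> {
    pub fn new(values: &'a [i64]) -> Self {
        MockVaList { values, pos: 0 }
    }

    /// Return the next value and advance the cursor (models `arg`).
    pub fn arg(&mut self) -> Option<i64> {
        let v = self.values.get(self.pos).copied();
        if v.is_some() {
            self.pos += 1;
        }
        v
    }

    /// Hand a copy of the cursor to `f`. The copy is dropped once `f`
    /// returns (models `copy`).
    pub fn copy<F, T>(&self, f: F) -> T
    where
        F: for<'copy> FnOnce(MockVaList<'copy>) -> T,
    {
        f(MockVaList { values: self.values, pos: self.pos })
    }
}

/// Sum the remaining values without consuming them from `list`.
pub fn peek_sum(list: &MockVaList<'_>) -> i64 {
    list.copy(|mut copy| {
        let mut total = 0;
        while let Some(v) = copy.arg() {
            total += v;
        }
        total
    })
}
```

Because `T` is chosen outside the `for<'copy>` binder, a closure that tries to
return its argument, such as `list.copy(|c| c)`, should be rejected by the
borrow checker. That is the same property the paragraph above relies on.
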

A function declared with `extern "C"` may accept a `VaList` parameter,
corresponding to a `va_list` parameter in the corresponding C function. For
instance, the `libc` crate could define the `va_list` variants of `printf` as
follows:

```rust
extern "C" {
    pub unsafe fn vprintf(format: *const c_char, ap: VaList) -> c_int;
    pub unsafe fn vfprintf(stream: *mut FILE, format: *const c_char, ap: VaList) -> c_int;
    pub unsafe fn vsprintf(s: *mut c_char, format: *const c_char, ap: VaList) -> c_int;
    pub unsafe fn vsnprintf(s: *mut c_char, n: size_t, format: *const c_char, ap: VaList) -> c_int;
}
```

Note that, per the C semantics, after passing `VaList` to these functions, the
caller can no longer use it, hence the use of the `VaList` type to take
ownership of the object. To continue using the object after a call to these
functions, use `VaList::copy` to pass a copy of it instead.

Conversely, an `unsafe extern "C"` function written in Rust may accept a
`VaList` parameter, to allow implementing the `v` variants of such functions in
Rust. Such a function must not specify the lifetime.

Defining a variadic function, or calling any of these new functions, requires a
feature-gate, `c_variadic`.

Sample Rust code exposing a variadic function:

```rust
#![feature(c_variadic)]

#[no_mangle]
pub unsafe extern "C" fn func(fixed: u32, mut args: ...) {
    let x: u8 = args.arg();
    let y: u16 = args.arg();
    let z: u32 = args.arg();
    println!("{} {} {} {}", fixed, x, y, z);
}
```

Sample C code calling that function:

```c
#include <stdint.h>

void func(uint32_t fixed, ...);

int main(void)
{
    uint8_t x = 10;
    uint16_t y = 15;
    uint32_t z = 20;
    func(5, x, y, z);
    return 0;
}
```

Compiling and linking these two together will produce a program that prints:

```text
5 10 15 20
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

LLVM already provides a set of intrinsics, implementing `va_start`, `va_arg`,
`va_end`, and `va_copy`. The compiler will insert a call to the `va_start`
intrinsic at the start of the function to provide the `VaList` argument (if
used), and a matching call to the `va_end` intrinsic on any exit from the
function. The implementation of `VaList::arg` will call `va_arg`. The
implementation of `VaList::copy` will call `va_copy`, and then `va_end` after
the closure exits.

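The ordering of those calls can be made concrete with a small simulation. The
sketch below is an assumption-laden model, not compiler output: `Trace` and
`TracedVaList` are hypothetical types that record the name of the intrinsic
each step would correspond to, with `Drop` standing in for the `va_end` that
the compiler would insert on exit.

```rust
use std::cell::RefCell;

/// Hypothetical event log; records which intrinsic each step stands for.
pub struct Trace {
    events: RefCell<Vec<&'static str>>,
}

impl Trace {
    pub fn new() -> Self {
        Trace { events: RefCell::new(Vec::new()) }
    }

    fn log(&self, name: &'static str) {
        self.events.borrow_mut().push(name);
    }

    pub fn events(&self) -> Vec<&'static str> {
        self.events.borrow().clone()
    }
}

/// Hypothetical stand-in for a live `va_list`. It holds no arguments; it
/// only logs the lifecycle described in the text.
pub struct TracedVaList<'t> {
    trace: &'t Trace,
}

impl<'t> TracedVaList<'t> {
    /// Models the `va_start` inserted at function entry.
    pub fn start(trace: &'t Trace) -> Self {
        trace.log("va_start");
        TracedVaList { trace }
    }

    /// Models `VaList::arg` calling `va_arg`.
    pub fn arg(&mut self) {
        self.trace.log("va_arg");
    }

    /// Models `VaList::copy`: `va_copy`, run the closure, then `va_end`
    /// when the copy is dropped.
    pub fn copy<F, T>(&self, f: F) -> T
    where
        F: for<'c> FnOnce(TracedVaList<'c>) -> T,
    {
        self.trace.log("va_copy");
        f(TracedVaList { trace: self.trace })
    }
}

impl Drop for TracedVaList<'_> {
    /// Models the matching `va_end` on any exit path.
    fn drop(&mut self) {
        self.trace.log("va_end");
    }
}

/// Models the body of a variadic function: read one argument, then read one
/// more through a copy.
pub fn variadic_body(trace: &Trace) {
    let mut args = TracedVaList::start(trace);
    args.arg();
    args.copy(|mut copy| copy.arg());
}
```

Running `variadic_body` against a fresh `Trace` records `va_start`, `va_arg`,
`va_copy`, `va_arg`, `va_end` (for the copy), and a final `va_end` (for the
original list), matching the pairing the paragraph above describes.
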

`VaList` may become a language item (`#[lang="VaList"]`) to attach the
appropriate compiler handling.

The compiler may need to handle the type `VaList` specially, in order to
provide the desired parameter-passing semantics at FFI boundaries. In
particular, some platforms define `va_list` as a single-element array, such
that declaring a `va_list` allocates storage, but passing a `va_list` as a
function parameter occurs by pointer. The compiler must arrange to handle both
receiving and passing `VaList` parameters in a manner compatible with the C
ABI.

The C standard requires that the call to `va_end` for a `va_list` occur in the
same function as the matching `va_start` or `va_copy` for that `va_list`. Some
C implementations do not enforce this requirement, allowing for functions that
call `va_end` on a passed-in `va_list` that they did not create. This RFC does
not define a means of implementing or calling non-standard functions like these.

Note that on some platforms, these LLVM intrinsics do not fully implement the
necessary functionality, expecting the invoker of the intrinsic to provide
additional LLVM IR code. On such platforms, rustc will need to provide the
appropriate additional code, just as clang does.

This RFC intentionally does not specify or expose the mechanism used to limit
the use of `VaList::arg` only to specific types. The compiler should provide
errors similar to those associated with passing types through FFI function
calls.

# Drawbacks
[drawbacks]: #drawbacks

This feature is highly unsafe, and requires carefully written code to extract
the appropriate argument types provided by the caller, based on whatever
arbitrary runtime information determines those types. However, in this regard,
this feature provides no more unsafety than the equivalent C code, and in fact
provides several additional safety mechanisms, such as automatic handling of
type promotions, lifetimes, copies, and cleanup.

# Rationale and Alternatives
[alternatives]: #alternatives

This represents one of the few C-compatible interfaces that Rust does not
provide. Currently, Rust code wishing to interoperate with C has no alternative
to this mechanism, other than hand-written C stubs. This also limits the
ability to incrementally translate C to Rust, or to bind to C interfaces that
expect variadic callbacks.

Rather than having the compiler invent an appropriate lifetime parameter, we
could simply require the unsafe code implementing a variadic function to avoid
ever allowing the `VaList` structure to outlive it. However, if we can provide
an appropriate compile-time lifetime check, doing so would make it easier to
correctly write the appropriate unsafe code.

Rather than naming the argument in the variadic function signature, we could
provide a `VaList::start` function to return one. This would also allow calling
`start` more than once. However, this would complicate the lifetime handling
required to ensure that the `VaList` does not outlive the call to the variadic
function.

We could use several alternative syntaxes to declare the argument in the
signature, including `...args`, or listing the `VaList` or `VaList<'a>` type
explicitly. The latter, however, would require care to ensure that code could
not reference or alias the lifetime.

# Unresolved questions
[unresolved]: #unresolved-questions

When implementing this feature, we will need to determine whether the compiler
can provide an appropriate lifetime that prevents a `VaList` from outliving its
corresponding variadic function.

Currently, Rust does not allow passing a closure to C code expecting a pointer
to an `extern "C"` function. If this becomes possible in the future, then
variadic closures would become useful, and we should add them at that time.

This RFC only supports the platform's native `"C"` ABI, not any other ABI. Code
may wish to define variadic functions for another ABI, and potentially more
than one such ABI in the same program. However, such support should not
complicate the common case. LLVM has extremely limited support for this, for
only a specific pair of platforms (supporting the Windows ABI on platforms that
use the System V ABI), with no generalized support in the underlying
intrinsics. The LLVM intrinsics only support using the ABI of the containing
function. Given the current state of the ecosystem, this RFC only proposes
supporting the native `"C"` ABI for now. Doing so will not prevent the
introduction of support for non-native ABIs in the future.
