| 1 | +- Feature Name: variadic |
| 2 | +- Start Date: 2017-08-21 |
| 3 | +- RFC PR: https://github.com/rust-lang/rfcs/pull/2137 |
| 4 | +- Rust Issue: https://github.com/rust-lang/rust/issues/44930 |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Support defining C-compatible variadic functions in Rust, via new intrinsics. |
| 10 | +Rust currently supports declaring external variadic functions and calling them |
| 11 | +from unsafe code, but does not support writing such functions directly in Rust. |
| 12 | +Adding such support will allow Rust to replace a larger variety of C libraries, |
| 13 | +avoid requiring C stubs and error-prone reimplementation of platform-specific |
| 14 | +code, improve incremental translation of C codebases to Rust, and allow |
| 15 | +implementation of variadic callbacks. |
| 16 | + |
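For reference, the existing capability mentioned in the summary (declaring an external variadic function and calling it from unsafe code) can be sketched on stable Rust as follows. This is a minimal illustration, assuming a Unix-like target where the C library's `snprintf` is linked by default; `format_pair` is a hypothetical helper introduced only for this example.

```rust
use std::ffi::{c_char, c_int, CStr};

// Declaration of the C library's variadic `snprintf`. The `unsafe extern`
// spelling needs Rust 1.82 or later; older toolchains write `extern "C"`.
// `usize` stands in for C's `size_t` here (an assumption that holds on
// mainstream targets).
unsafe extern "C" {
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Formats two integers through C's `snprintf` (hypothetical helper).
pub fn format_pair(a: i32, b: i32) -> String {
    let mut buf = [0u8; 64];
    let format = b"%d-%d\0";
    // SAFETY: `format` is NUL-terminated, `buf` is writable for `buf.len()`
    // bytes, and the two `c_int` arguments match the two `%d` conversions.
    let written = unsafe {
        snprintf(
            buf.as_mut_ptr() as *mut c_char,
            buf.len(),
            format.as_ptr() as *const c_char,
            a as c_int,
            b as c_int,
        )
    };
    assert!(written >= 0, "snprintf reported an error");
    CStr::from_bytes_until_nul(&buf)
        .expect("snprintf NUL-terminates a non-empty buffer")
        .to_string_lossy()
        .into_owned()
}
```

Nothing checks that the arguments after the format string match the conversions it names, which is why the call sits in an `unsafe` block. The opposite direction, defining such a function in Rust, is the gap this RFC addresses.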
| 17 | +# Motivation |
| 18 | +[motivation]: #motivation |
| 19 | + |
| 20 | +Rust can currently call any possible C interface, and export *almost* any |
| 21 | +interface for C to call. Variadic functions represent one of the last remaining |
| 22 | +gaps in the latter. Currently, providing a variadic function callable from C |
| 23 | +requires writing a stub function in C, linking that function into the Rust |
| 24 | +program, and arranging for that stub to subsequently call into Rust. |
| 25 | +Furthermore, even with the arguments packaged into a `va_list` structure by C |
| 26 | +code, extracting arguments from that structure requires exceptionally |
| 27 | +error-prone, platform-specific code, for which the crates.io ecosystem provides |
| 28 | +only partial solutions for a few target architectures. |
| 29 | + |
| 30 | +This RFC does not propose an interface intended for native Rust code to pass |
| 31 | +variable numbers of arguments to a native Rust function, nor an interface that |
| 32 | +provides any kind of type safety. This proposal exists primarily to allow Rust |
| 33 | +to provide interfaces callable from C code. |
| 34 | + |
| 35 | +# Guide-level explanation |
| 36 | +[guide-level-explanation]: #guide-level-explanation |
| 37 | + |
| 38 | +C code allows declaring a function callable with a variable number of |
| 39 | +arguments, using an ellipsis (`...`) at the end of the argument list. For |
| 40 | +compatibility, unsafe Rust code may export a function compatible with this |
| 41 | +mechanism. |
| 42 | + |
| 43 | +Such a declaration looks like this: |
| 44 | + |
| 45 | +```rust |
| 46 | +pub unsafe extern "C" fn func(arg: T, arg2: T2, mut args: ...) { |
| 47 | + // implementation |
| 48 | +} |
| 49 | +``` |
| 50 | + |
| 51 | +The use of `...` as the type of `args` at the end of the argument list declares |
| 52 | +the function as variadic. This must appear as the last argument of the |
| 53 | +function, and the function must have at least one argument before it. The |
| 54 | +function must use `extern "C"`, and must use `unsafe`. To expose such a |
| 55 | +function as a symbol for C code to call directly, the function may want to use |
| 56 | +`#[no_mangle]` as well; however, Rust code may also pass the function to C code |
| 57 | +expecting a function pointer to a variadic function. |
| 58 | + |
| 59 | +The `args` named in the function declaration has the type |
| 60 | +`core::intrinsics::VaList<'a>`, where the compiler supplies a lifetime `'a` |
| 61 | +that prevents the arguments from outliving the variadic function. |
| 62 | + |
| 63 | +To access the arguments, Rust provides the following public interfaces in |
| 64 | +`core::intrinsics` (also available via `std::intrinsics`): |
| 65 | + |
| 66 | +```rust |
| 67 | +/// The argument list of a C-compatible variadic function, corresponding to the |
| 68 | +/// underlying C `va_list`. Opaque. |
| 69 | +pub struct VaList<'a> { /* fields omitted */ } |
| 70 | + |
| 71 | +// Note: the lifetime on VaList is invariant |
| 72 | +impl<'a> VaList<'a> { |
| 73 | + /// Extract the next argument from the argument list. T must have a type |
| 74 | + /// usable in an FFI interface. |
| 75 | + pub unsafe fn arg<T>(&mut self) -> T; |
| 76 | + |
| 77 | + /// Copy the argument list. Destroys the copy after the closure returns. |
| 78 | + pub fn copy<'ret, F, T>(&self, F) -> T |
| 79 | + where |
| 80 | + F: for<'copy> FnOnce(VaList<'copy>) -> T, T: 'ret; |
| 81 | +} |
| 82 | +``` |
| 83 | + |
| 84 | +The type returned from `VaList::arg` must be usable in an `extern "C"` FFI |
| 85 | +interface: the compiler accepts for `VaList::arg` exactly the types that it |
| 86 | +accepts in the function signature of an `extern "C"` |
| 87 | +function. |
| 88 | + |
| 89 | +The C integer and float types defined in the `libc` crate are aliases |
| 90 | +for the underlying Rust types, so `VaList::arg` can also |
| 91 | +extract those types. |
| 92 | + |
| 93 | +Note that extracting an argument from a `VaList` follows the C rules for |
| 94 | +argument passing and promotion. In particular, C code will promote any argument |
| 95 | +smaller than a C `int` to an `int`, and promote `float` to `double`. Thus, |
| 96 | +Rust's argument extractions for the corresponding types will extract an `int` |
| 97 | +or `double` as appropriate, and convert appropriately. |
| 98 | + |
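The promotion rules can be modeled in plain Rust without any variadic machinery. The sketch below is only an illustration of the arithmetic involved: the function names are hypothetical, and the real conversion would happen inside `VaList::arg`, not in user code.

```rust
use std::ffi::{c_double, c_int};

// Caller side: what a C compiler does to small variadic arguments.

/// A `uint8_t` argument is widened to `int` before it is passed.
pub fn promote_u8(x: u8) -> c_int {
    c_int::from(x)
}

/// An `int16_t` argument is widened to `int`, preserving its sign.
pub fn promote_i16(x: i16) -> c_int {
    c_int::from(x)
}

/// A `float` argument is widened to `double`.
pub fn promote_f32(x: f32) -> c_double {
    c_double::from(x)
}

// Receiver side: read the promoted slot, then convert back to the type the
// caller started from.

/// Narrows an `int` slot back to `u8` by truncation.
pub fn narrow_to_u8(slot: c_int) -> u8 {
    slot as u8
}

/// Narrows a `double` slot back to `f32`.
pub fn narrow_to_f32(slot: c_double) -> f32 {
    slot as f32
}
```

A round trip such as `narrow_to_u8(promote_u8(200))` returns the original value. The narrowing step silently truncates if the caller actually passed something wider, which is one reason the extraction interface is `unsafe`.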
| 99 | +Like the underlying platform `va_list` structure in C, `VaList` has an opaque, |
| 100 | +platform-specific representation. |
| 101 | + |
| 102 | +A variadic function may pass the `VaList` to another function. However, the |
| 103 | +lifetime attached to the `VaList` will prevent the variadic function from |
| 104 | +returning the `VaList` or otherwise allowing it to outlive that call to the |
| 105 | +variadic function. Similarly, the closure called by `copy` cannot return the |
| 106 | +`VaList` passed to it or otherwise allow it to outlive the closure. |
| 107 | + |
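The lifetime restriction on `copy` can be illustrated with an ordinary borrowed cursor. The following is a rough model on stable Rust, not the proposed implementation: `ArgCursor`, `sum_remaining`, and `peek_then_take` are hypothetical names, the cursor walks a slice of `i64` instead of a real `va_list`, and unlike `VaList` its lifetime is not invariant.

```rust
/// Hypothetical stand-in for `VaList<'a>`: a cursor over a borrowed slice.
pub struct ArgCursor<'a> {
    remaining: &'a [i64],
}

impl<'a> ArgCursor<'a> {
    pub fn new(args: &'a [i64]) -> Self {
        ArgCursor { remaining: args }
    }

    /// Takes the next argument and advances the cursor (models `VaList::arg`).
    pub fn arg(&mut self) -> Option<i64> {
        let (first, rest) = self.remaining.split_first()?;
        self.remaining = rest;
        Some(*first)
    }

    /// Hands a copy of the cursor to `f` (models `VaList::copy`). Because `T`
    /// is declared outside the `for<'copy>` binder, it cannot mention
    /// `'copy`, so the closure cannot return the copy it receives.
    pub fn copy<F, T>(&self, f: F) -> T
    where
        F: for<'copy> FnOnce(ArgCursor<'copy>) -> T,
    {
        f(ArgCursor { remaining: self.remaining })
    }
}

/// Consumes every remaining argument and returns the sum.
pub fn sum_remaining(mut cursor: ArgCursor<'_>) -> i64 {
    let mut total = 0;
    while let Some(value) = cursor.arg() {
        total += value;
    }
    total
}

/// Sums all arguments through a copy, then shows that the original cursor
/// has not advanced.
pub fn peek_then_take(args: &[i64]) -> (i64, Option<i64>) {
    let mut cursor = ArgCursor::new(args);
    let total = cursor.copy(|copy| sum_remaining(copy));
    (total, cursor.arg())
}
```

Changing the closure to `|copy| copy` is rejected by the compiler, because no single return type `T` is valid for every choice of `'copy`. The signature of `VaList::copy` uses the same shape.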
| 108 | +A function declared with `extern "C"` may accept a `VaList` parameter, |
| 109 | +corresponding to a `va_list` parameter in the corresponding C function. For |
| 110 | +instance, the `libc` crate could define the `va_list` variants of `printf` as |
| 111 | +follows: |
| 112 | + |
| 113 | +```rust |
| 114 | +extern "C" { |
| 115 | + pub fn vprintf(format: *const c_char, ap: VaList) -> c_int; |
| 116 | + pub fn vfprintf(stream: *mut FILE, format: *const c_char, ap: VaList) -> c_int; |
| 117 | + pub fn vsprintf(s: *mut c_char, format: *const c_char, ap: VaList) -> c_int; |
| 118 | + pub fn vsnprintf(s: *mut c_char, n: size_t, format: *const c_char, ap: VaList) -> c_int; |
| 119 | +} |
| 120 | +``` |
| 121 | + |
| 122 | +Note that, per the C semantics, after passing `VaList` to these functions, the |
| 123 | +caller can no longer use it, hence the use of the `VaList` type to take |
| 124 | +ownership of the object. To continue using the object after a call to these |
| 125 | +functions, use `VaList::copy` to pass a copy of it instead. |
| 126 | + |
| 127 | +Conversely, an `unsafe extern "C"` function written in Rust may accept a |
| 128 | +`VaList` parameter, to allow implementing the `v` variants of such functions in |
| 129 | +Rust. Such a function must not specify the lifetime. |
| 130 | + |
| 131 | +Defining a variadic function, or calling any of these new functions, requires a |
| 132 | +feature-gate, `c_variadic`. |
| 133 | + |
| 134 | +Sample Rust code exposing a variadic function: |
| 135 | + |
| 136 | +```rust |
| 137 | +#![feature(c_variadic)] |
| 138 | + |
| 139 | +#[no_mangle] |
| 140 | +pub unsafe extern "C" fn func(fixed: u32, mut args: ...) { |
| 141 | + let x: u8 = args.arg(); |
| 142 | + let y: u16 = args.arg(); |
| 143 | + let z: u32 = args.arg(); |
| 144 | + println!("{} {} {} {}", fixed, x, y, z); |
| 145 | +} |
| 146 | +``` |
| 147 | + |
| 148 | +Sample C code calling that function: |
| 149 | + |
| 150 | +```c |
| 151 | +#include <stdint.h> |
| 152 | + |
| 153 | +void func(uint32_t fixed, ...); |
| 154 | + |
| 155 | +int main(void) |
| 156 | +{ |
| 157 | + uint8_t x = 10; |
| 158 | + uint16_t y = 15; |
| 159 | + uint32_t z = 20; |
| 160 | + func(5, x, y, z); |
| 161 | + return 0; |
| 162 | +} |
| 163 | +``` |
| 164 | + |
| 165 | +Compiling and linking these two together will produce a program that prints: |
| 166 | + |
| 167 | +```text |
| 168 | +5 10 15 20 |
| 169 | +``` |
| 170 | + |
| 171 | +# Reference-level explanation |
| 172 | +[reference-level-explanation]: #reference-level-explanation |
| 173 | + |
| 174 | +LLVM already provides a set of intrinsics, implementing `va_start`, `va_arg`, |
| 175 | +`va_end`, and `va_copy`. The compiler will insert a call to the `va_start` |
| 176 | +intrinsic at the start of the function to provide the `VaList` argument (if |
| 177 | +used), and a matching call to the `va_end` intrinsic on any exit from the |
| 178 | +function. The implementation of `VaList::arg` will call `va_arg`. The |
| 179 | +implementation of `VaList::copy` will call `va_copy`, and then `va_end` after |
| 180 | +the closure exits. |
| 181 | + |
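The pairing of `va_start` or `va_copy` with `va_end` on every exit path resembles an RAII guard. The sketch below models that pairing with a `Drop` implementation that records events in a log. It is an analogy under stated assumptions, not the compiler's mechanism: `Trace`, `Session`, and `early_exit` are hypothetical names, and the logged strings merely stand in for calls to the LLVM intrinsics.

```rust
use std::cell::RefCell;

/// Hypothetical event log standing in for calls to the LLVM intrinsics.
pub type Trace = RefCell<Vec<&'static str>>;

/// Models one live `va_list`: creating it logs the start, dropping it logs
/// the matching `va_end`.
pub struct Session<'t> {
    trace: &'t Trace,
}

impl<'t> Session<'t> {
    pub fn start(trace: &'t Trace) -> Self {
        trace.borrow_mut().push("va_start");
        Session { trace }
    }

    pub fn arg(&mut self) {
        self.trace.borrow_mut().push("va_arg");
    }

    /// Logs `va_copy`, runs the closure on the copy, and ends the copy only
    /// after the closure has returned.
    pub fn copy<T>(&self, f: impl FnOnce(&mut Session<'t>) -> T) -> T {
        self.trace.borrow_mut().push("va_copy");
        let mut copy = Session { trace: self.trace };
        let result = f(&mut copy);
        drop(copy); // logs "va_end" for the copy
        result
    }
}

impl Drop for Session<'_> {
    fn drop(&mut self) {
        self.trace.borrow_mut().push("va_end");
    }
}

/// Models a variadic function body with an early return.
pub fn early_exit(trace: &Trace, bail: bool) {
    let mut args = Session::start(trace);
    if bail {
        return; // `args` is dropped here, so "va_end" is still logged
    }
    args.arg();
    args.copy(|copy| copy.arg());
}
```

In this model every `va_start` or `va_copy` entry in the log is followed by exactly one `va_end` entry, whichever path the function takes.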
| 182 | +`VaList` may become a language item (`#[lang="VaList"]`) to attach the |
| 183 | +appropriate compiler handling. |
| 184 | + |
| 185 | +The compiler may need to handle the type `VaList` specially, in order to |
| 186 | +provide the desired parameter-passing semantics at FFI boundaries. In |
| 187 | +particular, some platforms define `va_list` as a single-element array, such |
| 188 | +that declaring a `va_list` allocates storage, but passing a `va_list` as a |
| 189 | +function parameter occurs by pointer. The compiler must arrange to handle both |
| 190 | +receiving and passing `VaList` parameters in a manner compatible with the C |
| 191 | +ABI. |
| 192 | + |
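As one concrete instance of the single-element-array case, the x86-64 System V ABI defines `va_list` as an array of one four-field structure. The sketch below mirrors that layout in Rust to show why a callee that receives a `va_list` parameter advances the caller's storage. The type and function names are hypothetical, the field values are placeholders, and none of this reflects how `VaList` itself would be defined.

```rust
use std::ffi::c_void;

/// Mirrors the x86-64 System V `va_list` element; other platforms differ.
#[repr(C)]
pub struct VaListTag {
    pub gp_offset: u32,
    pub fp_offset: u32,
    pub overflow_arg_area: *mut c_void,
    pub reg_save_area: *mut c_void,
}

/// Declaring a `va_list` on this platform allocates one element.
pub type VaListStorage = [VaListTag; 1];

/// A C function taking `va_list ap` receives a pointer to the caller's
/// element, because array parameters decay to pointers. This stand-in
/// "consumes" one general-purpose register slot of 8 bytes.
///
/// # Safety
/// `ap` must point to a valid, writable `VaListTag`.
pub unsafe fn consume_gp_slot(ap: *mut VaListTag) {
    unsafe { (*ap).gp_offset += 8 };
}

/// Shows that the callee's progress is visible in the caller's storage.
pub fn caller_sees_advance() -> u32 {
    let mut storage: VaListStorage = [VaListTag {
        gp_offset: 0,
        fp_offset: 48, // placeholder starting value
        overflow_arg_area: std::ptr::null_mut(),
        reg_save_area: std::ptr::null_mut(),
    }];
    // SAFETY: the pointer refers to the live, writable element of `storage`.
    unsafe { consume_gp_slot(storage.as_mut_ptr()) };
    storage[0].gp_offset
}
```

On platforms where `va_list` is a plain pointer or a structure passed by value, the callee works on its own copy, so the compiler has to choose the parameter-passing strategy per target.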
| 193 | +The C standard requires that the call to `va_end` for a `va_list` occur in the |
| 194 | +same function as the matching `va_start` or `va_copy` for that `va_list`. Some |
| 195 | +C implementations do not enforce this requirement, allowing for functions that |
| 196 | +call `va_end` on a passed-in `va_list` that they did not create. This RFC does |
| 197 | +not define a means of implementing or calling non-standard functions like these. |
| 198 | + |
| 199 | +Note that on some platforms, these LLVM intrinsics do not fully implement the |
| 200 | +necessary functionality, expecting the invoker of the intrinsic to provide |
| 201 | +additional LLVM IR code. On such platforms, rustc will need to provide the |
| 202 | +appropriate additional code, just as clang does. |
| 203 | + |
| 204 | +This RFC intentionally does not specify or expose the mechanism used to limit |
| 205 | +the use of `VaList::arg` only to specific types. The compiler should provide |
| 206 | +errors similar to those associated with passing types through FFI function |
| 207 | +calls. |
| 208 | + |
| 209 | +# Drawbacks |
| 210 | +[drawbacks]: #drawbacks |
| 211 | + |
| 212 | +This feature is highly unsafe, and requires carefully written code to extract |
| 213 | +the appropriate argument types provided by the caller, based on whatever |
| 214 | +arbitrary runtime information determines those types. However, in this regard, |
| 215 | +this feature provides no more unsafety than the equivalent C code, and in fact |
| 216 | +provides several additional safety mechanisms, such as automatic handling of |
| 217 | +type promotions, lifetimes, copies, and cleanup. |
| 218 | + |
| 219 | +# Rationale and Alternatives |
| 220 | +[alternatives]: #alternatives |
| 221 | + |
| 222 | +This represents one of the few C-compatible interfaces that Rust does not |
| 223 | +provide. Currently, Rust code wishing to interoperate with C has no alternative |
| 224 | +to this mechanism, other than hand-written C stubs. This also limits the |
| 225 | +ability to incrementally translate C to Rust, or to bind to C interfaces that |
| 226 | +expect variadic callbacks. |
| 227 | + |
| 228 | +Rather than having the compiler invent an appropriate lifetime parameter, we |
| 229 | +could simply require the unsafe code implementing a variadic function to avoid |
| 230 | +ever allowing the `VaList` structure to outlive it. However, if we can provide |
| 231 | +an appropriate compile-time lifetime check, doing so would make it easier to |
| 232 | +correctly write the appropriate unsafe code. |
| 233 | + |
| 234 | +Rather than naming the argument in the variadic function signature, we could |
| 235 | +provide a `VaList::start` function to return one. This would also allow calling |
| 236 | +`start` more than once. However, this would complicate the lifetime handling |
| 237 | +required to ensure that the `VaList` does not outlive the call to the variadic |
| 238 | +function. |
| 239 | + |
| 240 | +We could use several alternative syntaxes to declare the argument in the |
| 241 | +signature, including `...args`, or listing the `VaList` or `VaList<'a>` type |
| 242 | +explicitly. The latter, however, would require care to ensure that code could |
| 243 | +not reference or alias the lifetime. |
| 244 | + |
| 245 | +# Unresolved questions |
| 246 | +[unresolved]: #unresolved-questions |
| 247 | + |
| 248 | +When implementing this feature, we will need to determine whether the compiler |
| 249 | +can provide an appropriate lifetime that prevents a `VaList` from outliving its |
| 250 | +corresponding variadic function. |
| 251 | + |
| 252 | +Currently, Rust does not allow passing a closure to C code expecting a pointer |
| 253 | +to an `extern "C"` function. If this becomes possible in the future, then |
| 254 | +variadic closures would become useful, and we should add them at that time. |
| 255 | + |
| 256 | +This RFC only supports the platform's native `"C"` ABI, not any other ABI. Code |
| 257 | +may wish to define variadic functions for another ABI, and potentially more |
| 258 | +than one such ABI in the same program. However, such support should not |
| 259 | +complicate the common case. LLVM has extremely limited support for this, for |
| 260 | +only a specific pair of platforms (supporting the Windows ABI on platforms that |
| 261 | +use the System V ABI), with no generalized support in the underlying |
| 262 | +intrinsics. The LLVM intrinsics only support using the ABI of the containing |
| 263 | +function. Given the current state of the ecosystem, this RFC only proposes |
| 264 | +supporting the native `"C"` ABI for now. Doing so will not prevent the |
| 265 | +introduction of support for non-native ABIs in the future. |