| 1 | +- Feature Name: N/A |
| 2 | +- Start Date: 2015-12-18 |
| 3 | +- RFC PR: (leave this empty) |
| 4 | +- Rust Issue: (leave this empty) |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Deprecate type aliases and structs in `std::os::$platform::raw` in favor of |
| 10 | +trait-based accessors which return Rust types rather than the equivalent C type |
| 11 | +aliases. |
| 12 | + |
| 13 | +# Motivation |
| 14 | +[motivation]: #motivation |
| 15 | + |
| 16 | +[RFC 517][io-reform] set forth a vision for the `raw` modules in the standard |
| 17 | +library to perform lowering operations on various Rust types to their platform |
| 18 | +equivalents. For example the `fs::Metadata` structure can be lowered to the |
| 19 | +underlying `sys::stat` structure. The rationale for this was to enable building |
| 20 | +abstractions externally from the standard library by exposing all of the |
| 21 | +underlying data that is obtained from the OS. |
| 22 | + |
| 23 | +[io-reform]: https://github.com/rust-lang/rfcs/blob/master/text/0517-io-os-reform.md |
| 24 | + |
| 25 | +This strategy, however, runs into a few problems: |
| 26 | + |
| 27 | +* For some libc structures, such as `stat`, there's not actually one canonical |
| 28 | + definition. For example on 32-bit Linux the definition of `stat` will change |
| 29 | + depending on whether [LFS][lfs] is enabled (via the `-D_FILE_OFFSET_BITS` |
| 30 | + macro). This means that if std advertises these `raw` types as being "FFI
| 31 | + compatible with libc", it's not actually correct in all circumstances! |
| 32 | +* Intricately exporting raw underlying interfaces (such as [`&stat` from |
| 33 | + `&fs::Metadata`][std-as-stat]) makes it difficult to change the |
| 34 | + implementation over time. Today the 32-bit Linux standard library [doesn't |
| 35 | + use LFS functions][std-no-lfs], so files over 4GB cannot be opened. Changing |
| 36 | + this, however, would [involve changing the `stat` |
| 37 | + structure][libc-stat-change] and may be difficult to do. |
| 38 | +* Trait extensions in the `raw` module attempt to return the `libc` aliased type |
| 39 | + on all platforms, for example [`DirEntryExt::ino`][std-ino] returns a type of
| 40 | + `ino_t`. The `ino_t` type is billed as being FFI compatible with the libc |
| 41 | + `ino_t` type, but not all platforms store the `d_ino` field in `dirent` with |
| 42 | + the `ino_t` type. For example on Android the [definition of |
| 43 | + `ino_t`][android-ino_t] is `u32` but the [actual stored value is |
| 44 | + `u64`][android-d_ino]. This means that on Android we're actually silently |
| 45 | + truncating the return value! |
| 46 | + |
| 47 | +[lfs]: http://users.suse.com/~aj/linux_lfs.html |
| 48 | +[std-as-stat]: https://github.com/rust-lang/rust/blob/29ea4eef9fa6e36f40bc1f31eb1e56bf5941ee72/src/libstd/sys/unix/fs.rs#L81-L92 |
| 49 | +[std-no-lfs]: https://github.com/rust-lang/rust/issues/30050 |
| 50 | +[std-ino]: https://github.com/rust-lang/rust/blob/29ea4eef9fa6e36f40bc1f31eb1e56bf5941ee72/src/libstd/sys/unix/fs.rs#L192-L197 |
| 51 | +[libc-stat-change]: https://github.com/rust-lang-nursery/libc/blob/2c7e08c959e599ca221581b1670a9ecbbeac2dcb/src/unix/notbsd/linux/other/b32/mod.rs#L28-L71 |
| 52 | +[android-d_ino]: https://github.com/rust-lang-nursery/libc/blob/2c7e08c959e599ca221581b1670a9ecbbeac2dcb/src/unix/notbsd/android/mod.rs#L50 |
| 53 | +[android-ino_t]: https://github.com/rust-lang-nursery/libc/blob/2c7e08c959e599ca221581b1670a9ecbbeac2dcb/src/unix/notbsd/android/mod.rs#L11 |
| 54 | + |
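The Android truncation described above can be reproduced with plain integer casts. This is a minimal sketch: `narrow_to_ino_t` is a hypothetical helper, not a std or libc function, and the 32-bit width is taken from the Android `ino_t` definition cited in the list.

```rust
/// Hypothetical helper: narrows a 64-bit `d_ino` value to a 32-bit `ino_t`,
/// mirroring what an accessor typed as `ino_t` would do on such a platform.
pub fn narrow_to_ino_t(d_ino: u64) -> u32 {
    // An `as` cast to a narrower integer keeps only the low 32 bits.
    d_ino as u32
}

/// Returns true when the narrowing cast above would lose information.
pub fn is_truncated(d_ino: u64) -> bool {
    u64::from(narrow_to_ino_t(d_ino)) != d_ino
}
```

Any inode number above `u32::MAX` comes back changed from the narrowing accessor, and nothing signals the loss to the caller.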
| 55 | +Over time, exporting the somewhat-messy details of libc has made the
| 56 | +standard library a little messy as well. Exporting this functionality (e.g.
| 57 | +being able to access all of the fields) is quite useful, however! This RFC
| 58 | +proposes tweaking the design of the extensions in
| 59 | +`std::os::*::raw` to allow the same level of information exposure that happens
| 60 | +today but also cut some of the ties between libc and std to give us more freedom to
| 61 | +change these implementation details and work around weird platforms.
| 62 | + |
| 63 | +# Detailed design |
| 64 | +[design]: #detailed-design |
| 65 | + |
| 66 | +First, the types and type aliases in `std::os::*::raw` will all be |
| 67 | +deprecated. For example `stat`, `ino_t`, `dev_t`, `mode_t`, etc, will all be |
| 68 | +deprecated (in favor of their definitions in the `libc` crate). Note that the C |
| 69 | +integer types, `c_int` and friends, will not be deprecated. |
| 70 | + |
| 71 | +Next, all existing extension traits will cease to return platform specific type |
| 72 | +aliases (such as the `DirEntryExt::ino` function). Instead they will return |
| 73 | +`u64` across the board unless it is known for certain that fewer bits will
| 74 | +suffice. This will improve consistency across platforms as well as avoid |
| 75 | +truncation problems such as those Android is experiencing. Furthermore this |
| 76 | +frees std from dealing with any odd FFI compatibility issues, punting that to |
| 77 | +the libc crate itself if the values are handed back into C.
| 78 | + |
| 79 | +The `std::os::*::fs::MetadataExt` will have its `as_raw_stat` method deprecated, |
| 80 | +and it will instead grow functions to access all the associated fields of the |
| 81 | +underlying `stat` structure. This means that there will now be a |
| 82 | +trait-per-platform to expose all this information. Also note that all the |
| 83 | +methods will likely return `u64` in accordance with the above modification. |
| 84 | + |
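The shape of such a per-platform trait might look roughly like the sketch below. Everything here is illustrative: `PlatformStat`, its field widths, the toy `Metadata` type, and the accessor names `st_ino`, `st_mode`, and `st_size` are assumptions made for the example, not the definitions std or libc actually use.

```rust
/// Hypothetical stand-in for one platform's `stat` layout. The field widths
/// are illustrative and not copied from any real libc.
pub struct PlatformStat {
    pub st_ino: u32,
    pub st_mode: u16,
    pub st_size: i64,
}

/// Toy stand-in for `fs::Metadata`, which keeps the raw structure private.
pub struct Metadata {
    stat: PlatformStat,
}

impl Metadata {
    pub fn from_stat(stat: PlatformStat) -> Metadata {
        Metadata { stat }
    }
}

/// Sketch of a per-platform extension trait: one accessor per field,
/// each returning `u64` regardless of the underlying C type.
pub trait MetadataExt {
    fn st_ino(&self) -> u64;
    fn st_mode(&self) -> u64;
    fn st_size(&self) -> u64;
}

impl MetadataExt for Metadata {
    fn st_ino(&self) -> u64 {
        u64::from(self.stat.st_ino) // lossless widening
    }
    fn st_mode(&self) -> u64 {
        u64::from(self.stat.st_mode) // lossless widening
    }
    fn st_size(&self) -> u64 {
        // The signed C type is reinterpreted as unsigned; callers that need
        // the sign must cast back themselves.
        self.stat.st_size as u64
    }
}
```

Because callers only see `u64`, the private `PlatformStat` could later be swapped for a wider layout without changing the trait's signatures.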
| 85 | +With these modifications to what `std::os::*::raw` includes and how it's |
| 86 | +defined, it should be easy to tweak existing implementations and ensure values |
| 87 | +are transmitted in a lossless fashion. The changes, however, are both breaking |
| 88 | +changes and don't immediately enable fixing bugs like using LFS on Linux: |
| 89 | + |
| 90 | +* Code such as `let a: ino_t = entry.ino()` would break as the `ino()` function |
| 91 | + will return `u64`, but the definition of `ino_t` may not be `u64` for all |
| 92 | + platforms. |
| 93 | +* The `stat` structure itself on 32-bit Linux still uses 32-bit fields (e.g. it |
| 94 | + doesn't mirror `stat64` in libc). |
| 95 | + |
| 96 | +To help with these issues, more extensive modifications can be made to the |
| 97 | +platform specific modules. All type aliases can be switched over to `u64` and |
| 98 | +the `stat` structure could simply be redefined to `stat64` on Linux (while
| 99 | +keeping the same name). This would, however, explicitly mean that
| 100 | +**std::os::raw is no longer FFI compatible with C**. |
| 101 | + |
| 102 | +This breakage can be clearly indicated in the deprecation messages, however. |
| 103 | +Additionally, this fits within std's [breaking changes policy][api-evolution] as |
| 104 | +a local `as` cast should be all that's needed to patch code that breaks to |
| 105 | +straddle versions of Rust. |
| 106 | + |
| 107 | +[api-evolution]: https://github.com/rust-lang/rfcs/blob/master/text/1105-api-evolution.md |
| 108 | + |
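As a rough illustration of the `as` cast mentioned above, the sketch below models the `ino()` breakage with toy types. The `ino_t` alias (assumed to be `u32` here), the `DirEntry` struct, and `ino_for_ffi` are all hypothetical stand-ins rather than the real std or libc items.

```rust
/// Hypothetical platform alias; assumed to be 32 bits wide for this example.
#[allow(non_camel_case_types)]
pub type ino_t = u32;

/// Toy stand-in for a directory entry whose accessor now returns `u64`.
pub struct DirEntry {
    d_ino: u64,
}

impl DirEntry {
    pub fn new(d_ino: u64) -> DirEntry {
        DirEntry { d_ino }
    }

    /// After the proposed change the accessor returns `u64` everywhere.
    pub fn ino(&self) -> u64 {
        self.d_ino
    }
}

/// `let a: ino_t = entry.ino();` stops compiling once `ino()` returns `u64`
/// and `ino_t` is narrower. An explicit cast compiles whether `ino()`
/// returns `ino_t` or `u64`, so the same source works on both versions.
pub fn ino_for_ffi(entry: &DirEntry) -> ino_t {
    entry.ino() as ino_t
}
```

The cast makes any narrowing visible at the call site; code that needs the full value can keep the `u64` instead.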
| 109 | +# Drawbacks |
| 110 | +[drawbacks]: #drawbacks |
| 111 | + |
| 112 | +As mentioned above, this RFC is, strictly speaking, a breaking change. It is
| 113 | +expected that not much code will break, but currently there is no data |
| 114 | +supporting this. |
| 115 | + |
| 116 | +Returning `u64` across the board could be confusing in some circumstances as it |
| 117 | +may wildly differ both in terms of signedness as well as size from the |
| 118 | +underlying C type. Converting it back to the appropriate type runs the risk of |
| 119 | +being onerous, but accessing these raw fields in theory happens quite rarely as |
| 120 | +std should primarily be exporting cross-platform accessors for the various |
| 121 | +fields here and there. |
| 122 | + |
| 123 | +# Alternatives |
| 124 | +[alternatives]: #alternatives |
| 125 | + |
| 126 | +* The documentation of the raw modules in std could be modified to indicate that |
| 127 | + the types contained within are intentionally not FFI compatible, and the same |
| 128 | + structure could be preserved today with the types all being rewritten to what |
| 129 | + they would be anyway if this RFC were implemented. For example `ino_t` on |
| 130 | + Android would change to `u64` and `stat` on 32-bit Linux would change to |
| 131 | + `stat64`. In doing this, however, it's not clear why we'd keep around all the |
| 132 | + C namings and structure. |
| 133 | + |
| 134 | +* Instead of breaking existing functionality, new accessors and types could be |
| 135 | + added to acquire the "lossless" version of a type. For example we could add a |
| 136 | + `ino64` function on `DirEntryExt` which returns a `u64`, and for `stat` we |
| 137 | + could add `as_raw_stat64`. This would, however, force `Metadata` to store two
| 138 | + different `stat` structures, and the breakage this RFC causes in practice may
| 139 | + be small enough to not warrant going to these lengths.
| 140 | + |
| 141 | +# Unresolved questions |
| 142 | +[unresolved]: #unresolved-questions |
| 143 | + |
| 144 | +* Is the policy of almost always returning `u64` too strict? Should types like |
| 145 | + `mode_t` be allowed as `i32` explicitly? Should the sign at least attempt to |
| 146 | + always be preserved? |