diff --git a/README.md b/README.md index 1ce69af..fd0cbb8 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,11 @@ The Rust compiler runs the [MIR](https://rust-lang-nursery.github.io/rustc-guide in the [`MIR` interpreter (miri)](https://rust-lang-nursery.github.io/rustc-guide/const-eval.html), which sort of is a virtual machine using `MIR` as "bytecode". +## Table of Contents + +* [Const Safety](const_safety.md) +* [Promotion](promotion.md) + ## Related RFCs ### Const Promotion @@ -62,4 +67,4 @@ even if it does not break the compilation of the current crate's dependencies. Some of these features interact. E.g. * `match` + `loop` yields `while` -* `panic!` + `if` + `locals` yields `assert!` \ No newline at end of file +* `panic!` + `if` + `locals` yields `assert!` diff --git a/const_safety.md b/const_safety.md index 835b0b7..ac703c5 100644 --- a/const_safety.md +++ b/const_safety.md @@ -1 +1,135 @@ -# Const safety \ No newline at end of file +# Const safety + +The miri engine, which is used to execute code at compile time, can fail in +four possible ways: + +* The program performs an unsupported operation (e.g., calling an unimplemented + intrinsic, or doing an operation that would observe the integer address of a + pointer). +* The program causes undefined behavior (e.g., dereferencing an out-of-bounds + pointer). +* The program panics (e.g., a failed bounds check). +* The program loops forever, and this is detected by the loop detector. Note + that this detection happens on a best-effort basis only. + +Just like panics and non-termination are acceptable in safe run-time Rust code, +we also consider these acceptable in safe compile-time Rust code. However, we +would like to rule out the first two kinds of failures in safe code. Following +the terminology in [this blog post], we call a program that does not fail in the +first two ways *const safe*. 
+ +[this blog post]: https://www.ralfj.de/blog/2018/07/19/const.html + +The goal of the const safety check, then, is to ensure that a program is const +safe. What makes this tricky is that there are some operations that are safe as +far as run-time Rust is concerned, but unsupported in the miri engine and hence +not const safe (they fall in the first category of failures above). We call these operations *unconst*. The purpose +of the following section is to explain this in more detail, before proceeding +with the main definitions. + +## Miri background + +A very simple example of an unconst operation is +```rust +static S:i32 = 0; +const BAD:bool = (&S as *const i32 as usize) % 16 == 0; +``` +The modulo operation here is not supported by the miri engine because evaluating +it requires knowing the actual integer address of `S`. + +The way miri handles this is by treating pointer and integer values separately. +The most primitive kind of value in miri is a `Scalar`, and a scalar is *either* +a pointer (`Scalar::Ptr`) or a bunch of bits representing an integer +(`Scalar::Bits`). Every value of a variable of primitive type is stored as a +`Scalar`. In the code above, casting the pointer `&S` to `*const i32` and then +to `usize` does not actually change the value -- we end up with a local variable +of type `usize` whose value is a `Scalar::Ptr`. This is not a problem in +itself, but then executing `%` on this *pointer value* is unsupported. + +However, it does not seem appropriate to blame the `%` operation above for this +failure. `%` on "normal" `usize` values (`Scalar::Bits`) is perfectly fine, just using it on +values computed from pointers is an issue. 
Essentially, `&i32 as *const i32 as +usize` is a "safe" `usize` at run-time (meaning that applying safe operations to +this `usize` cannot lead to misbehavior, following terminology [suggested here]) +-- but the same value is *not* "safe" at compile-time, because we can cause a +const safety violation by applying a safe operation (namely, `%`). + +[suggested here]: https://www.ralfj.de/blog/2018/08/22/two-kinds-of-invariants.html + +## Const safety check on values + +The result of any const computation (`const`, `static`, promoteds) is subject to +a "sanity check" which enforces const safety. (A sanity check is already +happening, but it is not exactly checking const safety currently.) Const safety +is defined as follows: + +* Integer and floating point types are const-safe if they are a `Scalar::Bits`. + This makes sure that we can run `%` and other operations without violating + const safety. In particular, the value must *not* be uninitialized. +* References are const-safe if they are `Scalar::Ptr` into allocated memory, and + the data stored there is const-safe. (Technically, we would also like to + require `&mut` to be unique and `&` to not be mutable unless there is an + `UnsafeCell`, but it seems infeasible to check that.) For fat pointers, the + length of a slice must be a valid `usize` and the vtable of a `dyn Trait` must + be a valid vtable. +* `bool` is const-safe if it is `Scalar::Bits` with a value of `0` or `1`. +* `char` is const-safe if it is a valid unicode codepoint. +* `()` is always const-safe. +* `!` is never const-safe. +* Tuples, structs, arrays and slices are const-safe if all their fields are + const-safe. +* Enums are const-safe if they have a valid discriminant and the fields of the + active variant are const-safe. +* Unions are always const-safe; the data does not matter. +* `dyn Trait` is const-safe if the value is const-safe at the type indicated by + the vtable. +* Function pointers are const-safe if they point to an actual function. 
A + `const fn` pointer (when/if we have those) must point to a `const fn`. + +For example: +```rust +static S: i32 = 0; +const BAD: usize = &S as *const i32 as usize; +``` +Here, `S` is const-safe because `0` is a `Scalar::Bits`. However, `BAD` is *not* const-safe because it is a `Scalar::Ptr`. + +## Const safety check on code + +The purpose of the const safety check on code is to prohibit construction of +non-const-safe values in safe code. We can allow *almost* all safe operations, +except for unconst operations -- which are all related to raw pointers: +Comparing raw pointers for (in)equality, converting them to integers, hashing +them (including hashing references) and so on must be prohibited. Basically, we +should not permit any raw pointer operations to begin with, and carefully +evaluate any that we permit to make sure they are fully supported by miri and do +not permit constructing non-const-safe values. + +There should also be a mechanism akin to `unsafe` blocks to opt in to using +unconst operations. At this point, it becomes the responsibility of the +programmer to preserve const safety. In particular, a *safe* `const fn` must +always execute const-safely when called with const-safe arguments, and produce a +const-safe result. For example, the following function is const-safe (after +some extensions of the miri engine that are already implemented in miri) even +though it uses raw pointer operations: +```rust +const fn test_eq<T>(x: &T, y: &T) -> bool { + x as *const _ == y as *const _ +} +``` +On the other hand, the following function is *not* const-safe and hence it is considered a bug to mark it as such: +```rust +const fn convert<T>(x: &T) -> usize { + x as *const _ as usize +} +``` + +## Open questions + +* Do we allow unconst operations in `unsafe` blocks, or do we have some other + mechanism for opting in to them (like `unconst` blocks)? 
+ +* How do we communicate that the rules for safe `const fn` using unsafe code are + different than the ones for "runtime" functions? The good news here is that + violating the rules, at worst, leads to a compile-time error in a dependency. + No UB can arise. However, thanks to [promotion](promotion.md), compile-time + errors can arise even if no `const` or `static` is involved. diff --git a/promotion.md b/promotion.md index 7d0b312..069439d 100644 --- a/promotion.md +++ b/promotion.md @@ -1,20 +1,96 @@ # Const promotion +Note that promotion happens on the MIR, not on surface-level syntax. This is +relevant when discussing e.g. handling of panics caused by overflowing +arithmetic. + ## Rules -### 1. No side effects +### 1. Panics + +Promotion is not allowed to throw away side effects. This includes +panicking. Let us look at what happens when we promote `&(0_usize - 1)` in a +debug build: We have to avoid erroring at compile-time (because that would be +promotion breaking compilation), but we must be sure to error correctly at +run-time. In the MIR, this looks roughly like + +``` +_tmp1 = CheckedSub (const 0usize) (const 1usize) +assert(!_tmp1.1) -> [success: bb2; unwind: ..] + +bb2: +_tmp2 = _tmp1.0 +_res = &_tmp2 +``` -Promotion is not allowed to throw away side effects. -This includes panicking. So in order to promote `&(0_usize - 1)`, -the subtraction is thrown away and only the panic is kept. +Both `_tmp1` and `_tmp2` are promoted to statics. `_tmp1` evaluates to `(~0, +true)`, so the assertion will always fail at run-time. Computing `_tmp2` fails +with a panic, which is thrown away -- so we have no result. In principle, we +could generate any code for this because we know the code is unreachable (the +assertion is going to fail). Just to be safe, we generate a call to +`llvm.trap`. 
+ +As long as CTFE only panics when run-time code would also have panicked, this +works out correctly: The MIR already contains provisions for what to do on +panics (unwind edges etc.), so when CTFE panics we can generate code that +hard-codes a panic to happen at run-time. In other words, *promotion relies on +CTFE correctly implementing both normal program behavior and panics*. An +earlier version of miri used to panic on arithmetic overflow even in release +mode. This breaks promotion, because now promoting code that would work (and +could not panic!) at run-time leads to a compile-time CTFE error. ### 2. Const safety -Only const safe code gets promoted. This means that promotion doesn't happen -if the code does some action which, when run at compile time, either errors or -produces a value that differs from runtime. +We have explained what happens when evaluating a promoted panics, but what about +other kinds of failure -- what about hitting an unsupported operation or +undefined behavior? To make sure this does not happen, only const safe code +gets promoted. The exact details for `const safety` are discussed +[here](const_safety.md). + +An example of this would be `&(&1 as *const i32 as usize % 16 == 0)`. The actual +location is not known at compile-time, so we cannot promote this. Generally, we +can guarantee const-safety by not promoting when an unsafe or unconst operation +is performed -- if our const safety checker is correct, that has to cover +everything, so the only possible remaining failures are panics. + +However, things get more tricky when `const` and `const fn` are involved. + +For `const`, based on the const safety check described [here](const_safety.md), +we can rely on there not being const-unsafe values in the `const`, so we should +be able to promote freely. 
For example: + +```rust +union Foo { x: &'static i32, y: usize } +const A: usize = unsafe { Foo { x: &1 }.y }; +const B: usize = unsafe { Foo { x: &2 }.y }; +let x: &bool = &(A < B); +``` + +Promoting `x` would lead to a compile failure because we cannot compare pointer +addresses. However, we do not even get there -- computing `A` or `B` fails with +a const safety check error because these are values of type `usize` that contain +a `Scalar::Ptr`. + +For `const fn`, however, there is no way to check anything in advance. We can +either just not promote, or we can move responsibility to the `const fn` and +promote *if* all function arguments pass the const safety check. So, +`foo(42usize)` would get promoted, but `foo(&1 as *const i32 as usize)` would +not. When this call panics, compilation proceeds and we just hard-code a panic +to happen as well at run-time. However, when const evaluation fails with +another error (unsupported operation or undefined behavior), we have no choice +but to abort compilation of a program that would have compiled fine if we had +not decided to promote. It is the responsibility of `foo` to not fail this +way when working with const-safe arguments. + +### 3. Drop + +TODO: Fill this with information. + +### 4. Interior Mutability + +TODO: Fill this with information. -An example of this would be `&1 as *const i32 as usize == 42`. While it is highly -unlikely that the address of temporary value is `42`, at runtime this could be true. +## Open questions -The exact details for `const safety` are discussed in [here](const_safety.md). \ No newline at end of file +* There is a fourth kind of CTFE failure -- an endless loop being detected. + What do we do when that happens while evaluating a promoted?
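
Reviewer sketch (not part of the patch): the value-level rules in `const_safety.md` and the `foo(42usize)` versus `foo(&1 as *const i32 as usize)` distinction above can be modelled in a few lines of Rust. Everything here is a hypothetical simplification for illustration: the `Scalar` and `Ty` enums, `is_const_safe`, and `checked_rem` are invented names and do not reflect the real miri data structures or API. Only a handful of scalar types are covered, and the check on references ignores the pointee.

```rust
/// Hypothetical, simplified model of a miri scalar: either integer bits or an
/// abstract pointer that has no integer address at compile time.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Scalar {
    Bits(u128),
    Ptr { alloc_id: u64, offset: u64 },
}

/// Hypothetical subset of types, just enough to illustrate the rules.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Usize,
    Bool,
    Char,
    Ref,
}

/// Value-level const safety check for the modelled types.
fn is_const_safe(ty: Ty, value: Scalar) -> bool {
    match (ty, value) {
        // Integers must be plain bits, never a pointer in disguise.
        (Ty::Usize, Scalar::Bits(_)) => true,
        // `bool` must be bits with value 0 or 1.
        (Ty::Bool, Scalar::Bits(b)) => b <= 1,
        // `char` must be a valid Unicode scalar value.
        (Ty::Char, Scalar::Bits(b)) => {
            b <= u32::MAX as u128 && std::char::from_u32(b as u32).is_some()
        }
        // References must be pointers; a full check would also validate the
        // pointee, which this sketch omits.
        (Ty::Ref, Scalar::Ptr { .. }) => true,
        _ => false,
    }
}

/// `%` is supported on bits only; on a pointer value it is an unsupported
/// operation, which is the failure the const safety check is meant to rule out.
fn checked_rem(lhs: Scalar, rhs: Scalar) -> Result<Scalar, &'static str> {
    match (lhs, rhs) {
        (Scalar::Bits(_), Scalar::Bits(0)) => Err("panic: remainder by zero"),
        (Scalar::Bits(a), Scalar::Bits(b)) => Ok(Scalar::Bits(a % b)),
        _ => Err("unsupported: remainder on a pointer value"),
    }
}
```

Under this model, a `usize` argument holding `Scalar::Bits(42)` passes the check and a call using it could be promoted, while a `usize` holding a `Scalar::Ptr` fails the check, so the call would be left unpromoted rather than risking an unsupported `%` during CTFE.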