Add section on uninitialized memory

Manishearth · Manishearth · commit 0f9bfe738625 · 2023-03-04T11:55:25.000-08:00
diff --git a/src/SUMMARY.md b/src/SUMMARY.md
@@ -4,17 +4,17 @@
 - [Undefined behavior](./undefined_behavior.md)
 - [Core unsafety](./core_unsafety.md)
     - [Dangling and unaligned pointers](./core_unsafety/dangling_and_unaligned_pointers.md)
+    - [Invalid values](./core_unsafety/invalid_values.md)
     - [Data races](./core_unsafety/data_races.md)
     - [Intrinsics](./core_unsafety/intrinsics.md)
     - [ABI and FFI](./core_unsafety/abi_and_ffi.md)
     - [Platform features](./core_unsafety/platform_features.md)
     - [Inline assembly](./core_unsafety/inline_assembly.md)
 - [Advanced unsafety](./advanced_unsafety.md)
-    - [Invalid values](./core_unsafety/invalid_values.md)
+    - [Uninitialized memory](./advanced_unsafety/uninitialized.md)
     - [Pointer aliasing](./advanced_unsafety/pointer_aliasing.md)
     - [Immutable data](./advanced_unsafety/immutable_data.md)
     - [Atomic ordering](./advanced_unsafety/atomic_ordering.md)
-    - [Undef memory](./advanced_unsafety/undef_memory.md)
     - [Pinning](./advanced_unsafety/pinning.md)
     - [Variance](./advanced_unsafety/variance.md)
 - [Expert unsafety](./expert_unsafety.md)
diff --git a/src/advanced_unsafety/invalid_values.md b/src/advanced_unsafety/invalid_values.md
@@ -129,7 +129,7 @@ This is not an exhaustive list: ultimately, having an invalid value is UB and it
 
 
  [unaligned]: ../core_unsafety/dangling_and_unaligned_pointers.md
- [uninit-chapter]: ../undef_memory.md
+ [uninit-chapter]: ../advanced_unsafety/uninitialized.md
  [`mem::transmute()`]: https://doc.rust-lang.org/stable/std/mem/fn.transmute.html
  [`mem::transmute_copy()`]: https://doc.rust-lang.org/stable/std/mem/fn.transmute_copy.html
  [`mem::zeroed()`]: https://doc.rust-lang.org/stable/std/mem/fn.zeroed.html
diff --git a/src/advanced_unsafety/undef_memory.md b/src/advanced_unsafety/undef_memory.md
diff --git a/src/advanced_unsafety/uninitialized.md b/src/advanced_unsafety/uninitialized.md
@@ -0,0 +1,99 @@
+# Uninitialized memory
+
+> _"I'm Nobody! Who are you? Are you — Nobody — too?"_
+>
+> — _Emily Dickinson_
+
+While we have covered [invalid values], there's another thing that behaves a lot like invalid values, but has nothing to do with actual bit patterns: Uninitialized memory.
+
+An easy way to think about uninitialized memory is that there's an additional value (often called `undef`, LLVM's term for it) that does not map to any concrete bit pattern, but can be introduced in the abstract in various ways, and makes _most_ values invalid.
+
+If you explicitly wish to work with uninitialized and partially-initialized types, [`MaybeUninit<T>`] is a useful abstraction since it can be "initialized" with no overhead and then written to in parts.
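+
+As a minimal sketch of writing to a `MaybeUninit<T>` in parts, consider the following. The `Pair` type and the `make_pair()` function are hypothetical names used only for illustration:
+
+```rust
+use std::mem::MaybeUninit;
+use std::ptr::addr_of_mut;
+
+// `Pair` is a hypothetical type used only for this illustration.
+#[derive(Debug, PartialEq)]
+struct Pair {
+    a: u32,
+    b: u64,
+}
+
+fn make_pair() -> Pair {
+    // Creating the `MaybeUninit` is safe and performs no initialization.
+    let mut slot = MaybeUninit::<Pair>::uninit();
+    let p = slot.as_mut_ptr();
+    unsafe {
+        // Write each field through a raw pointer, so that no reference to
+        // the not-yet-initialized `Pair` is ever created.
+        addr_of_mut!((*p).a).write(1);
+        addr_of_mut!((*p).b).write(2);
+        // Every field has been written, so `assume_init()` is sound here.
+        slot.assume_init()
+    }
+}
+```
+
+Skipping either of the two writes would make the final `assume_init()` call UB.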
+
+## Sources of uninitialized memory
+
+### `mem::uninitialized()` and `MaybeUninit::assume_init`
+
+[`mem::uninitialized()`] is a deprecated API with a very tempting shape: it lets you write things like `let x = mem::uninitialized()` for cases where you want to construct the value piece by piece. It's basically _always_ UB to use, since it immediately sets `x` to uninitialized memory, which is UB.
+
+Use [`MaybeUninit<T>`] instead.
+
+It is still possible to create uninitialized memory using [`MaybeUninit::assume_init()`] if you have not, in fact, ensured that the value is initialized.
+
+### Padding
+
+Padding bytes in structs and enums are often but not always uninitialized. This means that treating a struct as a bag of bytes (by, say, treating `&Struct` as `&[u8; size_of::<Struct>()]` and reading from there) is UB even if you don't write invalid values to those bytes, since you are ginning up uninitialized `u8`s.
+
+Reading from padding [always produces uninitialized values][pad-glossary].
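+
+One way to avoid reading padding is to build the bytes from the fields instead of reinterpreting the struct. The sketch below assumes a hypothetical `Header` type and a little-endian encoding chosen purely for illustration:
+
+```rust
+// `Header` is a hypothetical type. With `repr(C)` on typical targets it has
+// three padding bytes between `tag` and `len`.
+#[repr(C)]
+struct Header {
+    tag: u8,
+    len: u32,
+}
+
+// Serialize field by field: only initialized bytes are ever read, and the
+// padding is never observed.
+fn header_bytes(h: &Header) -> [u8; 5] {
+    let mut out = [0u8; 5];
+    out[0] = h.tag;
+    out[1..].copy_from_slice(&h.len.to_le_bytes());
+    out
+}
+```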
+
+
+
+### Moved-from values
+
+The following code is UB:
+
+```rust
+use std::ptr;
+
+let x = Foo::new(); // Foo is not Copy
+let mut v = vec![];
+let ptr = &x as *const Foo;
+
+v.push(x); // move x into the vector
+
+unsafe {
+    // reads from moved-from memory through the stale pointer
+    let ghost = ptr::read(ptr);
+}
+```
+
+Any type of move will do this, even when you "move" the value into a different variable with stuff like `let y = x;`.
+
+Note that Rust does let you "partially move" out of fields of a struct. In such a case the whole struct is no longer a valid value for its type, but you are still allowed to "use" the struct to look at other fields. When doing such things, make sure there are no pointers that still think the struct is whole and valid.
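+
+A small safe-code sketch of a partial move, using a hypothetical `Person` type:
+
+```rust
+// `Person` is a hypothetical type used only for this illustration.
+struct Person {
+    name: String,
+    age: u32,
+}
+
+fn split(p: Person) -> (String, u32) {
+    // Partial move: `p.name` is moved out, so `p` as a whole is no longer usable.
+    let name = p.name;
+    // The remaining field can still be read.
+    let age = p.age;
+    (name, age)
+}
+```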
+
+#### APIs that are not moves: `ptr::drop_in_place()`, `ManuallyDrop::drop()`, and `ptr::read()`
+
+[`ptr::drop_in_place()`] and [`ManuallyDrop::drop()`] are interesting: they call the destructor[^1] on a value (or a pointed-to value in the case of the former). From a safety point of view they are identical; they are just different APIs for manually calling drop glue.
+
+[`ManuallyDrop::drop()`] makes the following claim:
+
+> Other than changes made by the destructor itself, the memory is left unchanged, and so as far as the compiler is concerned still holds a bit-pattern which is valid for the type T.
+
+In other words, Rust does _not_ consider these operations to do the same invalidation as a regular "move from" operation, even though they have a similar feel.
+
+There is an [open issue][ugc-394] about whether `Drop::drop()` is itself allowed to produce uninitialized or invalid memory, so it may not be possible to rely on this in a generic context.
+
+[`ptr::read()`] similarly claims that it leaves the source memory untouched, which means that it is still a valid value.
+
+
+For all of these APIs, actually _using_ the dropped or read-from memory may still be fraught depending on the invariants of the value; it's quite easy to cause a double-free by materializing an owned value from the original data after it has already been read-from or dropped.
+
+However, they do not produce uninitialized memory.
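+
+As a rough sketch of how to avoid the double-free hazard, the source can be wrapped in `ManuallyDrop` so that only the value produced by `ptr::read()` is ever dropped. The function name `read_out()` is hypothetical:
+
+```rust
+use std::mem::ManuallyDrop;
+use std::ptr;
+
+fn read_out() -> String {
+    // `ManuallyDrop` suppresses the automatic destructor of `original`.
+    let original = ManuallyDrop::new(String::from("hello"));
+    // `ptr::read()` copies the bits out; the source memory is left untouched.
+    let copy: String = unsafe { ptr::read(&*original) };
+    // Only `copy` is dropped (by the caller), so the heap buffer is freed once.
+    copy
+}
+```
+
+Without the `ManuallyDrop` wrapper, both `original` and `copy` would run the `String` destructor on the same buffer.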
+
+
+### Freshly allocated memory
+
+Freshly allocated memory (e.g. the yet-unused bytes in [`Vec::with_capacity()`] or just the result of [`Allocator::allocate()`]) is usually uninitialized. You can use APIs like [`Allocator::allocate_zeroed()`][`Allocator::zeroed()`] if you wish to avoid this, though you can still end up making [invalid values] the same way you can with [`mem::zeroed()`].
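+
+For the `Vec` case, one possible approach is to write through `Vec::spare_capacity_mut()` before extending the length. This is a sketch only, and `filled()` is a hypothetical helper:
+
+```rust
+fn filled(n: usize) -> Vec<u8> {
+    let mut v: Vec<u8> = Vec::with_capacity(n);
+    // `spare_capacity_mut()` exposes the unused capacity as `MaybeUninit<u8>`,
+    // so nothing uninitialized is ever read as a `u8`.
+    for (i, slot) in v.spare_capacity_mut().iter_mut().take(n).enumerate() {
+        slot.write(i as u8);
+    }
+    // SAFETY: the first `n` elements were just written, and `n <= capacity`.
+    unsafe { v.set_len(n) };
+    v
+}
+```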
+
+Generally after allocating memory one should make sure that the only part of that memory being read from is known to have been written to. This can be tricky in situations around @@@
+
+## When you might end up making an uninitialized value
+
+## Things you might see if you made an uninitialized value
+
+
+ [invalid values]: ../core_unsafety/invalid_values.md
+ [`mem::uninitialized()`]: https://doc.rust-lang.org/stable/std/mem/fn.uninitialized.html
+ [`mem::zeroed()`]: https://doc.rust-lang.org/stable/std/mem/fn.zeroed.html
+ [`MaybeUninit<T>`]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html
+ [`MaybeUninit::assume_init()`]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init
+ [pad-glossary]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#padding
+ [`ptr::drop_in_place()`]: https://doc.rust-lang.org/stable/std/ptr/fn.drop_in_place.html
+ [`ManuallyDrop::drop()`]: https://doc.rust-lang.org/stable/std/mem/struct.ManuallyDrop.html#method.drop
+ [`ptr::read()`]: https://doc.rust-lang.org/stable/std/ptr/fn.read.html
+ [ugc-394]: https://github.com/rust-lang/unsafe-code-guidelines/issues/394
+ [`Vec::with_capacity()`]: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.with_capacity
+ [`Allocator::allocate()`]: https://doc.rust-lang.org/stable/std/alloc/trait.Allocator.html#tymethod.allocate
+ [`Allocator::zeroed()`]: https://doc.rust-lang.org/stable/std/alloc/trait.Allocator.html#method.allocate_zeroed
+
+
+ [^1]: The "destructor" is different from the `Drop` trait. Calling the destructor is the process of calling a type's `Drop::drop` impl if it exists, and then calling the destructor for all of its fields (also known as "drop glue"). I.e. it's not _just_ `Drop`, but rather the entire _destruction_, of which the `Drop` impl is one part. Types that do not implement `Drop` may still have contentful destructors if their transitive fields do.
+