Commit 57f570b

clarify that the str invariant is a safety, not validity, invariant
1 parent 2429818 commit 57f570b

1 file changed: +13 -6 lines changed

library/core/src/primitive_docs.rs

@@ -291,7 +291,7 @@ mod prim_never {}
 /// Surrogate code points, used by UTF-16, are in the range 0xD800 to 0xDFFF.
 ///
 /// No `char` may be constructed, whether as a literal or at runtime, that is not a
-/// Unicode scalar value:
+/// Unicode scalar value. Violating this rule causes Undefined Behavior.
 ///
 /// ```compile_fail
 /// // Each of these is a compiler error
@@ -308,9 +308,10 @@ mod prim_never {}
 /// let _ = unsafe { char::from_u32_unchecked(0x110000) };
 /// ```
 ///
-/// USVs are also the exact set of values that may be encoded in UTF-8. Because
-/// `char` values are USVs and `str` values are valid UTF-8, it is safe to store
-/// any `char` in a `str` or read any character from a `str` as a `char`.
+/// USVs are also the exact set of values that may be encoded in UTF-8. Because `char` values are
+/// USVs and functions may assume [incoming `str` values are valid
+/// UTF-8](primitive.str.html#invariant), it is safe to store any `char` in a `str` or read any
+/// character from a `str` as a `char`.
 ///
 /// The gap in valid `char` values is understood by the compiler, so in the
 /// below example the two ranges are understood to cover the whole range of
@@ -887,8 +888,6 @@ mod prim_slice {}
 /// type. It is usually seen in its borrowed form, `&str`. It is also the type
 /// of string literals, `&'static str`.
 ///
-/// String slices are always valid UTF-8.
-///
 /// # Basic Usage
 ///
 /// String literals are string slices:
@@ -942,6 +941,14 @@ mod prim_slice {}
 /// Note: This example shows the internals of `&str`. `unsafe` should not be
 /// used to get a string slice under normal circumstances. Use `as_str`
 /// instead.
+///
+/// # Invariant
+///
+/// Rust libraries may assume that string slices are always valid UTF-8.
+///
+/// Constructing a non-UTF-8 string slice is not immediate Undefined Behavior, but any function
+/// called on a string slice may assume that it is valid UTF-8, which means that a non-UTF-8 string
+/// slice can lead to Undefined Behavior down the road.
 #[stable(feature = "rust1", since = "1.0.0")]
 mod prim_str {}
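
For readers of this change, here is a minimal sketch of how safe code normally stays inside both invariants, using only the checked constructors from the standard library. The helper names `checked_char`, `checked_str`, and `roundtrip` are hypothetical and exist only for illustration; they are not part of this commit or of `core`.

```rust
/// Checked construction of a `char`: yields `None` for surrogates
/// (0xD800..=0xDFFF) and for values above 0x10FFFF, so no non-USV
/// `char` is ever created. (Hypothetical helper, for illustration.)
fn checked_char(value: u32) -> Option<char> {
    char::from_u32(value)
}

/// Checked construction of a `&str`: yields `None` when the bytes are
/// not valid UTF-8, so callers never hand a non-UTF-8 string slice to
/// functions that assume the invariant. (Hypothetical helper.)
fn checked_str(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Stores a `char` in a `String` and reads it back. Because every
/// `char` is a USV, the resulting bytes are valid UTF-8 and the first
/// character read back equals the one stored. (Hypothetical helper.)
fn roundtrip(c: char) -> bool {
    let mut s = String::new();
    s.push(c);
    std::str::from_utf8(s.as_bytes()).is_ok() && s.chars().next() == Some(c)
}
```

The `unsafe` counterparts (`char::from_u32_unchecked`, `std::str::from_utf8_unchecked`) skip these checks. Per the wording in this diff, a non-USV `char` causes Undefined Behavior, while a non-UTF-8 `str` is not immediately Undefined Behavior but can lead to it once passed to code that assumes valid UTF-8.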
