Skip to content

Commit cf8749e

Browse files
authored
Fix provenance UB and alignment UB (#27)
Fix provenance UB and alignment UB These issues were unlikely to cause any miscompilations today, but it's best to fix them right away. Thanks to miri for finding these issues! Alignment: In the default configuration (**not** gecko-ffi), types with alignment equal to the header size (16 on x64; 8 on x86) were incorrectly handled by an optimization. This could result in misaligned empty slices being produced when accessing a 0-capacity ThinVec (which is produced by ThinVec::new()). Such a slice of course can't be used to load or store any memory, but Rust requires references (including slices) to always be aligned even if they aren't ever used to load/store memory. The destructor for ThinVec creates a slice to its elements, so this dropping any ThinVec::new() with such a highly aligned type was technically UB. That said, it's quite unlikely it could have lead to miscompilations in practice, since the compiler would be unable to "see" the misalignment and Rust doesn't currently do any runtime optimizations based on alignment (such as Option-style enum layout optimizations). Types with higher and lower alignments were handled fine, it was just this *specific* alignment that was taking the wrong path. The specific issue is that 0-cap ThinVecs all point to statically allocated empty singleton. This singleton is designed to Do The Right Thing without additional branches for many operations, meaning we get non-allocating ThinVec::new() *without* needing to compromise the performance of most other operations. One such "just work" situation is the data pointer, which will generally just be one-past-the-end of the empty singleton, which is in-bounds for the singleton. But for highly-aligned elements, the data pointer needs to be "padded" and that would go beyond the static singleton's "allocation". So in general we actually check if cap==0 and return NonNull::dangling for the data pointer. The *optimization* was to identify the cases where no such padding was required and statically eliminate the cap == 0 path. Unfortunately "has no padding" isn't sufficient: in the specific case of align==header_size, there is no padding but the singleton is underaligned! In this case the NonNull::dangling path must be taken. The code has been fixed and the optimization now works correctly for all alignments. Provenance: Many places ran afoul of Stacked Borrows. The issues found were: A &Header cannot be used to get a useful pointer to data beyond it, because the pointer from the as-cast of the &Header only has provenance over the Header. After a set_len call that decreases the length, it is invalid to create a slice then try to get_unchecked into the region between the old and new length, because the reference in the slice that the ThinVec now Derefs to does not have provenance over that region. Alternatively, this is UB because the docs stipulate that you're not allowed to use `get_unchecked` to index out of bounds. I think the use of align_offset in tests is subtly wrong, align_offset seems to be for optimizations only. The docs say that a valid implementation may always return usize::MAX.
1 parent 9705b3c commit cf8749e

File tree

1 file changed

+104
-36
lines changed

1 file changed

+104
-36
lines changed

src/lib.rs

Lines changed: 104 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -247,9 +247,22 @@ mod impl_details {
247247
}
248248
}
249249

250-
/// The header of a ThinVec.
251-
///
252-
/// The _cap can be a bitfield, so use accessors to avoid trouble.
250+
// The header of a ThinVec.
251+
//
252+
// The _cap can be a bitfield, so use accessors to avoid trouble.
253+
//
254+
// In "real" gecko-ffi mode, the empty singleton will be aligned
255+
// to 8 by gecko. But in tests we have to provide the singleton
256+
// ourselves, and Rust makes it hard to "just" align a static.
257+
// To avoid messing around with a wrapper type around the
258+
// singleton *just* for tests, we just force all headers to be
259+
// aligned to 8 in this weird "zombie" gecko mode.
260+
//
261+
// This shouldn't affect runtime layout (padding), but it will
262+
// result in us asking the allocator to needlessly overalign
263+
// non-empty ThinVecs containing align < 8 types in
264+
// zombie-mode, but not in "real" geck-ffi mode. Minor.
265+
#[cfg_attr(all(feature = "gecko-ffi", any(test, miri)), repr(align(8)))]
253266
#[repr(C)]
254267
struct Header {
255268
_len: SizeType,
@@ -264,23 +277,6 @@ impl Header {
264277
fn set_len(&mut self, len: usize) {
265278
self._len = assert_size(len);
266279
}
267-
268-
fn data<T>(&self) -> *mut T {
269-
let header_size = mem::size_of::<Header>();
270-
let padding = padding::<T>();
271-
272-
let ptr = self as *const Header as *mut Header as *mut u8;
273-
274-
unsafe {
275-
if padding > 0 && self.cap() == 0 {
276-
// The empty header isn't well-aligned, just make an aligned one up
277-
NonNull::dangling().as_ptr()
278-
} else {
279-
// This could technically result in overflow, but padding would have to be absurdly large for this to occur.
280-
ptr.add(header_size + padding) as *mut T
281-
}
282-
}
283-
}
284280
}
285281

286282
#[cfg(feature = "gecko-ffi")]
@@ -321,10 +317,10 @@ impl Header {
321317
/// optimize everything to not do that (basically, make ptr == len and branch
322318
/// on size == 0 in every method), but it's a bunch of work for something that
323319
/// doesn't matter much.
324-
#[cfg(any(not(feature = "gecko-ffi"), test))]
320+
#[cfg(any(not(feature = "gecko-ffi"), test, miri))]
325321
static EMPTY_HEADER: Header = Header { _len: 0, _cap: 0 };
326322

327-
#[cfg(all(feature = "gecko-ffi", not(test)))]
323+
#[cfg(all(feature = "gecko-ffi", not(test), not(miri)))]
328324
extern "C" {
329325
#[link_name = "sEmptyTArrayHeader"]
330326
static EMPTY_HEADER: Header;
@@ -442,17 +438,26 @@ macro_rules! thin_vec {
442438

443439
impl<T> ThinVec<T> {
444440
pub fn new() -> ThinVec<T> {
445-
unsafe {
446-
ThinVec {
447-
ptr: NonNull::new_unchecked(&EMPTY_HEADER as *const Header as *mut Header),
448-
boo: PhantomData,
449-
}
450-
}
441+
ThinVec::with_capacity(0)
451442
}
452443

453444
pub fn with_capacity(cap: usize) -> ThinVec<T> {
445+
// `padding` contains ~static assertions against types that are
446+
// incompatible with the current feature flags. We also call it to
447+
// invoke these assertions when getting a pointer to the `ThinVec`
448+
// contents, but since we also get a pointer to the contents in the
449+
// `Drop` impl, trippng an assertion along that code path causes a
450+
// double panic. We duplicate the assertion here so that it is
451+
// testable,
452+
let _ = padding::<T>();
453+
454454
if cap == 0 {
455-
ThinVec::new()
455+
unsafe {
456+
ThinVec {
457+
ptr: NonNull::new_unchecked(&EMPTY_HEADER as *const Header as *mut Header),
458+
boo: PhantomData,
459+
}
460+
}
456461
} else {
457462
ThinVec {
458463
ptr: header_with_capacity::<T>(cap),
@@ -470,7 +475,47 @@ impl<T> ThinVec<T> {
470475
unsafe { self.ptr.as_ref() }
471476
}
472477
fn data_raw(&self) -> *mut T {
473-
self.header().data()
478+
// `padding` contains ~static assertions against types that are
479+
// incompatible with the current feature flags. Even if we don't
480+
// care about its result, we should always call it before getting
481+
// a data pointer to guard against invalid types!
482+
let padding = padding::<T>();
483+
484+
// Although we ensure the data array is aligned when we allocate,
485+
// we can't do that with the empty singleton. So when it might not
486+
// be properly aligned, we substitute in the NonNull::dangling
487+
// which *is* aligned.
488+
//
489+
// To minimize dynamic branches on `cap` for all accesses
490+
// to the data, we include this guard which should only involve
491+
// compile-time constants. Ideally this should result in the branch
492+
// only be included for types with excessive alignment.
493+
let empty_header_is_aligned = if cfg!(feature = "gecko-ffi") {
494+
// in gecko-ffi mode `padding` will ensure this under
495+
// the assumption that the header has size 8 and the
496+
// static empty singleton is aligned to 8.
497+
true
498+
} else {
499+
// In non-gecko-ffi mode, the empty singleton is just
500+
// naturally aligned to the Header. If the Header is at
501+
// least as aligned as T *and* the padding would have
502+
// been 0, then one-past-the-end of the empty singleton
503+
// *is* a valid data pointer and we can remove the
504+
// `dangling` special case.
505+
mem::align_of::<Header>() >= mem::align_of::<T>() && padding == 0
506+
};
507+
508+
unsafe {
509+
if !empty_header_is_aligned && self.header().cap() == 0 {
510+
NonNull::dangling().as_ptr()
511+
} else {
512+
// This could technically result in overflow, but padding
513+
// would have to be absurdly large for this to occur.
514+
let header_size = mem::size_of::<Header>();
515+
let ptr = self.ptr.as_ptr() as *mut u8;
516+
ptr.add(header_size + padding) as *mut T
517+
}
518+
}
474519
}
475520

476521
// This is unsafe when the header is EMPTY_HEADER.
@@ -565,7 +610,7 @@ impl<T> ThinVec<T> {
565610
// doesn't re-drop the just-failed value.
566611
let new_len = self.len() - 1;
567612
self.set_len(new_len);
568-
ptr::drop_in_place(self.get_unchecked_mut(new_len));
613+
ptr::drop_in_place(self.data_raw().add(new_len));
569614
}
570615
}
571616
}
@@ -915,7 +960,7 @@ impl<T> ThinVec<T> {
915960
(*ptr).set_cap(new_cap);
916961
self.ptr = NonNull::new_unchecked(ptr);
917962
} else {
918-
let mut new_header = header_with_capacity::<T>(new_cap);
963+
let new_header = header_with_capacity::<T>(new_cap);
919964

920965
// If we get here and have a non-zero len, then we must be handling
921966
// a gecko auto array, and we have items in a stack buffer. We shouldn't
@@ -931,8 +976,9 @@ impl<T> ThinVec<T> {
931976
let len = self.len();
932977
if cfg!(feature = "gecko-ffi") && len > 0 {
933978
new_header
934-
.as_mut()
935-
.data::<T>()
979+
.as_ptr()
980+
.add(1)
981+
.cast::<T>()
936982
.copy_from_nonoverlapping(self.data_raw(), len);
937983
self.set_len(0);
938984
}
@@ -1373,6 +1419,28 @@ mod tests {
13731419
ThinVec::<u8>::new();
13741420
}
13751421

1422+
#[test]
1423+
fn test_data_ptr_alignment() {
1424+
let v = ThinVec::<u16>::new();
1425+
assert!(v.data_raw() as usize % 2 == 0);
1426+
1427+
let v = ThinVec::<u32>::new();
1428+
assert!(v.data_raw() as usize % 4 == 0);
1429+
1430+
let v = ThinVec::<u64>::new();
1431+
assert!(v.data_raw() as usize % 8 == 0);
1432+
}
1433+
1434+
#[test]
1435+
#[cfg_attr(feature = "gecko-ffi", should_panic)]
1436+
fn test_overaligned_type_is_rejected_for_gecko_ffi_mode() {
1437+
#[repr(align(16))]
1438+
struct Align16(u8);
1439+
1440+
let v = ThinVec::<Align16>::new();
1441+
assert!(v.data_raw() as usize % 16 == 0);
1442+
}
1443+
13761444
#[test]
13771445
fn test_partial_eq() {
13781446
assert_eq!(thin_vec![0], thin_vec![0]);
@@ -2654,9 +2722,9 @@ mod std_tests {
26542722
macro_rules! assert_aligned_head_ptr {
26552723
($typename:ty) => {{
26562724
let v: ThinVec<$typename> = ThinVec::with_capacity(1 /* ensure allocation */);
2657-
let head_ptr: *mut $typename = v.header().data::<$typename>();
2725+
let head_ptr: *mut $typename = v.data_raw();
26582726
assert_eq!(
2659-
head_ptr.align_offset(std::mem::align_of::<$typename>()),
2727+
head_ptr as usize % std::mem::align_of::<$typename>(),
26602728
0,
26612729
"expected Header::data<{}> to be aligned",
26622730
stringify!($typename)

0 commit comments

Comments
 (0)