Introduce unsafe offset_from on pointers #49297

Merged
merged 5 commits into from
Mar 26, 2018
10 changes: 10 additions & 0 deletions src/libcore/intrinsics.rs
@@ -1314,6 +1314,11 @@ extern "rust-intrinsic" {
/// [`std::u32::overflowing_mul`](../../std/primitive.u32.html#method.overflowing_mul)
pub fn mul_with_overflow<T>(x: T, y: T) -> (T, bool);

/// Performs an exact division, resulting in undefined behavior where
/// `x % y != 0` or `y == 0` or `x == T::min_value() && y == -1`
#[cfg(not(stage0))]
pub fn exact_div<T>(x: T, y: T) -> T;

/// Performs an unchecked division, resulting in undefined behavior
/// where y = 0 or x = `T::min_value()` and y = -1
pub fn unchecked_div<T>(x: T, y: T) -> T;
@@ -1396,3 +1401,8 @@ extern "rust-intrinsic" {
/// Probably will never become stable.
pub fn nontemporal_store<T>(ptr: *mut T, val: T);
}

#[cfg(stage0)]
pub unsafe fn exact_div<T>(a: T, b: T) -> T {
unchecked_div(a, b)
}
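
The `exact_div` doc comment above lists three cases that are undefined behavior. As a rough illustration only, the same preconditions can be written as a checked function on stable Rust. The name `checked_exact_div` is hypothetical and is not part of this PR; the sketch assumes `isize` operands, as used by `offset_from`.

```rust
/// Hypothetical helper (not part of the PR): a checked stand-in for the
/// `exact_div` intrinsic on `isize`. It returns `None` in each case that the
/// intrinsic's documentation declares undefined behavior:
/// `x % y != 0`, `y == 0`, or `x == isize::MIN && y == -1`.
pub fn checked_exact_div(x: isize, y: isize) -> Option<isize> {
    // `checked_rem` is `None` for `y == 0` and for `isize::MIN % -1`,
    // so only an exact, non-overflowing division reaches `Some(0)`.
    match x.checked_rem(y) {
        Some(0) => x.checked_div(y),
        _ => None,
    }
}
```

The intrinsic itself performs none of these checks; that is what lets the backend emit an `exact` division.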
207 changes: 207 additions & 0 deletions src/libcore/ptr.rs
@@ -700,6 +700,114 @@ impl<T: ?Sized> *const T {
}
}

/// Calculates the distance between two pointers. The returned value is in
/// units of T: the distance in bytes is divided by `mem::size_of::<T>()`.
///
/// This function is the inverse of [`offset`].
///
/// [`offset`]: #method.offset
/// [`wrapping_offset_from`]: #method.wrapping_offset_from
///
/// # Safety
///
/// If any of the following conditions are violated, the result is Undefined
/// Behavior:
///
/// * Both the starting and other pointer must be either in bounds or one
/// byte past the end of the same allocated object.
///
/// * The distance between the pointers, **in bytes**, cannot overflow an `isize`.
///
/// * The distance between the pointers, in bytes, must be an exact multiple
/// of the size of `T` and `T` must not be a Zero-Sized Type ("ZST").
///
/// * The distance being in bounds cannot rely on "wrapping around" the address space.
///
/// The compiler and standard library generally try to ensure allocations
/// never reach a size where an offset is a concern. For instance, `Vec`
/// and `Box` ensure they never allocate more than `isize::MAX` bytes, so
/// `ptr_into_vec.offset_from(vec.as_ptr())` is always safe.
///
/// Most platforms fundamentally can't even construct such an allocation.
/// For instance, no known 64-bit platform can ever serve a request
/// for 2<sup>63</sup> bytes due to page-table limitations or splitting the address space.
/// However, some 32-bit and 16-bit platforms may successfully serve a request for
/// more than `isize::MAX` bytes with things like Physical Address
/// Extension. As such, memory acquired directly from allocators or memory
/// mapped files *may* be too large to handle with this function.
///
/// Consider using [`wrapping_offset_from`] instead if these constraints are
/// difficult to satisfy. The only advantage of this method is that it
/// enables more aggressive compiler optimizations.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(ptr_offset_from)]
///
/// let a = [0; 5];
/// let ptr1: *const i32 = &a[1];
/// let ptr2: *const i32 = &a[3];
/// unsafe {
/// assert_eq!(ptr2.offset_from(ptr1), 2);
/// assert_eq!(ptr1.offset_from(ptr2), -2);
/// assert_eq!(ptr1.offset(2), ptr2);
/// assert_eq!(ptr2.offset(-2), ptr1);
/// }
/// ```
#[unstable(feature = "ptr_offset_from", issue = "41079")]
#[inline]
pub unsafe fn offset_from(self, other: *const T) -> isize where T: Sized {
let pointee_size = mem::size_of::<T>();
assert!(0 < pointee_size && pointee_size <= isize::max_value() as usize);

// FIXME: can this be nuw/nsw?
let d = isize::wrapping_sub(self as _, other as _);
Member

It can't be nuw because the result might be a negative value.

It can't be nsw either because an allocation may straddle the boundary between ISIZE_MAX (0x7fffffff) and ISIZE_MIN (0x80000000).

Member

I agree. nuw and nsw force both the result and the operands to have the same interpretation as each other. What you'd really want here is a way to say that the operands have an unsigned interpretation, while the result has a signed interpretation, however there's currently no way to express that in LLVM.
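
A small sketch on plain integers may make the two comments above concrete. The helper names `byte_distance` and `signed_sub_overflows` are hypothetical, and the addresses in the usage note are invented so that they straddle the `isize::MAX` / `isize::MIN` boundary; nothing here is taken from the PR beyond the `wrapping_sub` call.

```rust
/// Hypothetical helper: byte distance between two addresses, computed the
/// same way as in the diff (wrapping subtraction on `isize`).
pub fn byte_distance(this: usize, other: usize) -> isize {
    isize::wrapping_sub(this as isize, other as isize)
}

/// Returns `true` if the same subtraction overflows when treated as a signed
/// operation, which is exactly what an `nsw` flag would promise never happens.
pub fn signed_sub_overflows(this: usize, other: usize) -> bool {
    (this as isize).checked_sub(other as isize).is_none()
}

/// Returns `true` if the subtraction underflows when treated as an unsigned
/// operation, which is what a `nuw` flag would promise never happens.
pub fn unsigned_sub_overflows(this: usize, other: usize) -> bool {
    this.checked_sub(other).is_none()
}
```

With `lo = isize::MAX as usize - 3` and `hi = isize::MAX as usize + 5`, `byte_distance(hi, lo)` is `8` even though the signed subtraction overflows, and `byte_distance(lo, hi)` is `-8` even though the unsigned subtraction underflows. Neither flag fits both directions.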

intrinsics::exact_div(d, pointee_size as _)
}
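
For readers on current toolchains: `offset_from` on raw pointers has since been stabilized (Rust 1.47), so the doc example above can be tried without the feature gate. The wrapper below is a sketch only; `distance_in_elements` is a hypothetical name, and it assumes `i32` elements so the zero-sized-type restriction does not apply.

```rust
/// Hypothetical helper: distance from `a[i]` to `a[j]`, in elements.
/// Panics (via ordinary bounds checking) if `i` or `j` is out of bounds.
pub fn distance_in_elements(a: &[i32], i: usize, j: usize) -> isize {
    let pi: *const i32 = &a[i];
    let pj: *const i32 = &a[j];
    // SAFETY: both pointers are derived from the same slice, so they lie in
    // the same allocated object, and their byte distance is a whole multiple
    // of `size_of::<i32>()`.
    unsafe { pj.offset_from(pi) }
}
```

Because both pointers come from bounds-checked indexing into one slice, the safety conditions listed in the doc comment hold by construction here.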

/// Calculates the distance between two pointers. The returned value is in
/// units of T: the distance in bytes is divided by `mem::size_of::<T>()`.
///
/// If the address difference between the two pointers is not a multiple of
/// `mem::size_of::<T>()` then the result of the division is rounded towards
/// zero.
///
/// # Panics
///
/// This function panics if `T` is a zero-sized type.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(ptr_wrapping_offset_from)]
///
/// let a = [0; 5];
/// let ptr1: *const i32 = &a[1];
/// let ptr2: *const i32 = &a[3];
Contributor @strega-nil (Mar 23, 2018)

This will panic at runtime. You do this a few other places as well. You should probably use get_unchecked? There's no actual mention of UB if you index out of bounds, so, I don't think it's an issue.

Contributor @hanna-kruppe (Mar 23, 2018)

Huh? a is an array of length 5, so a[3] is not out of bounds.

(But if it was, get_unchecked would be UB because it uses offset, not wrapping_offset.)

Contributor

I, uh, may be too used to OCaml, sorry -.-

Contributor

Also, offset works with pointers which are one past the end.

Contributor @hanna-kruppe (Mar 23, 2018)

Oh yeah if a had length 2 then that would have been fine. Edit: Uh, yeah, see below

Contributor

Actually, I'm just being stupid, array indexing is zero-based lol. Sorry, I should not comment when I'm tired :P
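
The "one past the end" point raised in this thread can be checked with a short sketch on stable Rust. The function name `one_past_the_end_distance` is hypothetical; the sketch assumes the stabilized `add` and `offset_from` pointer methods rather than the unstable API in this diff.

```rust
/// Hypothetical helper: distance from the start of a slice to its
/// one-past-the-end pointer, in elements.
pub fn one_past_the_end_distance(a: &[i32]) -> isize {
    let start = a.as_ptr();
    // SAFETY: offsetting by `a.len()` yields the one-past-the-end pointer,
    // which may be computed (but never dereferenced).
    let end = unsafe { start.add(a.len()) };
    // SAFETY: both pointers are in bounds of, or one past the end of, the
    // same allocated object.
    unsafe { end.offset_from(start) }
}
```

For a five-element array, index 3 is in bounds (indexing is zero-based), and the one-past-the-end pointer sits five elements after the start.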

/// assert_eq!(ptr2.wrapping_offset_from(ptr1), 2);
/// assert_eq!(ptr1.wrapping_offset_from(ptr2), -2);
/// assert_eq!(ptr1.wrapping_offset(2), ptr2);
/// assert_eq!(ptr2.wrapping_offset(-2), ptr1);
///
/// let ptr1: *const i32 = 3 as _;
/// let ptr2: *const i32 = 13 as _;
/// assert_eq!(ptr2.wrapping_offset_from(ptr1), 2);
/// ```
#[unstable(feature = "ptr_wrapping_offset_from", issue = "41079")]
#[inline]
pub fn wrapping_offset_from(self, other: *const T) -> isize where T: Sized {
let pointee_size = mem::size_of::<T>();
assert!(0 < pointee_size && pointee_size <= isize::max_value() as usize);

let d = isize::wrapping_sub(self as _, other as _);
d.wrapping_div(pointee_size as _)
}
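
The rounding behavior described in the doc comment can be reproduced on stable Rust with a free function that mirrors the body above. This is a sketch under assumptions: `wrapping_distance` is a hypothetical name, and the pointers in the usage note are fabricated from integers purely to exercise the arithmetic; they are never dereferenced.

```rust
use std::mem;

/// Hypothetical free-function version of the method body in the diff.
/// Panics if `T` is zero-sized, like the original.
pub fn wrapping_distance<T>(this: *const T, other: *const T) -> isize {
    let pointee_size = mem::size_of::<T>();
    assert!(0 < pointee_size && pointee_size <= isize::MAX as usize);

    // Byte distance, allowed to wrap around the address space.
    let d = isize::wrapping_sub(this as usize as isize, other as usize as isize);
    // Integer division truncates toward zero for both signs.
    d.wrapping_div(pointee_size as isize)
}
```

With `i32` pointers at addresses 13 and 3, the byte distance is 10, which is not a multiple of 4; the result is `2` in one direction and `-2` in the other.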

/// Calculates the offset from a pointer (convenience for `.offset(count as isize)`).
///
/// `count` is in units of T; e.g. a `count` of 3 represents a pointer
@@ -1347,6 +1455,105 @@ impl<T: ?Sized> *mut T {
}
}

/// Calculates the distance between two pointers. The returned value is in
/// units of T: the distance in bytes is divided by `mem::size_of::<T>()`.
///
/// This function is the inverse of [`offset`].
///
/// [`offset`]: #method.offset-1
/// [`wrapping_offset_from`]: #method.wrapping_offset_from-1
///
/// # Safety
///
/// If any of the following conditions are violated, the result is Undefined
/// Behavior:
///
/// * Both the starting and other pointer must be either in bounds or one
/// byte past the end of the same allocated object.
///
/// * The distance between the pointers, **in bytes**, cannot overflow an `isize`.
///
/// * The distance between the pointers, in bytes, must be an exact multiple
/// of the size of `T` and `T` must not be a Zero-Sized Type ("ZST").
///
/// * The distance being in bounds cannot rely on "wrapping around" the address space.
///
/// The compiler and standard library generally try to ensure allocations
/// never reach a size where an offset is a concern. For instance, `Vec`
/// and `Box` ensure they never allocate more than `isize::MAX` bytes, so
/// `ptr_into_vec.offset_from(vec.as_ptr())` is always safe.
///
/// Most platforms fundamentally can't even construct such an allocation.
/// For instance, no known 64-bit platform can ever serve a request
/// for 2<sup>63</sup> bytes due to page-table limitations or splitting the address space.
/// However, some 32-bit and 16-bit platforms may successfully serve a request for
/// more than `isize::MAX` bytes with things like Physical Address
/// Extension. As such, memory acquired directly from allocators or memory
/// mapped files *may* be too large to handle with this function.
///
/// Consider using [`wrapping_offset_from`] instead if these constraints are
/// difficult to satisfy. The only advantage of this method is that it
/// enables more aggressive compiler optimizations.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(ptr_offset_from)]
///
/// let mut a = [0; 5];
/// let ptr1: *mut i32 = &mut a[1];
/// let ptr2: *mut i32 = &mut a[3];
/// unsafe {
/// assert_eq!(ptr2.offset_from(ptr1), 2);
/// assert_eq!(ptr1.offset_from(ptr2), -2);
/// assert_eq!(ptr1.offset(2), ptr2);
/// assert_eq!(ptr2.offset(-2), ptr1);
/// }
/// ```
#[unstable(feature = "ptr_offset_from", issue = "41079")]
#[inline]
pub unsafe fn offset_from(self, other: *const T) -> isize where T: Sized {
(self as *const T).offset_from(other)
}

/// Calculates the distance between two pointers. The returned value is in
/// units of T: the distance in bytes is divided by `mem::size_of::<T>()`.
///
/// If the address difference between the two pointers is not a multiple of
/// `mem::size_of::<T>()` then the result of the division is rounded towards
/// zero.
///
/// # Panics
///
/// This function panics if `T` is a zero-sized type.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(ptr_wrapping_offset_from)]
///
/// let mut a = [0; 5];
/// let ptr1: *mut i32 = &mut a[1];
/// let ptr2: *mut i32 = &mut a[3];
/// assert_eq!(ptr2.wrapping_offset_from(ptr1), 2);
/// assert_eq!(ptr1.wrapping_offset_from(ptr2), -2);
/// assert_eq!(ptr1.wrapping_offset(2), ptr2);
/// assert_eq!(ptr2.wrapping_offset(-2), ptr1);
///
/// let ptr1: *mut i32 = 3 as _;
/// let ptr2: *mut i32 = 13 as _;
/// assert_eq!(ptr2.wrapping_offset_from(ptr1), 2);
/// ```
#[unstable(feature = "ptr_wrapping_offset_from", issue = "41079")]
#[inline]
pub fn wrapping_offset_from(self, other: *const T) -> isize where T: Sized {
(self as *const T).wrapping_offset_from(other)
}
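
Both `*mut T` methods above simply cast to `*const T` and delegate. A short sketch on stable Rust shows that the two spellings agree; `mut_distances` is a hypothetical name, and the sketch uses the stabilized `offset_from` instead of the feature-gated one in this diff.

```rust
/// Hypothetical helper: computes the same distance twice, once through the
/// `*mut T` method and once through an explicit cast to `*const T`.
pub fn mut_distances() -> (isize, isize) {
    // The array must be declared `mut` to hand out `*mut` pointers into it.
    let mut a = [0i32; 5];
    let base: *mut i32 = a.as_mut_ptr();
    // SAFETY: offsets 1 and 3 are in bounds of the five-element array.
    let (ptr1, ptr2) = unsafe { (base.add(1), base.add(3)) };

    // SAFETY: both pointers point into `a` and are a whole number of
    // `i32`s apart.
    let via_mut = unsafe { ptr2.offset_from(ptr1) };
    let via_const = unsafe { (ptr2 as *const i32).offset_from(ptr1 as *const i32) };
    (via_mut, via_const)
}
```

Note that the second argument is a `*const T` in both cases, so a `*mut T` passed there is coerced implicitly.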

/// Computes the byte offset that needs to be applied in order to
/// make the pointer aligned to `align`.
/// If it is not possible to align the pointer, the implementation returns
5 changes: 5 additions & 0 deletions src/librustc_llvm/ffi.rs
@@ -935,6 +935,11 @@ extern "C" {
RHS: ValueRef,
Name: *const c_char)
-> ValueRef;
pub fn LLVMBuildExactUDiv(B: BuilderRef,
LHS: ValueRef,
RHS: ValueRef,
Name: *const c_char)
-> ValueRef;
pub fn LLVMBuildSDiv(B: BuilderRef,
LHS: ValueRef,
RHS: ValueRef,
7 changes: 7 additions & 0 deletions src/librustc_trans/builder.rs
@@ -344,6 +344,13 @@ impl<'a, 'tcx> Builder<'a, 'tcx> {
}
}

pub fn exactudiv(&self, lhs: ValueRef, rhs: ValueRef) -> ValueRef {
self.count_insn("exactudiv");
unsafe {
llvm::LLVMBuildExactUDiv(self.llbuilder, lhs, rhs, noname())
}
}

pub fn sdiv(&self, lhs: ValueRef, rhs: ValueRef) -> ValueRef {
self.count_insn("sdiv");
unsafe {
8 changes: 7 additions & 1 deletion src/librustc_trans/intrinsic.rs
@@ -289,7 +289,7 @@ pub fn trans_intrinsic_call<'a, 'tcx>(bx: &Builder<'a, 'tcx>,
"ctlz" | "ctlz_nonzero" | "cttz" | "cttz_nonzero" | "ctpop" | "bswap" |
"bitreverse" | "add_with_overflow" | "sub_with_overflow" |
"mul_with_overflow" | "overflowing_add" | "overflowing_sub" | "overflowing_mul" |
- "unchecked_div" | "unchecked_rem" | "unchecked_shl" | "unchecked_shr" => {
+ "unchecked_div" | "unchecked_rem" | "unchecked_shl" | "unchecked_shr" | "exact_div" => {
let ty = arg_tys[0];
match int_type_width_signed(ty, cx) {
Some((width, signed)) =>
@@ -343,6 +343,12 @@ pub fn trans_intrinsic_call<'a, 'tcx>(bx: &Builder<'a, 'tcx>,
"overflowing_add" => bx.add(args[0].immediate(), args[1].immediate()),
"overflowing_sub" => bx.sub(args[0].immediate(), args[1].immediate()),
"overflowing_mul" => bx.mul(args[0].immediate(), args[1].immediate()),
"exact_div" =>
if signed {
bx.exactsdiv(args[0].immediate(), args[1].immediate())
Member

exactsdiv seems to be missing?

Member Author

} else {
bx.exactudiv(args[0].immediate(), args[1].immediate())
},
"unchecked_div" =>
if signed {
bx.sdiv(args[0].immediate(), args[1].immediate())
2 changes: 1 addition & 1 deletion src/librustc_typeck/check/intrinsic.rs
@@ -283,7 +283,7 @@ pub fn check_intrinsic_type<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
(1, vec![param(0), param(0)],
tcx.intern_tup(&[param(0), tcx.types.bool])),

- "unchecked_div" | "unchecked_rem" =>
+ "unchecked_div" | "unchecked_rem" | "exact_div" =>
(1, vec![param(0), param(0)], param(0)),
"unchecked_shl" | "unchecked_shr" =>
(1, vec![param(0), param(0)], param(0)),