Add safe wrapper for atomic_singlethreadfence intrinsics #41092


Merged (7 commits) on Apr 9, 2017

Changes from 2 commits
1 change: 1 addition & 0 deletions src/doc/unstable-book/src/SUMMARY.md
@@ -37,6 +37,7 @@
- [collections](collections.md)
- [collections_range](collections-range.md)
- [command_envs](command-envs.md)
- [compiler_barriers](compiler-barriers.md)
- [compiler_builtins](compiler-builtins.md)
- [compiler_builtins_lib](compiler-builtins-lib.md)
- [concat_idents](concat-idents.md)
98 changes: 98 additions & 0 deletions src/doc/unstable-book/src/compiler-barriers.md
@@ -0,0 +1,98 @@
# `compiler_barriers`

The tracking issue for this feature is: [#41091]

[#41091]: https://github.com/rust-lang/rust/issues/41091

------------------------

The `compiler_barriers` feature exposes the `compiler_barrier` function
in `std::sync::atomic`. This function is conceptually similar to C++'s
`atomic_signal_fence`, whose effect can currently only be achieved in
nightly Rust using the `atomic_singlethreadfence_*` intrinsic functions
in `core`, or through the mostly equivalent literal assembly:

```rust
#![feature(asm)]
unsafe { asm!("" ::: "memory" : "volatile") };
```
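For comparison, here is a rough sketch of the intrinsic route, using the
`atomic_singlethreadfence` intrinsic this PR wraps (the `core_intrinsics`
feature gate is assumed):

```rust
#![feature(core_intrinsics)]
use std::intrinsics;

// The SeqCst variant; _acq, _rel, and _acqrel variants also exist.
unsafe { intrinsics::atomic_singlethreadfence() };
```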

A `compiler_barrier` restricts the kinds of memory re-ordering the
compiler is allowed to do. Specifically, depending on the given ordering
semantics, the compiler may be disallowed from moving reads or writes
from one side of the call to `compiler_barrier` to the other.

## Examples

The need to prevent re-ordering of reads and writes often arises when
working with low-level devices. Consider a piece of code that interacts
with an ethernet card with a set of internal registers that are accessed
through an address port register (`a: &mut usize`) and a data port
register (`d: &usize`). To read internal register 5, the following code
might then be used:

```rust
fn read_fifth(a: &mut usize, d: &usize) -> usize {
*a = 5;
*d
}
```

In this case, the compiler is free to re-order these two statements if
it thinks doing so might result in better performance, register use, or
anything else compilers care about. However, in doing so, it would break
the code: the read of the data port would happen before the address port
is set, so the function would return the value of some other device
register!
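
Conceptually, the compiler would then be free to act as if the function
had been written like this (a hypothetical illustration of the
problematic re-ordering, not code anyone would write):

```rust
fn read_fifth(a: &mut usize, d: &usize) -> usize {
    let value = *d; // the data port is read first...
    *a = 5;         // ...and only then is internal register 5 selected
    value           // so this is the value of some other register
}
```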

By inserting a compiler barrier, we can force the compiler to not
re-arrange these two statements, making the code function correctly
again:

```rust
#![feature(compiler_barriers)]
use std::sync::atomic;

fn read_fifth(a: &mut usize, d: &usize) -> usize {
*a = 5;
atomic::compiler_barrier(atomic::Ordering::SeqCst);
*d
}
```

Compiler barriers are also useful in code that implements low-level
synchronization primitives. Consider a structure with two different
atomic variables, with a dependency chain between them:

```rust
use std::sync::atomic;

fn thread1(x: &atomic::AtomicUsize, y: &atomic::AtomicUsize) {
x.store(1, atomic::Ordering::Release);
let v1 = y.load(atomic::Ordering::Acquire);
}
fn thread2(x: &atomic::AtomicUsize, y: &atomic::AtomicUsize) {
y.store(1, atomic::Ordering::Release);
let v2 = x.load(atomic::Ordering::Acquire);
}
```

This code will guarantee that `thread1` sees any writes to `y` made by
`thread2`, and that `thread2` sees any writes to `x`. Intuitively, one
might also expect that if `thread2` sees `v2 == 0`, `thread1` must see
`v1 == 1` (since `thread2`'s store happened before its `load`, and its
load did not see `thread1`'s store). However, the code as written does
*not* guarantee this, because the compiler is allowed to re-order the
store and load within each thread. To enforce this particular behavior,
A reviewer (Contributor) commented on this passage:
The hardware can still perform this re-ordering regardless of what the compiler does. A better example might be something like:

```rust
use std::sync::atomic::{AtomicUsize, Ordering, ATOMIC_USIZE_INIT};

static FLAG: AtomicUsize = ATOMIC_USIZE_INIT;

// thread1:
// write some stuff non-atomically
// need single-threaded fence here
FLAG.store(1, Ordering::Relaxed);

// thread1's signal handler:
if FLAG.load(Ordering::Relaxed) == 1 {
    // need single-threaded fence here
    // read stuff
}
```

The PR author replied:
Ah, you're right. In fact, even the example above is wrong for the same reason (you would need to use the `volatile_*` operations). Maybe I'm finding myself agreeing more with the original RFC's conclusion that this is not something people should generally need (at least until Rust starts supporting signal handlers)...
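
Filling in the fences marked in the reviewer's sketch, a hypothetical
complete version using the API from this PR might look as follows (the
names `writer`, `handler`, and `DATA` are illustrative):

```rust
#![feature(compiler_barriers)]
use std::sync::atomic::{compiler_barrier, AtomicUsize, Ordering, ATOMIC_USIZE_INIT};

static FLAG: AtomicUsize = ATOMIC_USIZE_INIT;
static mut DATA: usize = 0;

fn writer() {
    unsafe { DATA = 42 };                // write some stuff non-atomically
    compiler_barrier(Ordering::Release); // keep the write above the flag store
    FLAG.store(1, Ordering::Relaxed);
}

// Conceptually runs on the same thread as `writer`, e.g. in a signal handler.
fn handler() {
    if FLAG.load(Ordering::Relaxed) == 1 {
        compiler_barrier(Ordering::Acquire); // keep the read below the flag load
        let _data = unsafe { DATA };         // read stuff
    }
}
```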

a call to `compiler_barrier(Ordering::SeqCst)` would need to be inserted
between the `store` and `load` in both functions.
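
A sketch of the two functions with those barriers inserted (as the
review comments note, this constrains only the compiler, not the
hardware):

```rust
#![feature(compiler_barriers)]
use std::sync::atomic;

fn thread1(x: &atomic::AtomicUsize, y: &atomic::AtomicUsize) {
    x.store(1, atomic::Ordering::Release);
    // Forbid the compiler from moving the load ahead of the store.
    atomic::compiler_barrier(atomic::Ordering::SeqCst);
    let _v1 = y.load(atomic::Ordering::Acquire);
}

fn thread2(x: &atomic::AtomicUsize, y: &atomic::AtomicUsize) {
    y.store(1, atomic::Ordering::Release);
    atomic::compiler_barrier(atomic::Ordering::SeqCst);
    let _v2 = x.load(atomic::Ordering::Acquire);
}
```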

Compiler barriers with weaker re-ordering semantics (such as
`Ordering::Acquire`) can also be useful, but are beyond the scope of
this text. Curious readers are encouraged to read the Linux kernel's
discussion of [memory barriers][1], as well as C++ references on
[`std::memory_order`][2] and [`atomic_signal_fence`][3].

[1]: https://www.kernel.org/doc/Documentation/memory-barriers.txt
[2]: http://en.cppreference.com/w/cpp/atomic/memory_order
[3]: http://www.cplusplus.com/reference/atomic/atomic_signal_fence/
41 changes: 41 additions & 0 deletions src/libcore/sync/atomic.rs
@@ -1572,6 +1572,47 @@ pub fn fence(order: Ordering) {
}


/// A compiler memory barrier.
///
/// `compiler_barrier` does not emit any machine code, but prevents the compiler from re-ordering
/// memory operations across this point. Which reorderings are disallowed is dictated by the given
/// [`Ordering`]. Note that `compiler_barrier` does *not* introduce inter-thread memory
/// synchronization; for that, a [`fence`] is needed.
///
/// The re-orderings prevented by the different ordering semantics are:
///
/// - with [`SeqCst`], no re-ordering of reads and writes across this point is allowed.
/// - with [`Release`], preceding reads and writes cannot be moved past subsequent writes.
/// - with [`Acquire`], subsequent reads and writes cannot be moved ahead of preceding reads.
/// - with [`AcqRel`], both of the above rules are enforced.
///
/// # Panics
///
/// Panics if `order` is [`Relaxed`].
///
/// [`fence`]: fn.fence.html
/// [`Ordering`]: enum.Ordering.html
/// [`Acquire`]: enum.Ordering.html#variant.Acquire
/// [`SeqCst`]: enum.Ordering.html#variant.SeqCst
/// [`Release`]: enum.Ordering.html#variant.Release
/// [`AcqRel`]: enum.Ordering.html#variant.AcqRel
/// [`Relaxed`]: enum.Ordering.html#variant.Relaxed
#[inline]
#[unstable(feature = "compiler_barriers", issue = "41091")]
pub fn compiler_barrier(order: Ordering) {
unsafe {
match order {
Acquire => intrinsics::atomic_singlethreadfence_acq(),
Release => intrinsics::atomic_singlethreadfence_rel(),
AcqRel => intrinsics::atomic_singlethreadfence_acqrel(),
SeqCst => intrinsics::atomic_singlethreadfence(),
Relaxed => panic!("there is no such thing as a relaxed barrier"),
__Nonexhaustive => panic!("invalid memory ordering"),
}
}
}
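
// Illustrative usage, not part of this diff (assumes the caller enables
// the `compiler_barriers` feature gate):
//
//     use std::sync::atomic::{compiler_barrier, Ordering};
//
//     // Emits no machine code, but prevents the compiler from re-ordering
//     // memory operations across this point.
//     compiler_barrier(Ordering::SeqCst);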


#[cfg(target_has_atomic = "8")]
#[stable(feature = "atomic_debug", since = "1.3.0")]
impl fmt::Debug for AtomicBool {
Expand Down