std: Return Result from RWLock/Mutex methods #19661

alexcrichton · 2014-12-09T15:39:27Z

All of the current std::sync primitives have poisoning enable which means that
when a task fails inside of a write-access lock then all future attempts to
acquire the lock will fail. This strategy ensures that stale data whose
invariants are possibly not upheld are never viewed by other tasks to help
propagate unexpected panics (bugs in a program) among tasks.

Currently there is no way to test whether a mutex or rwlock is poisoned. One
method would be to duplicate all the methods with a sister foo_catch function,
for example. This pattern is, however, against our error guidelines.
As a result, this commit exposes the fact that a task has failed internally
through the return value of a Result.

All methods now return a LockResult<T> or a TryLockResult<T> which
communicates whether the lock was poisoned or not. In a LockResult, both the
Ok and Err variants contains the MutexGuard<T> that is being returned in
order to allow access to the data if poisoning is not desired. This also means
that the lock is always held upon returning from .lock().

A new type, PoisonError, was added with one method into_guard which can
consume the assertion that a lock is poisoned to gain access to the underlying
data.

This is a breaking change because the signatures of these methods have changed,
often incompatible ways. One major difference is that the wait methods on a
condition variable now consume the guard and return it in as a LockResult to
indicate whether the lock was poisoned while waiting. Most code can be updated
by calling .unwrap() on the return value of .lock().

[breaking-change]

aturon · 2014-12-14T05:19:50Z

cc @wycats @nikomatsakis

We don't necessarily have to add this now, but it seems like you might want some additional flexibility:

The ability to recover use of a lock after it has been poisoned. For example, if you overwrite the data protected by the lock with a new value.
Possibly, the ability to read from a poisoned lock. This could be done by returning the previously-held data as part of the Err component -- basically alerting you to the fact that the data is suspect due to another thread's panic, but not preventing you from accessing it outright. Of course, you can work around this by protecting something like () and having the real data stored in an UnsafeCell, but that's not very ergonomic.

What do you think about these cases?

As an aside, I still feel slightly uneasy about the division of places where we try to help with exception safety and places where we don't. For example, a function down the stack could panic while using an &mut reference to data that is owned higher up the stack, and the latter's destructor will see it in whatever state it was left in. Granted, in that case you're talking purely about destructors during unwinding, whereas with mutexes the data may be seen at an arbitrary point in time. But it would be good to try to lay out the distinction very clearly, in docs or a blog post or something.

aturon · 2014-12-14T05:21:09Z

(Otherwise, the PR looks like a fine implementation of what was discussed at the workweek.)

alexcrichton · 2014-12-14T18:02:59Z

Thinking more on this, I may be personally leaning more towards removing poisoning altogether or moving it to an opt-in primitive. For example, I can think of a few drawbacks:

In returning Result, I forgot that you can ignore the result of Condvar::wait which means you can read stale data even though it's been panicked over. This seems like a blocker for this PR.
We could plausibly add PoisonError::replace, but this causes a bit of a schism in the RWLock api. For example the error type on read() would have to be different than write() (you can only replace on the write lock). It also means that the PoisonError itself would have to carry the payload of the lock guard, which may mean that it's difficult to use with try! as it means that the type can't implement Error and you'd have to get rid of the borrowed reference quickly to propagate upwards.
I, too, am a bit worried that we have poisoning on a Mutex but not a RefCell, for example. We've said in the past that intra-task inspection of stale state is semi-ok (why RefCell doesn't poison) while inter-task inspection is bad (which is why Mutex panics). All these issues about panicking and recovering from this with a Mutex and RWLock seem too tricky to stabilize on, however.

sfackler · 2014-12-14T20:51:26Z

The more I think about it, the more removing poisoning makes sense to me.

thestinger · 2014-12-15T01:11:14Z

This is making everyone pay for the fact that some people want to use exception handling. Harming the API for everyone in order to support this niche feature doesn't make sense.

thestinger · 2014-12-15T01:12:19Z

@sfackler: Poisoning could be removed, but that's the end of restricted exception handling. It would mean catching an exception in the same task would have to be safe too. I think that's saner than pretending Rust doesn't have exception handling.

alexcrichton · 2014-12-22T20:45:12Z

re-r? @aturon

I've updated the PR description/commit message to reflect the tweaked state of affairs from our discussion.

aturon · 2014-12-29T17:08:48Z

src/libstd/sync/poison.rs

-    panicking: bool,
+#[allow(missing_copy_implementations)]
+pub struct Guard {
+    panicking: uint,


Why uint?

Oops, I think I forgot to remove this after playing around with it. Whatever reason I had to put this here originally is no longer valid!

All of the current std::sync primitives have poisoning enable which means that when a task fails inside of a write-access lock then all future attempts to acquire the lock will fail. This strategy ensures that stale data whose invariants are possibly not upheld are never viewed by other tasks to help propagate unexpected panics (bugs in a program) among tasks. Currently there is no way to test whether a mutex or rwlock is poisoned. One method would be to duplicate all the methods with a sister foo_catch function, for example. This pattern is, however, against our [error guidelines][errors]. As a result, this commit exposes the fact that a task has failed internally through the return value of a `Result`. [errors]: https://github.com/rust-lang/rfcs/blob/master/text/0236-error-conventions.md#do-not-provide-both-result-and-fail-variants All methods now return a `LockResult<T>` or a `TryLockResult<T>` which communicates whether the lock was poisoned or not. In a `LockResult`, both the `Ok` and `Err` variants contains the `MutexGuard<T>` that is being returned in order to allow access to the data if poisoning is not desired. This also means that the lock is *always* held upon returning from `.lock()`. A new type, `PoisonError`, was added with one method `into_guard` which can consume the assertion that a lock is poisoned to gain access to the underlying data. This is a breaking change because the signatures of these methods have changed, often incompatible ways. One major difference is that the `wait` methods on a condition variable now consume the guard and return it in as a `LockResult` to indicate whether the lock was poisoned while waiting. Most code can be updated by calling `.unwrap()` on the return value of `.lock()`. [breaking-change]

aturon · 2014-12-29T17:19:28Z

r=me after the nit. I think this actually came out really nicely!

reem · 2014-12-29T17:31:50Z

I'm very against this change - in <99% of cases this just replaces .lock() with .lock().unwrap() which I think is a needless ergonomic regression. I think we would be much better served by a family of try_ methods like in RefCell.

alexcrichton · 2014-12-29T17:58:24Z

Currently we're not planning on stabilizing the try_* variants of methods on RefCell due to them being against our current conventions of not providing both variants. For RefCell I believe we plan on adding some is_borrowed or related methods to query the state, and then an action can be taken.

Unfortunately the same logic cannot necessarily be applied to synchronization primitives because you're unable to perform an atomic "is this ready to go, ok let's go" operation. As a result, we must provide functionality as one method, and our conventions dictate that the method must return a Result (the more general of the two).

The Mutex and RefCell types are pretty different in their use cases despite their similarities. For example a RefCell was chosen to panic by default because it's believed that its usage is quite common, and .borrow().unwrap() would simply be untenable. When contrasted with Mutex, it's actually fairly rare to use a Mutex. Additionally, there's an intrinsic pressure to reduce the number of calls to lock() already (for less serialization), but this pressure does not really exist for RefCell. As a result, the usage of Mutex is actually somewhat uncommon (depending on your application of course).

Consequently, this PR chooses to have mutexes return results (and later have channels return results]) due to the relatively uncommon usage of the primitives, and the richness of information that the error type can convey.

If we were to ignore our convention in this respect, and still provide both variants, this also generates a very real naming problem. For example channels right now use foo and foo_opt as the names of the pair of methods, but this is seen as inadequate (_opt has nothing to do with options). The try_ prefix cannot be used because for synchronization primitives it has a different meaning than RefCell in that it generally means "this does not block." By sticking to our conventions we sidestep this question altogether and create what's actually a pretty uniform interface across all primitives.

We definitely do not want to create "needless ergonomic regressions" when stabilizing primitives, we try to always focus on enhancing functionality and having primitives adhere to existing conventions. This change is directly aligning with our error conventions, and it is clearly indicating the semantics of locks which may have been somewhat hidden before.

reem · 2014-12-29T18:09:18Z

I'm sympathetic to the naming issue as well as the racy nature of "is ready" checks around synchronized structures - I just don't particularly like having operations where in the majority of cases the "right" thing to do is unwrap the result.

This also reduces the quality of the logged error from something about mutex's panicking to "unwrap on None", but that can be mostly mitigated by setting RUST_BACKTRACE.

alexcrichton · 2014-12-29T18:12:35Z

Note that special care is taken here with the messages generated by .unwrap() on the return value, they should basically be exactly the same as before.

This commit performs a stabilization pass over the sync::{mutex, rwlock, condvar} modules, marking the following items as stable: * Mutex * Mutex::new * Mutex::lock * Mutex::try_lock * MutexGuard * RWLock * RWLock::new * RWLock::read * RWLock::try_read * RWLock::write * RWLock::try_write * RWLockReadGuard * RWLockWriteGuard * Condvar * Condvar::new * Condvar::wait * Condvar::notify_one * Condvar::notify_all * PoisonError * TryLockError * TryLockError::Poisoned * TryLockError::WouldBlock * LockResult * TryLockResult The following items remain unstable to explore future possibilities of unifying the static/non-static variants of the types: * StaticMutex * StaticMutex::new * StaticMutex::lock * StaticMutex::try_lock * StaticMutex::desroy * StaticRWLock * StaticRWLock::new * StaticRWLock::read * StaticRWLock::try_read * StaticRWLock::write * StaticRWLock::try_write * StaticRWLock::destroy The following items were removed in favor of `Guard<'static, ()>` instead. * StaticMutexGuard * StaticRWLockReadGuard * StaticRWLockWriteGuard

alexcrichton · 2014-12-29T22:43:28Z

re-r? @aturon the last commit? It places the #[stable] tag on most APIs touched here.

All of the current std::sync primitives have poisoning enable which means that when a task fails inside of a write-access lock then all future attempts to acquire the lock will fail. This strategy ensures that stale data whose invariants are possibly not upheld are never viewed by other tasks to help propagate unexpected panics (bugs in a program) among tasks. Currently there is no way to test whether a mutex or rwlock is poisoned. One method would be to duplicate all the methods with a sister foo_catch function, for example. This pattern is, however, against our [error guidelines][errors]. As a result, this commit exposes the fact that a task has failed internally through the return value of a `Result`. [errors]: https://github.com/rust-lang/rfcs/blob/master/text/0236-error-conventions.md#do-not-provide-both-result-and-fail-variants All methods now return a `LockResult<T>` or a `TryLockResult<T>` which communicates whether the lock was poisoned or not. In a `LockResult`, both the `Ok` and `Err` variants contains the `MutexGuard<T>` that is being returned in order to allow access to the data if poisoning is not desired. This also means that the lock is *always* held upon returning from `.lock()`. A new type, `PoisonError`, was added with one method `into_guard` which can consume the assertion that a lock is poisoned to gain access to the underlying data. This is a breaking change because the signatures of these methods have changed, often incompatible ways. One major difference is that the `wait` methods on a condition variable now consume the guard and return it in as a `LockResult` to indicate whether the lock was poisoned while waiting. Most code can be updated by calling `.unwrap()` on the return value of `.lock()`. [breaking-change]

Main thing is unwrapping RWLock return values -- see: rust-lang/rust#19661 The other stuff is moving off deprecated iterator API to avoid warnings. TODO: we may need to pull kiss3d_recording out to a separate cargo project, since it appears Cargo can only produce one lib crate per project.

internal: Upgrade to the next Salsa

alexcrichton force-pushed the mutex-result branch from 3f72cff to 762318b Compare December 11, 2014 18:00

alexcrichton force-pushed the mutex-result branch 3 times, most recently from 427c2bc to c16ef0b Compare December 22, 2014 20:44

rust-highfive assigned aturon Dec 22, 2014

alexcrichton force-pushed the mutex-result branch 2 times, most recently from 0070659 to dcc862c Compare December 29, 2014 17:05

aturon reviewed Dec 29, 2014
View reviewed changes

alexcrichton force-pushed the mutex-result branch from dcc862c to 76e5ed6 Compare December 29, 2014 17:19

aturon mentioned this pull request Dec 29, 2014

Stabilization metabug: 1.0-alpha #19260

Closed

bors merged commit 35e63e3 into rust-lang:master Dec 30, 2014

blevin mentioned this pull request Jan 3, 2015

Update to rustc 0.13.0-nightly (39d740266 2015-01-01 15:51:08 +0000) sebcrozet/kiss3d#43

Merged

alexcrichton deleted the mutex-result branch January 7, 2015 05:15

lnicola pushed a commit to lnicola/rust that referenced this pull request Apr 28, 2025

Merge pull request rust-lang#19661 from ChayimFriedman2/salsa-v20

7b249d2

internal: Upgrade to the next Salsa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

std: Return Result from RWLock/Mutex methods #19661

std: Return Result from RWLock/Mutex methods #19661

alexcrichton commented Dec 9, 2014

aturon commented Dec 14, 2014

aturon commented Dec 14, 2014

alexcrichton commented Dec 14, 2014

sfackler commented Dec 14, 2014

thestinger commented Dec 15, 2014

thestinger commented Dec 15, 2014

alexcrichton commented Dec 22, 2014

aturon Dec 29, 2014

alexcrichton Dec 29, 2014

aturon commented Dec 29, 2014

reem commented Dec 29, 2014

alexcrichton commented Dec 29, 2014

reem commented Dec 29, 2014

alexcrichton commented Dec 29, 2014

alexcrichton commented Dec 29, 2014

std: Return Result from RWLock/Mutex methods #19661

std: Return Result from RWLock/Mutex methods #19661

Conversation

alexcrichton commented Dec 9, 2014

aturon commented Dec 14, 2014

aturon commented Dec 14, 2014

alexcrichton commented Dec 14, 2014

sfackler commented Dec 14, 2014

thestinger commented Dec 15, 2014

thestinger commented Dec 15, 2014

alexcrichton commented Dec 22, 2014

aturon Dec 29, 2014

Choose a reason for hiding this comment

alexcrichton Dec 29, 2014

Choose a reason for hiding this comment

aturon commented Dec 29, 2014

reem commented Dec 29, 2014

alexcrichton commented Dec 29, 2014

reem commented Dec 29, 2014

alexcrichton commented Dec 29, 2014

alexcrichton commented Dec 29, 2014