Commit 89a0a71

Merge pull request #3 from aturon/stabilize-catch-panic
Edits for clarity
2 parents c9306cd + 8c397d0 commit 89a0a71

1 file changed

+96
-74
lines changed

text/0000-stabilize-catch-panic.md

Lines changed: 96 additions & 74 deletions
@@ -10,14 +10,15 @@ bounds from the closure parameter.
1010

1111
# Motivation
1212

13-
In today's stable Rust it's not currently possible to catch a panic. There are a
14-
number of situations, however, where catching a panic is either required for
15-
correctness or necessary for building a useful abstraction:
13+
In today's stable Rust it's not possible to catch a panic on the thread that
14+
caused it. There are a number of situations, however, where catching a panic is
15+
either required for correctness or necessary for building a useful abstraction:
1616

1717
* It is currently defined as undefined behavior to have a Rust program panic
1818
across an FFI boundary. For example if C calls into Rust and Rust panics, then
1919
this is undefined behavior. Being able to catch a panic will allow writing
20-
robust C apis in Rust.
20+
C apis in Rust that do not risk aborting the process they are embedded into.
21+
2122
* Abstractions like thread pools want to catch the panics of tasks being run
2223
instead of having the thread torn down (and having to spawn a new thread).
2324

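Reviewer sketch for the FFI bullet above, hedged: `std::thread::catch_panic` as named in this RFC is not available on current stable toolchains, so the example assumes its later stabilized form, `std::panic::catch_unwind`. The names `parse_digit` and `parse_digit_ffi` and the `-1` error code are hypothetical and chosen only for illustration.

```rust
use std::panic;

// Hypothetical routine that panics on input violating its contract.
fn parse_digit(byte: u8) -> i32 {
    if !byte.is_ascii_digit() {
        panic!("not a digit: {}", byte);
    }
    (byte - b'0') as i32
}

// Hypothetical C-callable entry point. The panic is caught before it can
// unwind into the C caller and is turned into an error code instead.
// (A real export would also need a no-mangle attribute; it is omitted here
// because its spelling differs between Rust editions.)
pub extern "C" fn parse_digit_ffi(byte: u8) -> i32 {
    match panic::catch_unwind(|| parse_digit(byte)) {
        Ok(value) => value,
        // Assumption: -1 is the agreed error value for this made-up API.
        Err(_) => -1,
    }
}
```

Calling `parse_digit_ffi(b'7')` yields `7`, while `parse_digit_ffi(b'x')` yields `-1` rather than unwinding across the boundary. The default panic hook still prints the panic message to stderr.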
@@ -32,9 +33,9 @@ fn catch_panic<F, R>(f: F) -> thread::Result<R>
3233
This function will run the closure `f` and if it panics return `Err(Box<Any>)`.
3334
If the closure doesn't panic it will return `Ok(val)` where `val` is the
3435
returned value of the closure. The closure, however, is restricted to only close
35-
over `Send` and `'static` data. This can be overly restrictive at times and it's
36-
also not clear what purpose the bounds are serving today, hence the desire to
37-
remove these bounds.
36+
over `Send` and `'static` data. These bounds can be overly restrictive, and due
37+
to thread-local storage they can be subverted, making it unclear what purpose
38+
they serve. This RFC proposes to remove the bounds as well.
3839

3940
Historically Rust has purposefully avoided the foray into the situation of
4041
catching panics, largely because of a problem typically referred to as
@@ -45,10 +46,11 @@ Rust.
4546
# Background: What is exception safety?
4647

4748
Languages with exceptions have the property that a function can "return" early
48-
if an exception is thrown. This is normally not something that needs to be
49-
worried about, but this form of control flow can often be surprising and
50-
unexpected. If an exception ends up causing unexpected behavior or a bug then
51-
code is said to not be **exception safe**.
49+
if an exception is thrown. While exceptions aren't too hard to reason about when
50+
thrown explicitly, they can be problematic when they are thrown by code being
51+
called -- especially when that code isn't known in advance. Code is **exception
52+
safe** if it works correctly even when the functions it calls into throw
53+
exceptions.
5254

5355
The idea of throwing an exception causing bugs may sound a bit alien, so it's
5456
helpful to drill down into exactly why this is the case. Bugs related to
@@ -58,11 +60,11 @@ exception safety are comprised of two critical components:
5860
2. This broken invariant is then later observed.
5961

6062
Exceptional control flow often exacerbates this first component of breaking
61-
invariants. For example many data structures often have a number of invariants
62-
that are dynamically upheld for correctness, and the type's routines can
63-
temporarily break these invariants to be fixed up before the function returns.
64-
If, however, an exception is thrown in this interim period the broken invariant
65-
could be accidentally exposed.
63+
invariants. For example many data structures have a number of invariants that
64+
are dynamically upheld for correctness, and the type's routines can temporarily
65+
break these invariants to be fixed up before the function returns. If, however,
66+
an exception is thrown in this interim period the broken invariant could be
67+
accidentally exposed.
6668

6769
The second component, observing a broken invariant, can sometimes be difficult
6870
in the face of exceptions, but languages often have constructs to enable these
@@ -81,7 +83,7 @@ example:
8183
known to not throw an exception.
8284
* Local "cleanup" handlers can be placed on the stack to restore invariants
8385
whenever a function returns, either normally or exceptionally. This can be
84-
done through finally blocks in some languages for via destructors in others.
86+
done through finally blocks in some languages or via destructors in others.
8587
* Exceptions can be caught locally to perform cleanup before possibly re-raising
8688
the exception.
8789

@@ -101,7 +103,7 @@ Up to now we've been talking about exceptions and exception safety, but from a
101103
Rust perspective we can just replace this with panics and panic safety. Panics
102104
in Rust are currently implemented essentially as a C++ exception under the hood.
103105
As a result, **exception safety is something that needs to be handled in Rust
104-
code**.
106+
code today**.
105107

106108
One of the primary examples where panics need to be handled in Rust is unsafe
107109
code. Let's take a look at an example where this matters:
@@ -136,12 +138,15 @@ Rust's design:
136138
* Rust doesn't expose uninitialized memory
137139
* Panics cannot be caught in a thread
138140
* Across threads data is poisoned by default on panics
139-
* Idiomatic Rust must opt in to extra amounts of sharing data across boundaries
141+
* Idiomatic Rust must opt in to extra sharing across boundaries (e.g. `RefCell`)
142+
* Destructors are relatively rare and uninteresting in safe code
140143

141-
With these mitigation tactics, it ends up being the case that **safe Rust code
142-
can mostly ignore exception safety concerns**. That being said, it does not mean
143-
that safe Rust code can *always* ignore exception safety issues. There are a
144-
number of methods to subvert the mitigation strategies listed above:
144+
These mitigations all address the *second* aspect of exception unsafety:
145+
observation of broken invariants. With the tactics in place, it ends up being
146+
the case that **safe Rust code can largely ignore exception safety
147+
concerns**. That being said, it does not mean that safe Rust code can *always*
148+
ignore exception safety issues. There are a number of methods to subvert the
149+
mitigation strategies listed above:
145150

146151
1. When poisoning data across threads, antidotes are available to access
147152
poisoned data. Namely the [`PoisonError` type][pet] allows safe access to the
@@ -156,6 +161,11 @@ number of methods to subvert the mitigation strategies listed above:
156161

157162
[pet]: http://doc.rust-lang.org/std/sync/struct.PoisonError.html
158163

164+
But all of these "subversions" fall outside the realm of normal, idiomatic, safe
165+
Rust code, and so they all serve as a "heads up" that panic safety might be an
166+
issue. Thus, in practice, Rust programmers worry about exception safety far less
167+
than in languages with full-blown exceptions.
168+
159169
Despite these methods to subvert the mitigations placed by default in Rust, a
160170
key part of exception safety in Rust is that **safe code can never lead to
161171
memory unsafety**, regardless of whether it panics or not. Memory unsafety
@@ -166,7 +176,7 @@ this RFC.
166176

167177
# Detailed design
168178

169-
At its heard, the change this RFC is proposing is to stabilize
179+
At its heart, the change this RFC is proposing is to stabilize
170180
`std::thread::catch_panic` after removing the `Send` and `'static` bounds from
171181
the closure parameter, modifying the signature to be:
172182

@@ -177,50 +187,39 @@ fn catch_panic<F: FnOnce() -> R, R>(f: F) -> thread::Result<R>
177187
More generally, however, this RFC also claims that this stable function does
178188
not radically alter Rust's exception safety story (explained above).
179189

180-
### Exception safety mitigation
190+
## Will Rust have exceptions?
181191

182-
A mitigation strategy for exception safety listed above is that a panic cannot
183-
be caught within a thread, and this change would move that bullet to the list of
184-
"methods to subvert the mitigation strategies" instead. Catching a panic (and
185-
not having `'static` on the bounds list) makes it easier to observe broken
186-
invariants of data structures shared across the `catch_panic` boundary, which
187-
can possibly increase the likelihood of exception safety issues arising.
192+
In a technical sense this RFC is not "adding exceptions to Rust" as they already
193+
exist in the form of panics. What this RFC is adding, however, is a construct
194+
via which to catch these exceptions within a thread, bringing the standard
195+
library closer to the exception support in other languages.
188196

189-
One of the key reasons Rust doesn't provide an exhaustive set of mitigation
190-
strategies is that the design of the language and standard library lead to
191-
idiomatic code not having to worry about exception safety. The use cases for
192-
`catch_panic` are relatively niche, and it is not expected for `catch_panic` to
193-
overnight become the idiomatic method of handling errors in Rust.
197+
Catching a panic (and especially not having `'static` on the bounds list) makes
198+
it easier to observe broken invariants of data structures shared across the
199+
`catch_panic` boundary, which can possibly increase the likelihood of exception
200+
safety issues arising.
194201

195-
Essentially, the addition of `catch_panic`:
202+
The risk of this step is that catching panics becomes an idiomatic way to deal
203+
with error-handling, thereby making exception safety much more of a headache
204+
than it is today. Whereas we intend for the `catch_panic` function to only be
205+
used where it's absolutely necessary, e.g. for FFI boundaries. How do we ensure
206+
that `catch_panic` isn't overused?
196207

197-
* Does not mean that *only now* does Rust code need to consider exception
198-
safety. This is something that already must be handled today.
199-
* Does not mean that safe code everywhere must start worrying about exception
200-
safety. This function is not the primary method to signal errors in Rust
201-
(discussed later) and only adds a minor bullet to the list of situations that
202-
safe Rust already needs to worry about exception safety in.
208+
There are two key reasons we don't expect `catch_panic` to become idiomatic:
203209

204-
### Will Rust have exceptions?
210+
1. We have already established very strong conventions around error handling,
211+
and in particular around the use of panic and `Result`, and stabilized usage
212+
around them in the standard library. There is little chance these conventions
213+
would change overnight.
205214

206-
In a technical sense this RFC is not "adding exceptions to Rust" as they
207-
already exist in the form of panics. What this RFC is adding, however, is a
208-
construct via which to catch these exceptions, bringing the standard library
209-
closer to the exception support in other languages. Idiomatic usage of Rust,
210-
however, will continue to follow the guidelines listed below for using a Result
211-
vs using a panic (which also do not need to change to account for this RC).
215+
2. We have long intended to provide an option to treat every use of `panic!` as
216+
an abort, which is motivated by portability, compile time, binary size, and a
217+
number of other factors. Assuming we take this step, it would be extremely
218+
unwise for a library to signal expected errors via panics and rely on
219+
consumers using `catch_panic` to handle them.
212220

213-
It's likely that the `catch_panic` function will only be used where it's
214-
absolutely necessary, like FFI boundaries, instead of a general-purpose error
215-
handling mechanism in all code.
216-
217-
# Drawbacks
218-
219-
A drawback of this RFC is that it can water down Rust's error handling story.
220-
With the addition of a "catch" construct for exceptions, it may be unclear to
221-
library authors whether to use panics or `Result` for their error types. There
222-
are fairly clear guidelines and conventions about using a `Result` vs a `panic`
223-
today, however, and they're summarized below for completeness.
221+
For reference, here's a summary of the conventions around `Result` and `panic`,
222+
which still hold good after this RFC:
224223

225224
### Result vs Panic
226225

@@ -229,16 +228,18 @@ today:
229228

230229
* `Results` represent errors/edge-cases that the author of the library knew
231230
about, and expects the consumer of the library to handle.
231+
232232
* `panic`s represent errors that the author of the library did not expect to
233-
occur, and therefore does not expect the consumer to handle in any particular
234-
way.
233+
occur, such as a contract violation, and therefore does not expect the
234+
consumer to handle in any particular way.
235235

236236
Another way to put this division is that:
237237

238238
* `Result`s represent errors that carry additional contextual information. This
239239
information allows them to be handled by the caller of the function producing
240240
the error, modified with additional contextual information, and eventually
241241
converted into an error message fit for a top-level program.
242+
242243
* `panic`s represent errors that carry no contextual information (except,
243244
perhaps, debug information). Because they represented an unexpected error,
244245
they cannot be easily handled by the caller of the function or presented to
@@ -251,14 +252,13 @@ and writing down `Result` + `try!` is not always the most ergonomic.
251252

252253
The pros and cons of `panic` are essentially the opposite of `Result`, being
253254
easy to use (nothing to write down other than the panic) but difficult to
254-
determine when a panic can happen or handle it in a custom fashion.
255-
256-
### Result? Or panic?
255+
determine when a panic can happen or handle it in a custom fashion, even with
256+
`catch_panic`.
257257

258258
These divisions justify the use of `panic`s for things like out-of-bounds
259259
indexing: such an error represents a programming mistake that (1) the author of
260-
the library was not aware of, by definition, and (2) cannot be easily handled by
261-
the caller.
260+
the library was not aware of, by definition, and (2) cannot be meaningfully
261+
handled by the caller.
262262

263263
In terms of heuristics for use, `panic`s should rarely if ever be used to report
264264
routine errors for example through communication with the system or through IO.
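Reviewer sketch of the `Result` vs `panic` convention summarized in this hunk. The functions `parse_port` and `nth_port` are hypothetical; the example only assumes standard-library behavior (`str::parse` returning a `Result`, slice indexing panicking when out of bounds).

```rust
use std::num::ParseIntError;

// Expected failure (malformed user input): reported through `Result`,
// so the caller can inspect it, add context, or recover.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

// Contract violation (an out-of-range index is a bug in the caller):
// reported by panicking, since the caller cannot meaningfully handle it.
fn nth_port(ports: &[u16], index: usize) -> u16 {
    ports[index]
}
```

Here `parse_port("http")` returns an `Err` that the caller is expected to handle, whereas `nth_port(&[80], 5)` panics because the call itself is a programming mistake.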
@@ -270,11 +270,28 @@ could report the error in terms a they can understand. While the error is
270270
rare, **when it happens it is not a programmer error**. In short, panics are
271271
roughly analogous to an opaque "an unexpected error has occurred" message.
272272

273-
Another key reason to choose `Result` over a panic is that the compiler is
274-
likely to soon grow an option to map a panic to an abort. This is motivated for
275-
portability, compile time, binary size, and a number of other factors, but it
276-
fundamentally means that a library which signals errors via panics (and relies
277-
on consumers using `catch_panic`) will not be usable in this context.
273+
Stabilizing `catch_panic` does little to change the tradeoffs around `Result`
274+
and `panic` that led to these conventions.
275+
276+
## Why remove the bounds?
277+
278+
The main reason to remove the `'static` and `Send` bounds on `catch_panic` is
279+
that they don't actually enforce anything. Using thread-local storage, it's
280+
possible to share mutable data across a call to `catch_panic` even if that data
281+
isn't `'static` or `Send`. And allowing borrowed data, in particular, is helpful
282+
for thread pools that need to execute closures with borrowed data within them;
283+
essentially, the worker threads are executing multiple "semantic threads" over
284+
their lifetime, and the `catch_panic` boundary represents the end of these
285+
"semantic threads".
286+
287+
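Reviewer sketch of the borrowed-data case described in the paragraph above, hedged: it uses `std::panic::catch_unwind`, the later stabilized form of this function, which replaced the `Send + 'static` bounds with an `UnwindSafe` bound. That is why the closure is wrapped in `AssertUnwindSafe`. The helper `run_job` and the log strings are hypothetical.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical helper: runs one job against state borrowed from the caller
// and reports whether it completed. A panic inside the job is caught, so the
// calling thread (for example a pool worker) survives to run the next job.
fn run_job(log: &mut Vec<String>, fail: bool) -> bool {
    // `log` is neither `'static` nor moved into the closure; it is only borrowed.
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        log.push("started".to_string());
        if fail {
            panic!("job failed");
        }
        log.push("finished".to_string());
    }));
    result.is_ok()
}
```

After a failing job the borrowed `log` holds a `"started"` entry with no matching `"finished"`. This is a small instance of the concern raised earlier in the RFC: state shared across the catch boundary can be observed in a half-updated form.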
# Drawbacks
288+
289+
A drawback of this RFC is that it can water down Rust's error handling story.
290+
With the addition of a "catch" construct for exceptions, it may be unclear to
291+
library authors whether to use panics or `Result` for their error types. As we
292+
discussed above, however, Rust's design around error handling has always had to
293+
deal with these two strategies, and our conventions don't materially change by
294+
stabilizing `catch_panic`.
278295

279296
# Alternatives
280297

@@ -306,4 +323,9 @@ panics by default (like poisoning) with an ability to opt out (like
306323

307324
# Unresolved questions
308325

309-
None currently.
326+
- Is it worth keeping the `'static` and `Send` bounds as a mitigation measure in
327+
practice, even if they aren't enforceable in theory? That would require thread
328+
pools to use unsafe code, but that could be acceptable.
329+
330+
- Should `catch_panic` be stabilized within `std::thread` where it lives today,
331+
or somewhere else?
