This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Reflecting on implicit exceptions #149

Open
@RossTate

Preface: These are ideas I came up with a couple of months ago but kept to myself in order to avoid stirring up more controversy. But I'd like to at least express the ideas. So do not take this "Issue" as a "request for change"; if GitHub Discussions were enabled, this would be posted there instead. Also, I recognize that the following applies a lot of hindsight.

From what I can tell, now that I understand the community's needs and desires better, a simple change to the type system (rather than just the instruction design) could have boiled exception handling down to two instructions (neither of which is a block):

  • throw $exn: throws an $exn exception with the values on the value stack as its payload
  • catching $exn $label: precedes a call-like instruction and modifies it to branch to $label should an $exn get thrown by the call (dumping the contents of the payload onto the value stack); any other thrown exception gets propagated to the caller
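
To make the intended semantics of the two instructions concrete, here is a minimal Python model. It is only a sketch: `WasmThrow`, `throw`, `catching`, the `$div_by_zero` event, and the example functions are all hypothetical names invented for illustration, and a Python callback stands in for the branch to $label.

```python
class WasmThrow(Exception):
    """Hypothetical stand-in for a thrown exception event with its payload."""

    def __init__(self, tag, payload):
        super().__init__(tag)
        self.tag = tag
        self.payload = tuple(payload)


def throw(tag, *payload):
    """Model of `throw $exn`: the operands become the payload."""
    raise WasmThrow(tag, payload)


def catching(tag, on_catch, callee, *args):
    """Model of `catching $exn $label` applied to a single call.

    If the call throws `tag`, control "branches" to `on_catch` with the
    payload as its arguments; any other exception propagates to the caller.
    """
    try:
        return callee(*args)
    except WasmThrow as exn:
        if exn.tag != tag:
            raise
        return on_catch(*exn.payload)


def checked_div(a, b):
    if b == 0:
        throw("$div_by_zero", a)
    return a // b


def safe_div(a, b):
    # The lambda plays the role of the code at $label.
    return catching("$div_by_zero", lambda numerator: -1, checked_div, a, b)
```

Note that the handler is attached to one call rather than to a block, which is the point of the design: no new block-structured instruction is involved.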

The corresponding change to the type system is that function types are extended with a throws clause that lists the exception events that can be thrown by the function. By default, C functions would have the (throws $__c_longjmp) clause (because longjmp would still have to be emulated by an exception), C++ functions would have the (throws $__c_longjmp $__cpp_exception) clause (where the two are kept distinct because the former should always succeed and only the latter needs to cause destructors to fire), and Java functions would have the (throws $__java_exception) clause (with the throws clause in the surface Java code being completely ignored).
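
One way to read the throws clause as a typing rule is sketched below in Python. The rule itself is my assumption about how validation would work (the text above only says the clause lists what a function can throw): a call is accepted when every event the callee may throw is either caught at the call site or listed in the caller's own clause. `FuncType` and `call_is_well_typed` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuncType:
    """Hypothetical function type extended with a throws clause."""
    params: tuple = ()
    results: tuple = ()
    throws: frozenset = frozenset()


def call_is_well_typed(caller, callee, caught=frozenset()):
    """Assumed rule: uncaught callee events must appear in the caller's clause."""
    uncaught = callee.throws - frozenset(caught)
    return uncaught <= caller.throws


# Default clauses from the text above.
C_FUNC = FuncType(throws=frozenset({"$__c_longjmp"}))
CPP_FUNC = FuncType(throws=frozenset({"$__c_longjmp", "$__cpp_exception"}))

# A C++ function may call a C function directly.
cpp_calls_c = call_is_well_typed(CPP_FUNC, C_FUNC)
# A C function may call a C++ function only if it catches $__cpp_exception.
c_calls_cpp = call_is_well_typed(C_FUNC, CPP_FUNC)
c_calls_cpp_catching = call_is_well_typed(
    C_FUNC, CPP_FUNC, caught={"$__cpp_exception"}
)
```

Under this reading, a validator always knows the full set of events that can reach any call site, which is what the rest of the argument relies on.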

Why does this type-system change enable a simpler instruction set? Two reasons, the second of which relies on how this type-system change also makes the JS/C APIs simpler.

The first is that WebAssembly code no longer has to deal with unknown exceptions. catch_all, then exnref, and then catch_all/unwind were all introduced to deal specifically with unknown exceptions. One might think that they were meant to enable reuse of unwinding code, but notice that in the above examples there is only ever one exception event in a throws clause that's supposed to trigger unwinding. So really they're intended to deal with unknown exceptions, and so making all exceptions explicit eliminates the need for these constructs.

Before going into the second reason, let us consider interaction with the host. From a C perspective, the throws clause of a wasm-exported function essentially tells the C caller how many alternate return addresses, and of what types, it should provide (or how many "result" structs to pass, with the returned value indicating which result struct to use). From a JS perspective, the throws clause prevents WebAssembly exceptions and JS exceptions from crossing the boundary. Instead, boundary crossing is explicit and restricted, and the WebAssembly module itself is responsible for converting (using various imports, possibly including imported exception events) between its own exception event(s) and whichever exceptions the boundary permits.
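
The "result struct" variant of the host interface can be sketched as follows. This is a Python model of the idea rather than a real embedder API: `call_export`, `parse_digit`, and the two exception events are hypothetical. The host receives an index saying which outcome occurred (0 for a normal return, i for the i-th event in the throws clause) together with that outcome's values.

```python
class WasmThrow(Exception):
    """Hypothetical stand-in for a thrown exception event with its payload."""

    def __init__(self, tag, payload):
        super().__init__(tag)
        self.tag = tag
        self.payload = tuple(payload)


def call_export(func, throws, *args):
    """Call an export whose throws clause is the ordered tuple `throws`.

    Returns (0, results) on normal return, or (i, payload) when the i-th
    listed event (counting from 1) was thrown.
    """
    try:
        return 0, (func(*args),)
    except WasmThrow as exn:
        if exn.tag not in throws:
            # Cannot happen for a function that validated against `throws`.
            raise
        return 1 + throws.index(exn.tag), exn.payload


PARSE_THROWS = ("$empty_input", "$not_a_digit")


def parse_digit(text):
    """Example export with the clause (throws $empty_input $not_a_digit)."""
    if not text:
        raise WasmThrow("$empty_input", ())
    if not text.isdigit():
        raise WasmThrow("$not_a_digit", (text,))
    return int(text)
```

Because the clause is part of the function's type, the host knows the full set of outcomes up front and never sees an exception it did not agree to handle.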

Given that need for explicit conversion, the second enabler of simplification is that WebAssembly code is forced to handle stack traces explicitly. For example, a $__cpp_exception will not be able to arbitrarily cross the boundary into JS; rather, it will need to be caught by WebAssembly code, likely in the boundary code already generated by tools for interoperating with JS. This means there's no utility in having additional content implicitly associated with a wasm exception. If you want to associate stack traces with an exception, you create the stack trace using an imported function and make the resulting value part of the payload of your own exception event. If your exception reaches your boundary code, then your code for converting your exceptions into JS exceptions should be extended to include the preexisting stack trace, which the debugger can then make use of. (This has the added advantage that the imported function used to create stack traces could be swapped for a "return null" function for faster performance during deployment, and that other languages not needing stack traces at all get faster performance by default, which research indicates can be a 6x improvement for some languages.)
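
A small Python model of the explicit stack-trace scheme is sketched below. Everything in it is hypothetical: `make_module` stands in for instantiating a module with an imported trace-capturing function, `exported_entry` for the tool-generated boundary code, and `RuntimeError` for the JS exception the boundary produces.

```python
import traceback


class WasmThrow(Exception):
    """Hypothetical stand-in for a thrown exception event with its payload."""

    def __init__(self, tag, payload):
        super().__init__(tag)
        self.tag = tag
        self.payload = tuple(payload)


def capture_trace():
    """Import used while debugging: returns the current stack as text."""
    return "".join(traceback.format_stack())


def no_trace():
    """Deployment import: the "return null" replacement."""
    return None


def make_module(capture_stack_trace):
    """Model of instantiating a module with the given import."""

    def throw_cpp_exception(message):
        # The trace is ordinary payload, created explicitly at the throw site.
        raise WasmThrow("$__cpp_exception", (message, capture_stack_trace()))

    def exported_entry(message):
        # Boundary code: convert the module's own event into a host exception.
        try:
            throw_cpp_exception(message)
        except WasmThrow as exn:
            text, trace = exn.payload
            host_error = RuntimeError(text)
            host_error.wasm_trace = trace  # forwarded for the debugger
            raise host_error from None

    return exported_entry


debug_entry = make_module(capture_trace)
release_entry = make_module(no_trace)
```

Swapping the import changes the cost of every throw without touching the module's code, and nothing about the exception mechanism itself needs to know that traces exist.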

Putting these together, we eliminate all need for new block instructions, which are the hardest to generate. catch_all and unwind need to be block-based so that they can rethrow unknown exceptions, which no longer exist. catch needs to be block-based so that it can retain implicit additional exception content (i.e. stack traces), which no longer exists. (Consequently, rethrow is no longer necessary.) try needs to be block-based because it needs to delimit the scope of these large special handlers, which are no longer special and can simply be expressed with a label. (Consequently, delegate is no longer necessary.)

Thus throw and catching alone can express everything these constructs can express whenever all exceptions are known, which is exactly what the throws clause guarantees.
