-
Notifications
You must be signed in to change notification settings - Fork 695
Stack Inspection #1356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@RossTate the Lumen Core Team is very interested in getting stack switching as it would allow us to use stack switching for Erlang Process switching for both native and WASM and means we don't need code transformations in our MLIR and LLVM code for WASM. Can you expand on how stack switching would work with this? |
@KronicDeth So I don't want to get into detail on stack switching since we'll have a detailed proposal for it up once the question of whether to do stack inspection is resolved. Nonetheless I'll give you a high-level picture of how they relate. With first-class stacks, a stack is composed of a bunch of sub-stacks that have been attached together. For various applications, it's useful to be able to detach a certain portion of the stack from the rest of the stack (e.g. to save it for later). The question is which attachment point do you want to attach (and where is it in the current stack). You generally answer this question by doing a stack inspection of the current stack to find the right point. That said, if I've come to understand Erlang well enough, you probably aren't interested in that aspect of first-class stacks. You want the ability to switch directly between (complete) stacks without having to do an inspection. Our proposal will provide that functionality as well (though it took a while to figure out how to get that all to be safe and respect current standards in the presence of trapping and the like). |
So, this complete vs partial stacks is actually a little bit of annoying "feature" we had in the native implementation for a bit - for stack traces we'd really like the |
You'll be given the freedom to pick whichever implementation you find works best. With stack inspection, the primitives would be expressive enough that you could use the faster stack-switching mechanism but still report the stack trace for both the current Erlang process and the scheduler. |
Back to the topic of stack inspection, let's talk about implementation. If WebAssembly were actually an assembly language, there would be no notion of functions. Instead you would just have blocks of instructions that operate on some state. Some of that state could be a stack, and a block of instructions implementing a The instructions implementing an The point is that an Of course, there's good reason WebAssembly bundles stack-state and code-state. The heterogeneous mix of types and the stateful machine-dependent packing of local variables makes it hard to express a true-assembly approach efficiently on an abstract machine, and it makes it difficult for a host to independently verify that the true-assembly code is safe. Furthermore, the stack is an interlanguage resource used to facilitate interop between WebAssembly and its embedding language, e.g. JavaScript, and even between WebAssembly modules. But those are also the reasons why it would be problematic for WebAssembly modules to implement stack inspection with a shadow stack. C++ is able to use a shadow stack because everything its stack (currently) needs to track lives in linear memory, solving the heterogeneity problem. But most languages are garbage collected, and we want to encourage them to use the host's GC, which means their local variables will be a mix of various reference types and numeric types. This means implementing a shadow stack will involve a lot of casting, dynamic allocation, and copying. (It'd be a sad day if it turns out that using the GC proposal is slower because of the significant overhead it incurs on implementing the shadow stack.) And even after all that effort, your shadow stack won't interop well with the embedding language or other modules, and for separate composition (to WebAssembly) you'll also have to expose your shadow stack so that other modules compiled from your language can share it, breaking encapsulation. So |
|
Definitely not. This is about stack inspection. Although stack inspection can be used to support the first phase of two-phase exception handling, the EH proposal is focused on single-phase exception handling (including unwinding, i.e. the second phase of two-phase exception handling), and I don't intend to change that focus (beyond WebAssembly/exception-handling#123 already looking into making it forwards compatible with two-phase exception handling).
Sure. I'll try to get one written up shortly (though it's late here, so possibly not until tomorrow). |
I'm confused. If we are to implement two-phase unwinding using stack inspection while keeping the current EH proposal focused on single-phase, why making it compatible with (or extensible to) to two-phase? I thought your intention was to add two-phase on top of the current EH proposal and to make the difference between the current and the future two-phase proposal smaller. |
That is my intention. For two-phase EH, stack inspection would be used to determine (while preserving the stack) which handler to use, terminating the inspection via some instruction that effectively says "unwind the stack and then transfer control to target |
In the first phase of EH,
Also I'm not sure why we would need this kind of instruction. The transition from first to second phase can be done automatically when the first phase finds a handler, and that's how most (if not all) EH schemes are implemented I think. |
Here's some C# code:
Here's the WebAssembly it would compile to with the
A C# |
How does it know the destination stack even before it runs filters and such? What makes this exception special? Also, other than you changed |
@aheejin This issue is about stack inspection as a high-level consideration. You are digging into specifics that belong in something more focused on two-phase exception handling. That could be WebAssembly/exception-handling#123, but you asked me not to talk about this there. If you want to dig into these details more, please create a separate issue. @rossberg Please illustrate how you plan to support these applications. Your main objection seemed to be that you had some prior mechanism. However, the only one that you've presented that I can imagine might be related is continuations, but that is far too heavy handed and inefficient for these applications. |
@RossTate could we maybe have a small but complete example? Let's imagine a simple function somewhere in the middle of the calls stack (has a caller and a function it calls), does some simple computation, and provide answer code that simply calls an imaginary Maybe some indication on how an engine would typically implement this (presumably |
Let's consider single-phase exception handling. Engines conceptually implement single-phase exception handling by first compiling the function with each
Then the Nearly everything is the same for
|
Thanks, that's a good comparison, but it still leaves me a lot of questions:
|
Good questions. I think I can do the first three all together: A
No restrictions. So essentially, an
This broadens the scope of the discussion to stack walking. More general stack walking is certainly useful ( |
I think at least two of my three previous questions are not about EH specifics.
This was the question to the sentences to your paragraph in #1356 (comment):
This is also not about EH specifics. I simply didn't understand how the functionality of these instructions are different from the existing EH instructions (other than it has filters). I was asking what additional functionality these instructions provide. |
I personally think this would be a very versatile feature that could enable all sorts of languages features. Though, from my understanding of @RossTate's explanation above, I think it would good to explore what potential costs would be. A function that has an answer inside of it needs to "inspectable", which means that, if it does register allocation at all, it needs to spill these back to the exact stack slots they came from, as opposed current implementations which may choose to have a local live in a register only (and never even have a stack slot), and use whatever register caller/callee saving method is efficient. On top of that there's the locals only used in the answer, which just make the stack usage slightly bigger, which may have indirect costs (cache efficiency). Now such costs are minor in the case of exception handling, since the amount of functions containing a try-catch is a small fraction, but there are use cases for stack inspection where pretty much every function would need an answer, like stack trace printing or linear memory GC. In addition to the above (minor) costs, there's the issue of code size, if every function has an answer. You'd want to have these answers share an implementation of different locals "signatures" to reduce code bloat, which you could do by having the answer directly call a function for any unique signature. But even calling such a function with all locals will produce a fair bit of code in each function, so I wonder if specifying answer functions completely external to each function could be better. Of course, that is a much more intrusive design change to the existing Wasm design. |
Good considerations. But do take into account that all of those costs would be (substantially) worse without stack inspection. That is, languages need to compile these features to WebAssembly in some way or another, and stack inspection is the most efficient, compact, and composable means to do so. (That said, I also agree that it's worthwhile to investigate if there are ways to support particularly common patterns even more compactly.) |
Absolutely agreed. While I am just guessing (until we have representative benchmarks), my bet would be that "just use a shadow stack instead" entails a huge performance hit compared to this feature. I just want to be clear what the consequences could be. |
This is not quite accurate. A responder needs to know where the local variables are. E.g., if a compiler decided to put a given local variable in a register instead of a stack slot -- in a region where there is a responder -- then the compilation of that responder would simply rely on the variable location. Of course, that depends on the variable being in the same location thoughout the region (it does not need to be a stack slot, it just needs to be consistent). |
@fgmccabe yes, that makes sense. Though needing to have the same register assignment (and the same way to save them) across every call in a function is still potentially a restriction. But I'd agree that it could generally be very minor. The ability for the engine to still do register allocation (and make choices local to a function, as opposed to a uniform layout for all) is probably one of the major things that sets this apart from a shadow stack solution. |
My impression was that there was interest in providing stack inspection, but that interest was reserved until having a better understanding of what it would entail. So this discussion is to explore the design space a bit more and understand what considerations need to be taken into account more thoroughly.
From my perspective, the key functionality to add is
answer $dispatch_tag instr1* within instr2* end
dispatch tag $dispatch_tag : [ti*] -> [to*]
instr1* : [ti*] -> [to*]
instr2*
, during which answering any stack search for$dispatch_tag
withinstr1*
where there is some way to look for these answers in the current stack, such as
call_stack $dispatch_tag : [ti*] -> [to*]
dispatch tag $dispatch_tag : [ti*] -> [to*]
answer $dispatch_tag
and calls itApplications we should consider here are stack tracing, linear-memory gc-root collection and updating, and two-phase exception handling. This also pertains to continuations and stack switching, but as those are less familiar concepts and unnecessary for these applications, I suggest not using them as examples, at least initially.
@rossberg My sense was that your main concern was that you had some unifying alternative in mind. Would you mind writing it up to inform the discussion?
Edit: Jump to here and the next two comments for a better implementation-focused description of
answer
prompted from some good questions from @aardappel.The text was updated successfully, but these errors were encountered: