What's the relationship between Wasm and WASI threading proposal? #138
Comments
This proposal does not add any way for a wasm module to create its own threads; the wasm module must import this functionality from the host. Currently, this is achieved in a browser environment by importing JS functions that use the Web Worker API, as described in the overview. Since WASI is a portable host interface, I think it would make sense for it to define a portable thread-creation API as well (one that could be polyfilled in a browser with workers). There is a longer-term plan to add thread-creation operators to core wasm, but, for various connected reasons, this will take a while because it ends up requiring not just linear memory, but also tables, globals and instances to be shared. Pure-wasm threads won't be a 100% replacement for host-created threads, though, since host-created threads can do extra host-specific things, like provide an event loop. So I don't think the WASI thread-creation APIs are just a stopgap; they'd have long-term utility.
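To make the "threads come from the host" point concrete, here is a minimal sketch in Python rather than wasm. Everything in it is hypothetical: `Host`, `thread_spawn`, and `Module` are illustrative names, not part of WASI or the threads proposal. The only thing it tries to show is that the module never creates a thread itself; it calls an imported function and communicates through shared memory.

```python
import threading


class Host:
    """Hypothetical host that exposes thread creation as an import."""

    def __init__(self):
        self._threads = []

    def thread_spawn(self, entry, arg):
        # The host, not the module, decides how a thread is created.
        t = threading.Thread(target=entry, args=(arg,))
        self._threads.append(t)
        t.start()
        return len(self._threads) - 1  # opaque thread id (an assumption)

    def join_all(self):
        for t in self._threads:
            t.join()


class Module:
    """Stand-in for a wasm instance: it can only spawn via its imports."""

    def __init__(self, imports, shared_memory):
        self.spawn = imports["thread_spawn"]
        self.memory = shared_memory  # models a shared linear memory

    def worker(self, index):
        # Each worker writes to its own slot of the shared memory.
        self.memory[index] = index * index

    def run(self, n):
        return [self.spawn(self.worker, i) for i in range(n)]


host = Host()
memory = bytearray(4)
module = Module({"thread_spawn": host.thread_spawn}, memory)
ids = module.run(4)
host.join_all()
```

A browser polyfill would presumably back the same import with workers instead of `threading.Thread`; the module-side code would not need to change.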
Hm, why is that? I understand that more sharing would be useful, but to match whatever a host can do atm it shouldn't be necessary, I think. (However, in such a scenario it would be necessary to have a module instantiation instruction and first-class tables, globals, etc. to support it.)
@rossberg That's a good question. I've been assuming, from previous discussions, that what we want is some sort of But it sounds like you're imagining a |
@lukewagner, it's simpler actually: the same operator you describe plus a separate instantiate instruction. Because the spawned function must not be allowed to access non-shared state of its module, all it could effectively do would be to instantiate a new module (with a separate instruction), similar to how it currently works on the host side. That's how we model it in our memory model paper draft anyway. There, we have a fork instruction that requires a function of shared function type and a separate instantiate instruction. For wiring up imports/exports we simply reify externvals as anyref. We also introduce shared tables etc., but they are not needed to emulate the current host semantics.
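A rough way to picture the restriction described above, sketched in Python. This is not the validation rule from the paper draft; the names `Func`, `ModuleDefs`, `validate_func`, and `validate_fork` are made up for illustration. The assumption is simply that function types carry a `shared` flag, that `fork` only accepts shared functions, and that a shared function may only touch shared definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Func:
    name: str
    shared: bool        # hypothetical "shared" attribute on the function type
    uses: tuple = ()    # names of memories/globals/tables the body touches


@dataclass
class ModuleDefs:
    shared_defs: set    # definitions declared shared
    funcs: list = field(default_factory=list)


def validate_func(mod, f):
    """A shared function may only touch shared definitions."""
    if f.shared:
        bad = [d for d in f.uses if d not in mod.shared_defs]
        if bad:
            raise TypeError(f"{f.name}: shared function uses non-shared {bad}")


def validate_fork(mod, funcidx):
    """fork requires its target to have a shared function type."""
    f = mod.funcs[funcidx]
    if not f.shared:
        raise TypeError(f"fork target {f.name} is not shared")
    validate_func(mod, f)


mod = ModuleDefs(
    shared_defs={"mem0"},
    funcs=[
        Func("worker", shared=True, uses=("mem0",)),     # valid fork target
        Func("helper", shared=False, uses=("counter",)),  # not shared
        Func("bad", shared=True, uses=("counter",)),      # shared but touches non-shared state
    ],
)
validate_fork(mod, 0)  # passes silently
```

Under these assumptions, forking `helper` or `bad` is rejected at validation time, which is why a forked function that needs private state would have to instantiate a fresh module for it.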
If it's a question of "to bytecode or not to bytecode" I think I would prefer that we not have bytecodes that create instances or deal with modules. The reason for that is that those necessitate types and first class values for modules and instances, which are necessarily embedder concepts. So they would be imported types or "standardized" reference types--though likely opaque. In the continuing spirit of not baking any non-trivial types into core wasm, then I think it's better that types for instances and modules remain embedder concepts that must be imported. That only leaves room for bytecodes that do not need to refer to these in a first class way, a variant of what Luke suggested. But, assuming we had So it seems like standardizing threading bytecodes is going to inevitably lead to a set of opaque reference types in any case.
@rossberg When you say "the same operator you describe", do you mean the first version of |
@titzer, you don't even need first-class instances, only first-class memories, tables, globals, such that the Depending on how dynamic we'd want to make the link-time type-checking, that wouldn't require fancy types either. In our threads paper we even use plain anyref, which is no worse than what we have in JS. You could polyfill that instruction with a call to JS imports today.
Does it have to return anything? You want to give it access to shared memory anyway (and other shared defs if we had them). That would be enough.
Yep. In the paper we take a funcidx and parameters. (Using a funcref can easily be expressed with an auxiliary function.)
Via validation (see paper, Appendix A if you're interested). In our system, function types have a shared attribute as well, and the instruction (we call it Regardless of the details, a
Oh, executed in a new thread.
@rossberg: generally, yes, it is good to get a handle on the spawned computation, e.g. to perhaps await it, join it, cancel it, etc.
@rossberg Ah, interesting; I had been imagining that there was only a "shared" attribute on the whole module/instance, with that requirement propagating to its memories/tables/globals. Are there uses you can think of for having the "shared" attribute be per-function other than fork?
@lukewagner, there might be use cases where a module has both shared and unshared exports. But the primary reason for putting the attribute on the function type is that we'd need to track it in function types anyway, because function references are first-class, so you don't know what module they come from.
Yes, definitely makes sense to track that in the function reference type; I was mostly just asking about granularity (module vs. function).
WAVM has some non-standard support for shared instances (and tables) at the C API level. One way that it differs from the shared functions @rossberg is talking about is that functions in shared instances can access non-shared globals. The semantics are equivalent to re-instantiating the module in each thread in the same compartment: non-shared mutable globals become thread-locals. IMO adding a way to directly create threads from WebAssembly is only superficially valuable, and the next step after this shared memory extension should be to tackle shared instances. If this is something browser folks don't want to take on yet, it might be possible to do it in a constrained way that can be polyfilled on web VMs.
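The "re-instantiate per thread" semantics can be modelled with a small Python sketch. This is not WAVM's C API; `Instance`, `bump`, and `run_in_thread` are hypothetical names. The assumption is only that memory is shared between instances while a non-shared mutable global gets a fresh copy in each thread, which makes it behave like a thread-local.

```python
import threading


class Instance:
    """Stand-in for one instantiation of a module."""

    def __init__(self, shared_memory):
        self.memory = shared_memory  # shared across all instances
        self.counter = 0             # non-shared mutable global: fresh per instance

    def bump(self, slot):
        self.counter += 1
        self.memory[slot] = self.counter


def run_in_thread(shared_memory, slot, times):
    # Re-instantiating in each thread gives every thread its own `counter`.
    inst = Instance(shared_memory)
    for _ in range(times):
        inst.bump(slot)


memory = [0, 0, 0]
threads = [
    threading.Thread(target=run_in_thread, args=(memory, i, i + 1))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread only ever sees its own `counter`, so the slots end up holding 1, 2, and 3 respectively. A module that assumed a single `counter` across all threads would observe duplicated state, which is the encapsulation concern raised in the replies.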
@lukewagner, my thinking was that if each function declares it anyway (as part of its type), then what's the use of also having a mode per module? Also, I always want to avoid per-module modes/flags, since they would get in the way of module merging, and thus modular (de)composition.
@AndrewScheidecker, silently duplicating state seems dangerous, since it can arbitrarily break state invariants the module is assuming. I think that should at least be gated by some third form of sharing attribute, like TLS.
It's not silent, it's controlled by whatever host API is being used to create threads. If the host API is implemented on the web by re-instantiating the module in a new WebWorker, then WAVM can reproduce that behavior by creating a new context. I do think it makes sense to add a thread-local sharing attribute alongside shared functions. How does segment drop state interact with shared functions? Non-shared segments don't seem useful, so maybe segments should just be implicitly shared.
Sure, but the module itself has no way of controlling this and preventing a random client from breaking it that way. It is violating state encapsulation.
Good question. I agree that they should probably be shared. They are typically accessed for relatively expensive operations only, so the additional synchronisation on retrieving the address shouldn't be prohibitive.
@rossberg Motivating per-function via trivial-module-merging is a great point.
What's the status on this?