1a implementation for toolchains: dummy initializers #314
Comments
No, any recursive type where there is a path in the infinitely unrolled type definition that contains only non-nullable references cannot be constructed by Wasm because it would require being able to refer to a value before it has been initialized. These types can still be potentially created by a host, so they can’t just be optimized entirely out, either. |
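The condition above can be sketched as a small fixpoint computation. This is a toy model, not a real Wasm API: the dictionary layout, the type names, and the function `constructible_types` are all assumptions made for illustration, and only struct types are modelled. A struct counts as constructible once every non-nullable field points at a type that is already constructible; whatever never enters the set cannot be built by Wasm code alone.

```python
# Toy model (hypothetical, not a real Wasm API): each struct type maps to a
# list of fields, and each field is a pair (target_type_name, nullable).
def constructible_types(types):
    """Return the set of struct types that Wasm code alone could instantiate."""
    ok = set()
    changed = True
    while changed:
        changed = False
        for name, fields in types.items():
            if name in ok:
                continue
            # Nullable fields can always be filled with null; non-nullable
            # fields need a value of an already constructible type.
            if all(nullable or target in ok for target, nullable in fields):
                ok.add(name)
                changed = True
    return ok


types = {
    "list": [("list", True)],      # nullable self-reference: null ends the chain
    "loop": [("loop", False)],     # non-nullable self-reference
    "a": [("b", False)],           # mutual recursion through non-nullable refs
    "b": [("a", False)],
    "wrapper": [("loop", False)],  # not recursive itself, but needs a "loop"
}
print(sorted(constructible_types(types)))  # prints ['list']
```

In this sketch `wrapper` is also excluded: it is not recursive, but it needs a value of a type that can never be created without host help.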
I see now, thanks, but it would still be possible (I believe) to have a nullable global variable with the same type, then as part of the initialization of the module call an appropriate host function that returns a dummy value with the correct type and assign it to the global variable. Thanks for highlighting this issue with recursive types, I knew that there had to be this corner case, but I don't think this invalidates the reasoning. |
Yep, that would work fine as long as such a host function exists, but it might not in general. For example the host might only pass in the value as a function parameter, or it might only return it as a result of calling a function with side effects, or it might return a dataref value with one of many possible types depending on external factors. |
Looks like this was suggested by @askeksa-google recently. It seems like it could be useful sometimes. Maybe we should document all the "workaround" options somewhere? I think the main ones that have been discussed are:
1 and 2 may be good enough for a compiler if an optimizer like Aside from those workarounds, the more optimal path is probably to decide what to do in each situation separately. I think that's what we have to do in Binaryen, though we haven't decided how yet, but something along the lines of 2 but only if we can't move code around to avoid the issue, etc. That does take more work, of course. |
Is it only me, or is speccing something half-baked and then discussing and documenting a bunch of workarounds (where one is to not use the feature at all and others are clumsy / suboptimal / impossible) not a desirable outcome? I certainly appreciate all the effort and discussion, specifically the summary document, but when I judge just by the outcome, well... |
Kudos to @askeksa-google then. Having read his comments, the only potential expansion is in providing a JS-based initializer. +1 on documenting workarounds or other relevant by-products of the discussion somewhere, this was the basic idea behind creating the issue (not sure where the right place is). On dummies, there are a few minor improvements that make sense, like moving the dummy to an inner scope (but out of loops), or removing it entirely if, after optimizations have been performed, 1a validation would be satisfied anyhow. @dcodeIO: I get the sarcasm, but trading hurdles for producers against better runtime characteristics is part of the Wasm design process, and here allowing useful proposals (like function references and then strings / GC) to move further makes a lot of sense, and the compromise on 1a makes sense to me. |
Option As Thomas points out, the fact remains that there are some types that programs can write that simply cannot be initialized without either host help or a fixpoint operator. That wouldn't have been solved by any of the alternatives that were available. |
Actually, I'd much prefer if we did not give hosts a mandate to violate the Wasm data model this way. For one, as a general design principle, the host should not have magic powers when it comes to Wasm data structures, as that breaks the virtualisation principle. Second, such an ability could also be abused to construct cyclic data structures for other types, which breaks guarantees that Wasm programs would otherwise get for recursive data types – e.g., that traversing a (non-mutable) recursive type always terminates (in other words, that such recursion is inductive). |
@rossberg, that sounds fine to me, but do we have a way of normatively constraining hosts that way? (Other than in the JS API specifically) |
Yes. We already axiomatise what a host function is allowed to do to the store when called. In particular, the resulting store must be valid and the store extension relation constrains how it can be modified, e.g., not changing immutable things. With GC, these definitions would also talk about the heap, and validity could rule out cyclic data in some suitable way (details tbd). |
Closing this as non-actionable for the MVP design, but I would welcome a PR adding a new document containing notes about how the proposal could be used. |
I sort of lost track of the discussion on non-nullable locals. Really nice to see it has been settled and there is progress on 1a.
After reading the discussion on WebAssembly/function-references#44 and the implementation discussion in Binaryen (WebAssembly/binaryen#4824), I realized there might be a (potentially obvious) escape hatch for toolchains:
This explicit initialization to a dummy value makes it possible to skip the complexities of dealing with 1a validation, allowing back things like
or patterns that are even more complex to legalize, like:
without having to force additional restrictions on a given tool IR.
This might not be optimal (the added initialization costs 3 instructions per local, which might otherwise be avoided), but it simplifies implementation and could be used either as a backstop or as a simple way to legalize intermediate (invalid) states.
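To illustrate the idea, here is a minimal Python sketch of a toy checker and a legalization pass. It assumes the 1a rule means that a non-nullable local must be set before it is read, and that initialization done inside a block is forgotten when the block ends. The tuple-based instruction encoding, the `push_dummy` pseudo-instruction (standing in for whatever real instructions produce the dummy value), and both function names are hypothetical; the operand stack is not modelled.

```python
def validate_1a(body, non_nullable, initialized=frozenset()):
    """Return True if every non-nullable local is set before it is read."""
    init = set(initialized)  # local copy, so nested blocks cannot leak state
    for op, arg in body:
        if op == "local.set":
            init.add(arg)
        elif op == "local.get":
            if arg in non_nullable and arg not in init:
                return False
        elif op == "block":
            # The block body starts from the current state, but anything it
            # initializes is forgotten again once the block ends.
            if not validate_1a(arg, non_nullable, init):
                return False
    return True


def add_dummy_initializers(body, non_nullable):
    """Prepend a dummy initialization for every non-nullable local."""
    prologue = []
    for local in sorted(non_nullable):
        # "push_dummy" is a placeholder for the real dummy-producing code.
        prologue += [("push_dummy", local), ("local.set", local)]
    return prologue + body


# The local is set inside a block and read after it: invalid as written.
body = [("block", [("local.set", "x")]), ("local.get", "x")]
print(validate_1a(body, {"x"}))                                  # prints False
print(validate_1a(add_dummy_initializers(body, {"x"}), {"x"}))   # prints True
```

Because the prologue runs at function entry, outside any block, the checker never has to reason about where the original `local.set` sits, which is the simplification the dummy approach buys.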
I briefly discussed this with @kripken (https://twitter.com/carlo_piovesan/status/1551834112788398080 and replies). I agree that there are potential drawbacks and that it's not optimal, but I think it might be a strategy worth considering, enough to be shared here.
One open question I have is whether it's always possible to generate such dummy initializers in advance (it's easy for function references, less obvious for arbitrary, potentially recursive types).