Skip to content
This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Remove parametric polymorphism from Post-MVP #179

Closed
wants to merge 3 commits into from

Conversation

RossTate
Copy link
Contributor

The Post-MVP proposes to add parameteric polymorphism for the sake of supporting generics. However, #156 provided multiple reasons why the Post-MVP would not be able to express languages with generics (both OO and functional, reified and erased)—in three months, no counterarguments to these claims were ever provided. And in my recent presentations I gave multiple reasons why any structural type system that would be rich enough to express common languages with generics would also necessarily be undecidable. As such, the section on parametric polymorphism in the Post-MVP should be removed as it cannot be reasonably expected to serve its intended purpose.

@conrad-watt
Copy link
Contributor

The post-MVP document is a collection of ideas which aren't expected to be finalized, and none of the text you're removing claims that the proposed extension would be sufficient to solve all of the source-level generic issues you've flagged up. Your points about the extension's inadequacies are worth understanding, but I don't see a reason to delete the text right now.

@RossTate
Copy link
Contributor Author

There is a big difference between "this has yet to be finalized" and "there is no evidence that this can serve its intended purpose, and there is substantial evidence that no variant/extension of this can serve its intended purpose". Considering this a viable extension, despite all evidence, keeps us from improving the MVP. For example, given that we cannot eliminate casts in this way, WebAssembly is on track to require more casts than any VM. Acknowledging that fact will enable us to focus the conversation on mitigating the costs of these casts, say by reducing their frequency (e.g. adding support for covariant arrays or building in method invocation) or reducing their costs (e.g. ref.cast_exact).

We will not make effective decisions if we continue to put so much faith in something that seems impossible to achieve.

@conrad-watt
Copy link
Contributor

Acknowledging that fact will enable us to focus the conversation on mitigating the costs of these casts, say by reducing their frequency (e.g. adding support for covariant arrays or building in method invocation)

I actually do think this is a realistic way forward, and your previously posted issues have been important in helping me form that opinion. I don't see this text as preventing us from thinking about the issues you've raised.

@RossTate
Copy link
Contributor Author

It is a realistic way forward, and more generally baking in paradigm-specific constructs would provide a path towards reducing casts for at least some languages. The problem is that I have been told WebAssembly is supposed to be language agnostic, and such solutions are definitively not language agnostic. The text I am proposing to delete provides an illusion that it's possible for the current MVP to be extended to eliminate casts in a language-agnostic manner. By removing that text, I am removing the illusion so that we start realistically considering the tradeoffs between language agnosticism and performance that the current MVP forces.

@titzer
Copy link
Contributor

titzer commented Jan 15, 2021

There are too many issues mixed up in #156, so I don't agree that it conclusively proved anything about what can and cannot be supported in a post-MVP world. Personally I find it unlikely that the only scope over which code in Wasm can/will need to be type-parameterized is a module. It seems likely we'll want to consider parameterizing functions and/or data structure declarations by types. As you mentioned casts, there is currently no complete story on how to eliminate large quantities of them in the future, and I don't see closing this path as any progress towards that.

@RossTate
Copy link
Contributor Author

Per everyone's comments, I updated the PR to leave the thoughts and just add the feasibility analysis.

@@ -773,7 +771,7 @@ There are a number of reasons to make RTTs explicit:

* It allows more choice in producers' use of RTT information, including making it optional (post-MVP), in accordance with the pay-as-you-go principle: for example, structs that are not involved in any casts do not need to pay the overhead of carrying runtime type information (depending on specifics of the GC implementation strategy). Some languages may never need to introduce any RTTs at all.

* Most importantly, making RTTs explicit separates the concerns of casting from Wasm-level polymorphism, i.e., [type parameters and fields](#type-paraemters-and-fields). Type parameters can thus be treated as purely a validation artifact with no bearing on runtime. This property, known as parametricity, drastically simplifies the implementation of such type parameterisation and avoids the substantial hidden costs of reified generics that would otherwise hvae to be paid for every single use of type parameters (short of non-trivial cross-procedural dataflow analysis in the engine).
* Most importantly, making RTTs explicit separates the concerns of casting from Wasm-level polymorphism, i.e., [type parameters and fields](#type-paraemters-and-fields). Type parameters can thus be treated as purely a validation artifact with no bearing on runtime. This property, known as erasure, drastically simplifies the implementation of such type parameterisation and avoids the substantial hidden costs of reified generics that would otherwise have to be paid for every single use of type parameters (short of non-trivial cross-procedural dataflow analysis in the engine).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Most importantly, making RTTs explicit separates the concerns of casting from Wasm-level polymorphism, i.e., [type parameters and fields](#type-paraemters-and-fields). Type parameters can thus be treated as purely a validation artifact with no bearing on runtime. This property, known as erasure, drastically simplifies the implementation of such type parameterisation and avoids the substantial hidden costs of reified generics that would otherwise have to be paid for every single use of type parameters (short of non-trivial cross-procedural dataflow analysis in the engine).
* Most importantly, making RTTs explicit separates the concerns of casting from Wasm-level polymorphism, i.e., [type parameters and fields](#type-parameters-and-fields). Type parameters can thus be treated as purely a validation artifact with no bearing on runtime. This property, known as erasure, drastically simplifies the implementation of such type parameterisation and avoids the substantial hidden costs of reified generics that would otherwise have to be paid for every single use of type parameters (short of non-trivial cross-procedural dataflow analysis in the engine).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...avoids the substantial hidden costs of reified generics that would otherwise have to be paid for every single use of type parameters...

To be clear, the costs are the same but they are no longer hidden, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They costs are the same when reification is necessary, but reification is not necessary for many polymorphic functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, apparently that link is not just typoed but also outdated: #181.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, erasure is not a property, it is a code transformation or implementation technique that is enabled by the parametricity property.

* Impredicative bounded polymorphism, such as bounded generic methods in C#, or the bounded existential types used to describe (at the assembly level) Java/C# covariant arrays or Kotlin `in`/`out` type projections.

Each of these features is undecidable on its own, and although the respective surface languages are decidable, their low-level encoding as structural types is not.
As such, it is unlikely that the MVP will be able to be extended to support languages with generics.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we qualify this statement a bit? Perhaps "... it is unlikely that the MVP will be able to be extended to support languages with generics via general extensions of the type system" or something like that. My understanding is that more specific solutions like built-in covariant arrays and method receivers could be found for many of these problems if necessary. (And if not, we should discuss and clarify that!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. I'll whip up a more qualified statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a qualifying statement. Please let me know of it's an improvement.

Copy link
Member

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh. Another vast overstatement based on multiple dubious claims.

There is a whole range of well-known approaches to resolve the tension between recursive types and parameterisation of types:

  • The simple solution that works for equi-recursion is to require uniform recursion between types. Ocaml would be an obvious example of this approach, which applies it to both structural and nominal types.
  • Another solution is to model type recursion with iso-recursive types at constructor kind. That is e.g. the method used to model datatypes in Harper/Stone's formalisation of Standard ML and its module system, and in other module literature.
  • The ad-hoc solution is to impose limits on expansion depth. That's e.g. the C++ way, and probably the least elegant but most widely used solution.
  • Another obvious solution is to disallow type synonyms to be recursive altogether. That's e.g. the case in Scala, Rust, ML type synonyms, etc. If you have nominal types or iso recursion already, then this is a fine solution, too.

So, there are varous well-understood ways in which what's currently written in the doc could be fleshed out without running into any decidability problem. Nor does it exclude the addition of nominally recursive types. (And of course, as usual, it is to be expected that whatever typing rules we adopt, they won't be able to embed all possible source type systems without extra overhead.)

Consequently, I agree with previous comments that there is no need for change.

@@ -773,7 +771,7 @@ There are a number of reasons to make RTTs explicit:

* It allows more choice in producers' use of RTT information, including making it optional (post-MVP), in accordance with the pay-as-you-go principle: for example, structs that are not involved in any casts do not need to pay the overhead of carrying runtime type information (depending on specifics of the GC implementation strategy). Some languages may never need to introduce any RTTs at all.

* Most importantly, making RTTs explicit separates the concerns of casting from Wasm-level polymorphism, i.e., [type parameters and fields](#type-paraemters-and-fields). Type parameters can thus be treated as purely a validation artifact with no bearing on runtime. This property, known as parametricity, drastically simplifies the implementation of such type parameterisation and avoids the substantial hidden costs of reified generics that would otherwise hvae to be paid for every single use of type parameters (short of non-trivial cross-procedural dataflow analysis in the engine).
* Most importantly, making RTTs explicit separates the concerns of casting from Wasm-level polymorphism, i.e., [type parameters and fields](#type-paraemters-and-fields). Type parameters can thus be treated as purely a validation artifact with no bearing on runtime. This property, known as erasure, drastically simplifies the implementation of such type parameterisation and avoids the substantial hidden costs of reified generics that would otherwise have to be paid for every single use of type parameters (short of non-trivial cross-procedural dataflow analysis in the engine).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, erasure is not a property, it is a code transformation or implementation technique that is enabled by the parametricity property.

* Expansive-recursive structures, such as `datatype 'a T = EMPTY | NODE of 'a * (('a T) T)` in ML
* Impredicative bounded polymorphism, such as bounded generic methods in C#, or the bounded existential types used to describe (at the assembly level) Java/C# covariant arrays or Kotlin `in`/`out` type projections.

Each of these features is undecidable on its own, and although the respective surface languages are decidable, their low-level encoding as structural types is not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another false statement. For example, Harper/Stone's formalisation of Standard ML proves the opposite, and does not involve anything particularly fancy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, these are not undecidable on their own as the surface languages are decidable. It is their respective extensions to the MVP that would be undecidable. For example, unlike the MVP, Harper/Stone's formalization has no subtyping (at this level), which it relies on for decidability. I pushed a qualification of the statement.

@@ -403,6 +403,16 @@ However, there are a number of challenges:

In general, the semantics and implementation of type parameters should be analogous to that of type imports. Ideally, in the presence of the [module linking proposal](https://github.com/WebAssembly/module-linking), it should even be possible to explain definitions with type parameters as shorthands for nested modules (well, at least for 2nd-class cases).

### Fundamental Limitations

Unfortunately, languages with generics exhibit one or both of the following features:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written, this is a false statement. Ocaml would be an obvious counter example, which requires uniform recursion for both structural and nominal types and has no bounded quantification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OCaml appears to only restrict recursive polymorphic functions to be uniformly recursive. The following expansive-recursive type seems to compile in OCaml: type 'a t = EMPTY | NODE of 'a * (('a t) t).


Each of these features is undecidable on its own, and although the respective surface languages are decidable, their low-level encoding as structural types is not.
As such, it is unlikely that the MVP will be able to be extended to support cast-free implementations of languages with generics, though the MVP might be extended with built-in support for common special cases of generics such as covariant arrays.
Nonetheless, we might find other uses for parametric polymorphism, and the following offers an impression of what such an extension might look like.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be pedantic, the alleged problem does not even affect polymorphism (value-level generics), it only involves the semantics of parameterised type constructors. Polymorphism as such is fine, useful, and hugely relevant regardless of how type definitions or type recursion work.

@RossTate
Copy link
Contributor Author

@rossberg I suspect you misunderstand the issues and the evidence supporting them as your response does not address the issues nor the evidence.

First, every suggestion you offered seems to be in response to only expansive recursion, which is only one of the issues I raised.

Second, your suggestions seem to be countering a claim I did not make above. They seem to be countering the unmade claim "The features in the Post-MVP cannot be made decidable", whereas the actual claim was "any structural type system ... rich enough to express common languages with generics would also necessarily be undecidable". In each of your suggestions, you attempt to make the system decidable by restricting its expressiveness, but you do not consider the safe programs that would be rejected due to these restrictions.

Third, I showed how even a very limited language would need undecidable expansive-recursive subtyping to describe the structure of its object representation such that it could guarantee all operations are safe without casting, offering evidence to back the claim I made above. You did not respond to the example or the larger issue. (Note that this is not the only such evidence I offered—it's the evidence I offered with respect to expansive recursion.)

Now I'll go through each of the suggestions in more detail:

The simple solution that works for equi-recursion is to require uniform recursion between types. Ocaml would be an obvious example of this approach, which applies it to both structural and nominal types.

To give everyone context, with type parameters and type recursion you can define the type construct List E in two ways: fix T : * -> *. fun (E : *) => 1 + (E * (T E)) or fun (E : *) => fix T : *. 1 + (E * T). The former takes a fixpoint on a function on types, whereas the latter binds a type parameter that is then used to define a fixpoint on types. Uniform type recursion restricts to the latter case—you are not allowed to take fixpoints on functions on types (i.e. no fix T : * -> *).

This restriction makes the given expansive-recursive type type 'a t = EMPTY | NODE of 'a * (('a t) t) inexpressible, so the described solution here is to limit the code/languages expressible in WebAssembly. (Note, this type does compile in OCaml—it's just the flatten function that doesn't due to lack of polymorphic recursion, but other functional languages that support polymorphic recursion or higher-kinded polymorphism can express flatten.)

Another example of non-uniformly recursive types in practice in Java's (extremely important) Stream library, where a Stream<T> has methods such as <R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper). Of course, this erases, for now...

An unerased example comes from C#'s precursor to Java's Stream library: LINQ. LINQ's performance relies on reflecting on the query expression, and its Expression class requires expansive recursion: it has a method Accept(ExpressionVisitor), and ExpressionVisitor has a method VisitLambda<T>(Expression<T>) where Expression<T> extends Expression.

Another solution is to model type recursion with iso-recursive types at constructor kind. That is e.g. the method used to model datatypes in Harper/Stone's formalisation of Standard ML and its module system, and in other module literature.

Harper/Stone's formalization has no subtyping (at this level), so such appeals are not informative. There are, however, systems with iso-recursive subtyping. Unfortunately, the combination of iso-recursion and subtyping is complicated and has such has many variations. But the weakest variation that can express the inheritance relationships between generics is undecidable (the reduction I gave in my presentation applies to such iso-recursive types). So using iso-recursive types likely either makes the type system impractically inexpressive or still exhibits the existing undecidability properties.

The ad-hoc solution is to impose limits on expansion depth. That's e.g. the C++ way, and probably the least elegant but most widely used solution.

I'm not entirely sure what you mean here, but all my guesses result in rejecting the above programs, which does not address the issue.

Another obvious solution is to disallow type synonyms to be recursive altogether. That's e.g. the case in Scala, Rust, ML type synonyms, etc. If you have nominal types or iso recursion already, then this is a fine solution, too.

Type synonymous are layered on top of something else. The issue is what is that something else and is it decidable. Many languages that have no type recursion still have recursive structural types. So I get the impression that this is an attempt to handwave around the issue.

@RossTate
Copy link
Contributor Author

@rossberg Do you still have unaddressed concerns with the feasibility analysis? It would be useful to resolve this so that we can start planning accordingly. (For example, this affects the tradeoffs in the options being considered in WebAssembly/function-references#44.)

@rossberg
Copy link
Member

Yes, this PR still has all the issues. Its overall thrust is highly non-constructive and the quintessential argument is a non-sequitur -- again, one feature not being expressive enough for some future optimisations does not preclude the future addition of features to cover these. There is a completely unfounded either-or premise underlying the PR.

(I considered replying to your earlier comment in detail, but honestly I don't think it would get us anywhere.)

@RossTate
Copy link
Contributor Author

If you do not find this to be a constructive path for engaging in critical discussion of the issues, then please let me know of a path that would be. I feel like I have tried many others to no avail. Most recently I gave presentations to illustrate the issues in more depth, including examples specifically chosen to demonstrate why the decidability results in module theory I suspected you were relying on (as you just did above) would not transfer to this setting. You chose not to attend those presentations, nor even to indicate that I should move their dates if I wanted you to attend.

When you describe your concerns with nominal types, I follow up with requests for examples to illustrate those points and read the papers you reference so that I can better understand your concerns, and then I offer illustrations of how those concerns can be addressed and repeat until all your examples seem to be covered. What should I do for you to do the same with my concerns?

@RossTate
Copy link
Contributor Author

RossTate commented Feb 9, 2021

@rossberg Do you have a suggestion?

@rossberg
Copy link
Member

For now, I suggest closing the PR. ;) The proposed text is misleading and serves no constructive purpose that I can tell.

@RossTate
Copy link
Contributor Author

You don't seem to have a constructive suggestion to offer. Until you demonstrate otherwise, I will proceed with the understanding that there is no way to get you to productively engage in any issue that suggests fundamental flaws in your design, which seems consistent with your conduct so far.

@titzer
Copy link
Contributor

titzer commented Mar 24, 2022

Re-reading this discussion, I do not see sufficient consensus for this PR to land, so I'll close this.

If we want to revisit the language used to describe potential directions for post-MVP parametric polymorphism, I think we'll need a reset on this conversation.

@titzer titzer closed this Mar 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants