Implement bottom heap types #5115

tlively · 2022-10-05T02:53:03Z

These types, none, nofunc, and noextern are uninhabited, so references to
them can only possibly be null. To simplify the IR and increase type precision,
introduce new invariants that all ref.null instructions must be typed with one
of these new bottom types and that Literals have a bottom type iff they
represent null values. These new invariants requires several additional changes.

First, it is now possible that the ref or target child of a StructGet,
StructSet, ArrayGet, ArraySet, or CallRef instruction has a bottom
reference type, so it is not possible to determine what heap type annotation to
emit in the binary or text formats. (The bottom types are not valid type
annotations since they do not have indices in the type section.)

To fix that problem, update the printer and binary emitter to emit unreachables
instead of the instruction with undetermined type annotation. This is a valid
transformation because the only possible value that could flow into those
instructions in that case is null, and all of those instructions trap on nulls.

That fix uncovered a latent bug in the binary parser in which new unreachables
within unreachable code were handled incorrectly. This bug was not previously
found by the fuzzer because we generally stop emitting code once we encounter an
instruction with type unreachable. Now, however, it is possible to emit an
unreachable for instructions that do not have type unreachable (but are
known to trap at runtime), so we will continue emitting code. See the new
test/lit/parse-double-unreachable.wast for details.

Update other miscellaneous code that creates RefNull expressions and null
Literals to maintain the new invariants as well.

tlively · 2022-10-05T02:53:15Z

Current dependencies on/for this PR:

main
- PR Simplify and fix heap type counting #5110
  - PR Pin v8 10.8.104 #5112
    - PR Implement bottom heap types #5115 👈

This comment was auto-generated by Graphite.

These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants requires several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.

kripken

Nice!

Comments on the code so far.

Please update the changelog.

kripken · 2022-10-05T16:11:38Z

src/ir/effects.h

@@ -701,6 +701,10 @@ class EffectAnalyzer {
      }
    }
    void visitCallRef(CallRef* curr) {
+      if (curr->target->type.isNull()) {
+        parent.trap = true;


This is not necessary - we already set implicitTrap below. That could be reordered to the top, and then the early return would avoid other stuff if we need to avoid that.

I guess this does add more information in a sense, but optimizations should do this anyhow, I think?

Another concern I have here is how this interacts with trapsNeverHappen. We never set .trap directly so that we can gate on that flag.

Are you thinking that we should be setting implicitTrap when the target is null rather than setting trap? I think it's more consistent to set trap, just like visitUnreachable does. In particular, the CallRef will be emitted as (unreachable) in this case, so it would be strange if its effects were different from Unreachable's effects.

You're probably right that optimizations would take care of this anyway, but the code seems more correct to me as-is.

I was thinking we can set implicitTrap when the target is nullable, which would include null. You're right that the current code is more precise in a way, but I don't think we gain from the extra verbosity.

I guess it's fine, though, if we make sure that it doesn't limit us. Specifically, with trapsNeverHappen and the new code then a callRef of a null would no longer be removed by vacuum (like it doesn't remove an unreachable). We should have a test that optimize-instructions turns the callRef into an unreachable (which dce would then remove). (There might already be a test, sorry if so - I haven't gotten to those yet.)

It seems odd that trapsNeverHappen doesn't remove an unreachable, but I guess that's a separate topic.

Will make sure I add specific tests for the OptimizeInstructions changes.

kripken · 2022-10-05T16:15:54Z

src/literal.h

-    }
-    return false;
-  }
+  bool isNull() const { return type.isRef() && type.getHeapType().isBottom(); }


Suggested change

bool isNull() const { return type.isRef() && type.getHeapType().isBottom(); }

bool isNull() const { return type.isNull(); }

kripken · 2022-10-05T16:17:10Z

src/passes/Inlining.cpp

@@ -278,17 +283,26 @@ struct Updater : public PostWalker<Updater> {
  }
  void visitCall(Call* curr) {
    if (curr->isReturn) {
-      handleReturnCall(curr, module->getFunction(curr->target)->type);
+      auto* func = module->getFunction(curr->target);
+      handleReturnCall(curr, func->type.getSignature().results);


Suggested change

handleReturnCall(curr, func->type.getSignature().results);

handleReturnCall(curr, func->getResults());

src/wasm-builder.h

kripken · 2022-10-05T16:26:10Z

src/wasm-builder.h

@@ -1262,7 +1266,7 @@ class Builder {
    if (curr->type.isTuple() && curr->type.isDefaultable()) {
      return makeConstantExpression(Literal::makeZeros(curr->type));
    }
-    if (curr->type.isNullable()) {
+    if (curr->type.isNullable() && curr->type.getHeapType().isBottom()) {


Suggested change

if (curr->type.isNullable() && curr->type.getHeapType().isBottom()) {

if (curr->type.isNull()) {

src/wasm/wasm-binary.cpp

kripken · 2022-10-05T16:43:13Z

src/wasm/wasm-validator.cpp

@@ -2079,6 +2079,10 @@ void FunctionValidator::visitSelect(Select* curr) {
 }

 void FunctionValidator::visitDrop(Drop* curr) {
+  if (!curr->value->type.isConcrete() &&
+      curr->value->type != Type::unreachable) {
+    assert(false);


shouldBeFalse?

Looks like this is left over from some debugging. Will remove.

kripken · 2022-10-05T16:44:07Z

src/wasm/wasm-validator.cpp

+  }
+  shouldBeTrue(curr->type.getHeapType().isBottom(),
+               curr,
+               "ref.null must have a bottom heap type");


Can also check type.isNull(), though maybe that just needs to be an assert?

Sure, since we already checked that it's nullable, this can be isNull().

src/wasm/wasm.cpp

kripken

Some comments after perhaps half the tests.

test/lit/passes/remove-unused-module-elements-refs.wast

kripken · 2022-10-05T22:22:14Z

test/lit/passes/optimize-instructions-gc-tnh.wast

-  ;; TNH-NEXT:    (ref.as_data
-  ;; TNH-NEXT:     (ref.null any)
-  ;; TNH-NEXT:    )
-  ;; TNH-NEXT:   )


I guess the additional type info about nulls makes it optimizable now?

Comments on line 120-121 looks stale now though.

Yeah, the new types on nulls help. This whole second case seems stale, so I'll delete it.

kripken · 2022-10-05T22:24:03Z

test/lit/passes/merge-blocks.wast

+ ;; CHECK-NEXT:      (i32.const 1234)
+ ;; CHECK-NEXT:     )
+ ;; CHECK-NEXT:     (ref.null none)
+ ;; CHECK-NEXT:    )


This is no longer being optimized properly. MergeBlocks should have moved the nested drop to the outer block. Perhaps the pass is confused by the type of the null?

It looks like this is due to this code: https://github.com/WebAssembly/binaryen/blob/main/src/passes/MergeBlocks.cpp#L580-L589

I'll fix it by adding a parameter.

kripken · 2022-10-05T23:58:40Z

test/lit/passes/gufa.wast

@@ -1046,7 +1046,7 @@

  ;; CHECK:      (func $target-keep (param $x i32)
  ;; CHECK-NEXT:  (drop
-  ;; CHECK-NEXT:   (i32.const 1)
+  ;; CHECK-NEXT:   (unreachable)


This change shows the test isn't doing what it should any more. I think you need to replace the ref.null with something like a cast of a null to type $A, so that it sees $A there (but we must not use ref.func of this function).

(or local.get as I see you've been doing)

Pushed a fix for this one just now. (Possibly masking a determinism bug, since CI was getting a different output for this module than I was locally. Hopefully that will come back up in fuzzing.)

kripken · 2022-10-06T00:03:52Z

test/lit/passes/dae-gc-refine-params.wast

 ;; CHECK-NEXT:  )
 ;; CHECK-NEXT: )
 ;; NOMNL:      (func $various-params-set (type $ref?|${i32}|_ref?|${i32}|_=>_none) (param $x (ref null ${i32})) (param $y (ref null ${i32}))
- ;; NOMNL-NEXT:  (local $2 (ref null ${}))
- ;; NOMNL-NEXT:  (local.set $2
+ ;; NOMNL-NEXT:  (drop


See comment on new line 264, it is important to see a new local appear here for this test, so I guess we need a value other than null so subtyping is a problem (that forces the local to be added).

aheejin · 2022-10-06T09:24:18Z

I think I don't have the context leading to this PR.

What is the purpose of having the bottom heap types?
Are noextern or none the standard wat syntax of the GC proposal? Are they gonna be parsed elsewhere as well?

tlively · 2022-10-06T15:18:39Z

I think I don't have the context leading to this PR.

What is the purpose of having the bottom heap types?

The primary benefit is that module producers no longer need to come up with a specific type to annotate their ref.null instructions with. Instead, they can always do ref.null none (or ref.null nofunc for function references). This is useful because source languages usually do not have typed nulls, so frontends used to have to do some analysis to figure out what type to use for the annotation.

It's also useful for us because it gives us more specialized types for things that only ever return null values, and we can use that type information to optimize better.

Are noextern or none the standard wat syntax of the GC proposal? Are they gonna be parsed elsewhere as well?

Yes, they were introduced in the upstream spec in this PR: https://github.com/WebAssembly/gc/pull/310/files#diff-1df1ea4af362cad5073bac12a65046e4c21bdaa644f69e203b41687b933ac91dR44-R54

kripken · 2022-10-06T16:29:20Z

test/lit/passes/cfp.wast

  ;; CHECK-NEXT:    )
-  ;; CHECK-NEXT:    (f32.const 42)
+  ;; CHECK-NEXT:    (unreachable)


This looks wrong.

kripken · 2022-10-06T16:40:23Z

test/lit/parse-double-unreachable.wast

+   )
+  )
+
+  ;; Now we are in unreachable parsing mode. Due to the bug, we would have


sentence cut off?

kripken · 2022-10-06T16:43:49Z

test/lit/heap-types.wast

  (type $struct.A (struct i32))
  ;; NOMNL:      (type $struct.B (struct_subtype (field i32) data))
  (type $struct.B (struct i32))
  ;; CHECK:      (func $test
  ;; CHECK-NEXT:  (drop
  ;; CHECK-NEXT:   (ref.test_static $struct.A
-  ;; CHECK-NEXT:    (ref.null $struct.A)
+  ;; CHECK-NEXT:    (ref.null none)


The binary outputs for this test changed a lot, I think because the null now makes everything after it unreachable and not emitted. Might be better to make this read a local.

test/example/typeinfo.txt

kripken · 2022-10-06T16:47:10Z

test/example/cpp-unit.cpp

-    assert_equal(effects.writesStruct, false);
-  }
-}
-


why remove the above and the below?

This test didn't initialize the ArrayCopy's children, so it ended up reading junk data. Rather than fix it by building a complete expression here, it seemed simpler to delete it. This is not a comprehensive test of the effects API, so deleting this one test didn't seem too bad.

Similarly, I deleted the test below because it's output changed but it didn't seem like it was pulling its weight.

Good point about junk data. But it's likely the only place we test this, so I think it's worth fixing. I can do that in a tiny initial PR if you want. (I mean, just to create some kind of valid data for children - doesn't need to pass the validator even, but just be scannable.)

The test below I do agree is not worthwhile since lots of tests print i31s.

It's ok, I can fix it.

tlively · 2022-10-07T00:59:13Z

I believe all the feedback so far has been addressed.

kripken

Great! Code and tests lgtm. Please fuzz this carefully before landing though.

kripken · 2022-10-07T02:34:07Z

src/wasm-builder.h

@@ -1262,7 +1266,7 @@ class Builder {
    if (curr->type.isTuple() && curr->type.isDefaultable()) {
      return makeConstantExpression(Literal::makeZeros(curr->type));
    }
-    if (curr->type.isNullable()) {
+    if (curr->type.isNullable() && curr->type.isNull()) {


Why can't we just have the 2nd condition? (How can this have a null type, but that type is not nullable?)

For example this could be a Call of a function that returns (ref none). At runtime that call will never finish, but it is still a valid instruction with type (ref none). In that case we couldn't replace it with a ref.null because that would require changing the type to (ref null none).

Interesting...

So (ref none) is kind of like an unreachable type in wasm, that can actually be used as a function's return type..? Perhaps we could use that.

Yeah exactly. Although there are three of them ((ref none), (ref nofunc), and (ref noextern)), they don't actually put validation into unreachable mode, and the rest of the code is already set up to handle Type::unreachable correctly, so I think the best way to use them would be to translate them to unreachable or place an (unreachable) after any instruction that produces one.

tlively · 2022-10-07T13:01:24Z

Fuzzed 44k iterations overnight with no problems.

tlively requested a review from kripken October 5, 2022 02:53

This was referenced Oct 5, 2022

Pin v8 10.8.104 #5112

Merged

Simplify and fix heap type counting #5110

Merged

tlively requested a review from aheejin October 5, 2022 02:53

Base automatically changed from pin-v8-10.8.104 to main October 5, 2022 15:13

tlively added 3 commits October 5, 2022 08:57

nullopt

b5b453a

binaryen.js tests

d6c1a62

tlively force-pushed the bottom-types branch from b6c48c5 to d6c1a62 Compare October 5, 2022 16:31

kripken reviewed Oct 5, 2022

View reviewed changes

tlively added 4 commits October 5, 2022 12:32

Address feedback

05ecd5f

simplify some code

09d42f1

Add tests

328a068

Changelog

ddfebfa

kripken reviewed Oct 5, 2022

View reviewed changes

Replace ref.nulls with locals to avoid unreachable replacements in tests

5b556ca

kripken reviewed Oct 6, 2022

View reviewed changes

update gufa test

63e853e

kripken reviewed Oct 6, 2022

View reviewed changes

fix remove-unused-module-elements-refs test

6c33c30

kripken reviewed Oct 6, 2022

View reviewed changes

tlively added 6 commits October 6, 2022 10:18

remove stale optimize-instructions-tnh test

be4a1ff

fix merge-blocks test

f8c987b

fix dae-gc-refine-params.wast

d4fd05c

Remove unfinished, stale comment

2310c41

fix test/heap-types.wast

93d3d7d

fix test/lit/passes/cfp.wast

d6a1fda

Fix possible-contents handling of nulls

0d68f7d

tlively mentioned this pull request Oct 6, 2022

[Wasm GC] [GUFA] Add initial ConeType support #5116

Merged

tlively requested a review from kripken October 7, 2022 00:59

restore and fix ArrayCopy effects test

6c4933e

kripken approved these changes Oct 7, 2022

View reviewed changes

tlively merged commit 7fc26f3 into main Oct 7, 2022

tlively deleted the bottom-types branch October 7, 2022 13:02

MaxGraey mentioned this pull request Oct 8, 2022

Update Binaryen AssemblyScript/assemblyscript#2531

Closed

2 tasks

	bool isNull() const { return type.isRef() && type.getHeapType().isBottom(); }
	bool isNull() const { return type.isNull(); }

	handleReturnCall(curr, func->type.getSignature().results);
	handleReturnCall(curr, func->getResults());

	if (curr->type.isNullable() && curr->type.getHeapType().isBottom()) {
	if (curr->type.isNull()) {

Implement bottom heap types #5115

Implement bottom heap types #5115

Uh oh!

Conversation

tlively commented Oct 5, 2022

Uh oh!

tlively commented Oct 5, 2022

Uh oh!

kripken left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kripken left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aheejin commented Oct 6, 2022

Uh oh!

tlively commented Oct 6, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tlively commented Oct 7, 2022

Uh oh!

kripken left a comment

Choose a reason for hiding this comment