This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Casting null #152

Closed
titzer opened this issue Oct 12, 2020 · 6 comments


@titzer
Contributor

titzer commented Oct 12, 2020

Currently, ref.cast is spec'd to trap on null, and always return a non-null reference. For some languages, e.g. Java, casts of a null succeed. Thus to implement this behavior, user code must insert a branch checking for null first. To avoid making that check for null redundant with the null check in ref.cast, br_on_null is a natural fit, since it refines the type to be non-null in the fallthrough. But br_on_null pops the null value off of the operand stack, so we end up with a sequence like:

(block $l1 (ref null $t2))
  (block $l2)
    (br_on_null $l2 x)
    (ref_cast $t1 $t2)  ;; <-- benefits from knowing x non-null
    (br $l1)
  (end)
  (ref_null $t2)
(end)

Or another option:

(ref_is_null x)
(if (ref null $t2))
  (ref_null $t2)
(else)
  (ref_cast $t1 $t2 x)  ;; <-- x not known to be non-null here
(end)

(note the above are simplified, omitting the RTT values)

Instead, both of these sequences could be simplified if we change the semantics of ref_cast. For example, if ref_cast has a type immediate that is allowed to be a nullable type, then the nullability could be used to indicate whether a null operand should succeed or trap.
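To make the difference concrete, here is a minimal Python sketch of the two behaviors under discussion. It is only a model, not WebAssembly: `Trap`, `Base`, `Derived`, `ref_cast`, its `nullable` flag, and `java_style_cast_today` are hypothetical names introduced for illustration, RTT values are replaced by plain Python classes, and `None` stands in for a null reference.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap (hypothetical model, not a real API)."""


class Base:
    """Stand-in for the input type $t1."""


class Derived(Base):
    """Stand-in for the target type $t2, a subtype of $t1."""


def ref_cast(value, target, nullable=False):
    """Model of ref.cast with an assumed nullability flag on the target type.

    nullable=False mirrors the behavior described as currently spec'd:
    a null operand traps. nullable=True mirrors the proposed variant:
    a null operand passes through unchanged.
    """
    if value is None:
        if nullable:
            return None
        raise Trap("cast of null")
    if not isinstance(value, target):
        raise Trap("cast failure")
    return value


def java_style_cast_today(value, target):
    """Producer-side workaround: branch on null before the trapping cast."""
    if value is None:  # explicit check, as with br_on_null or ref_is_null
        return None
    return ref_cast(value, target)  # the null check inside is now redundant
```

Under these assumptions, `ref_cast(x, Derived, nullable=True)` does in one step what `java_style_cast_today(x, Derived)` needs an extra branch for, while a non-null value of the wrong type traps in both.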

@skuzmich

Bumped into this in the Kotlin compiler. It would be really nice to have a version of ref_cast that doesn't trap on null.

Also, in the second example above, the value of x is used twice, requiring an additional local.tee and local.get.

@Horcrux7

I do not see why I should not be able to cast a null value to another nullable type. It is also inconsistent, because a null value requires a type for ref.null.

@titzer
Contributor Author

titzer commented Nov 15, 2020

The use case I care most about is, e.g. downcasting from (the wasm representation of) java/lang/Object to java/lang/String. In Java, this cast will succeed if the input value is null. I am not as interested in casting across incompatible class hierarchies, where the only value that would succeed would be null (e.g. trying to cast from java/lang/Class to java/lang/String). So I think it's reasonable that code validation check the target type is a subtype of the input type.

@rossberg
Member

Hm, I see the reasoning, but there is a trade-off. It would change the typing from

ref.cast : [(ref null t1) (rtt $t2)] -> [(ref $t2)]

to

ref.cast : [(ref nl t1) (rtt $t2)] -> [(ref nl $t2)]  (where nl in {null, epsilon})

This would be a potential extra cost in contexts where the producer knows out of band that the argument can't be null. It would be forced to insert an additional ref_as_non_null, i.e., an extra null check, in some cases. But that's probably fine and easy to optimize in the engine in most cases? Or is it worth having two versions of the instruction?
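The two typing rules can be compared with a small Python sketch. This is a model of the static types only, under the assumption that a reference type is just a heap type name plus a nullability bit; `RefType`, `cast_type_current`, `cast_type_proposed`, and `ref_as_non_null_type` are hypothetical helper names, not part of any spec or engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefType:
    """A reference type: (ref $heap) or (ref null $heap)."""
    heap: str
    nullable: bool

    def __str__(self):
        return f"(ref null {self.heap})" if self.nullable else f"(ref {self.heap})"


def cast_type_current(operand: RefType, target: str) -> RefType:
    # [(ref null t1) (rtt $t2)] -> [(ref $t2)]: the result is always non-null
    return RefType(target, nullable=False)


def cast_type_proposed(operand: RefType, target: str) -> RefType:
    # [(ref nl t1) (rtt $t2)] -> [(ref nl $t2)]: nullability carries over
    return RefType(target, nullable=operand.nullable)


def ref_as_non_null_type(operand: RefType) -> RefType:
    # Typing of the extra null check a producer would have to insert
    return RefType(operand.heap, nullable=False)
```

With a statically nullable operand such as `RefType("$t1", True)`, the current rule yields `(ref $t2)` directly, whereas the proposed rule yields `(ref null $t2)` and needs a following `ref_as_non_null_type` step to reach the same type. That extra step is the cost in question.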

Another, more theoretical problem I have is about coherent semantics. We unfortunately were forced to remove the nullref type and introduce type-indexed null values, due to WebAssembly/reference-types#87. I always feared that would bite us, and it does here. Consider:

(local $r (ref null $t))
(local.set $r (ref.null $t1))        ;; $t1 <: $t
(ref.cast (local.get $r) (rtt $t2))  ;; $t2 <: $t

Even in cases where $t1 <: $t2 does not hold, this would have to succeed and return (ref.null $t2), because engines have no way of checking that this is an "incompatible" null value (otherwise they would be forced to implement runtime type information on null values). So the spec would have to allow casting null between incompatible sibling types. Ugly. We can pave over that in a number of ways, e.g., most naturally by removing the type index on null values, as opposed to instructions. But then we lose principal types, due to the lack of a nullref type, unless we reintroduce that. (I expect that very few people care about issues like this, but they are first signs of cancer of a type system. Of course, it's all solvable with hacks, but it may cause cascading issues down the road.)
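The sibling-type situation can be illustrated with a short Python model. It assumes, as the paragraph above does, that every null has the same runtime representation and carries no type information; `T`, `T1`, `T2`, `ref_null`, and `null_permitting_cast` are hypothetical names, and `None` plays the role of null.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap (hypothetical model)."""


class T:
    """Common supertype $t."""


class T1(T):
    """$t1 <: $t."""


class T2(T):
    """$t2 <: $t, a sibling of $t1 (neither is a subtype of the other)."""


def ref_null(static_type):
    # Assumption: the type index exists only statically; at runtime
    # every null is the same value with no type information attached.
    return None


def null_permitting_cast(value, target):
    if value is None:
        # The engine cannot tell which type index this null was created with,
        # so it has no basis for rejecting the cast.
        return None
    if not isinstance(value, target):
        raise Trap("cast failure")
    return value


r = ref_null(T1)                      # (local.set $r (ref.null $t1))
result = null_permitting_cast(r, T2)  # (ref.cast (local.get $r) (rtt $t2))
```

In this model `result` is null even though `T1` is not a subtype of `T2`, while a non-null `T1` instance cast to `T2` still traps. Only the null case crosses between the sibling types.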

@jakobkummerow
Contributor

> Or is it worth having two versions of the instruction?

Seems reasonable, and matches what I've heard from toolchain authors: both null-permitting and null-rejecting casts are useful.

@rossberg
Member

Closing via #161.
