
Pattern to test if an object is a record type and then destructure it #2218

Closed · munificent opened this issue Apr 28, 2022 · 6 comments
Labels: patterns (Issues related to pattern matching.)

@munificent (Member)

TL;DR: Since record patterns are purely structural and work on any type, there's no built-in pattern syntax that lets you actually test to see if a value is some record type. Should we have that?

The patterns proposal includes a record pattern syntax that looks like this:

MapEntry m = MapEntry('key', 123);
var (key: k, value: v) = m;
print('$k -> $v');

This prints key -> 123. This works both with instances of user-defined classes, as in the example above, and with record values:

var m = (key: 'key', value: 123);
var (key: k, value: v) = m;
print('$k -> $v');

This also prints key -> 123. The way it is able to handle both records and user-defined types is that its semantics are purely structural. It simply calls the corresponding getters on the matched object. It is a compile-time error if the static type of the object doesn't define a getter with that name.
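
Concretely, the destructuring above amounts to something like this (an illustrative hand-written equivalent, not the specified lowering):

// Each field subpattern is simply an independent getter call on the matched value.
MapEntry m = MapEntry('key', 123);
var k = m.key;   // the `key:` subpattern
var v = m.value; // the `value:` subpattern
print('$k -> $v'); // key -> 123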

This is nice because it lets this very pleasant syntax generalize across a wide range of objects. However, it does leave one hole. Consider:

// Prints the key of [obj], which can be a MapEntry or a record with a field
// named key.
void printKey(Object obj) {
  switch (obj) {
    case MapEntry(key: k): print(k);
    case (key: k): print(k);
    case _: throw ArgumentError('Not a MapEntry or record.');
  }
}

This does not work. The second case here doesn't mean "if obj is a record with field key then look up the field". It only means "access the key getter on obj", and since Object doesn't define a getter named key, that's a compile-time error.

In the current proposal, there is no pattern to match a specific record type and then destructure it. You can match a record type using a variable pattern with a type annotation:

void printKey(Object obj) {
  switch (obj) {
    case MapEntry(key: k): print(k);
    case {key: String} record: print(record.key);
    case _: throw ArgumentError('Not a MapEntry or record.');
  }
}

This works. The {key: String} record pattern is a variable pattern annotated with the type {key: String}, a record type with a single named field key. But you can't check a value's type and then destructure it in a single pattern: there's no "extractor pattern" for record types, because extractor patterns require the type to be named. You could do:

typedef KeyRecord = {key: String};

void printKey(Object obj) {
  switch (obj) {
    case MapEntry(key: k): print(k);
    case KeyRecord(key: k): print(k);
    case _: throw ArgumentError('Not a MapEntry or record.');
  }
}

That works, but it seems strange to have to define a typedef for any record type you want to match and destructure. Matching and destructuring actual records seems like it should be a bread-and-butter feature of any pattern matching system.

I can think of a few solutions:

Keep it as-is

If you need to test if a value is some specific record type, do one of the above workarounds.

This sounds bad, but I suspect that most of the time when matching on records, you do know that it is a record with a specific shape already and all you are doing is destructuring it. In those cases, the current behavior works fine:

switch ((x: 1, y: 2)) {
  case (x: 0, y: 0): print('origin');
  case (x: x, y: 0): print('x axis');
  case (x: 0, y: y): print('y axis');
  case (x: x, y: y): print('somewhere');
}

The above code works fine with the current proposal. It's only when you are matching on a value whose static type is some supertype of records (like dynamic, Object, Object?, or a nullable record type) that you may want to first ask "is this actually a record?" in the pattern. And, as long as you aren't also destructuring, you can always just use a variable pattern to do that.
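
For example, a typed variable pattern can answer "is this actually a record?" without destructuring it (a sketch using the record type syntax from above; the function name is just for illustration):

bool isKeyRecord(Object obj) {
  switch (obj) {
    // Matches only if obj is a record of type {key: String}.
    case {key: String} record: return true; // `record` is bound but unused here
    case _: return false;
  }
}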

Use curly braces for getter destructuring

We could use different syntax for record value matching and getter destructuring. The obvious syntax for matching actual record values mirrors the record expression syntax, which is currently used for getter destructuring. It would probably be weird if the syntax for record matching didn't look like a record and didn't look like a record pattern in other languages. That implies using (a: a, b: b) for records and coming up with something different for structural destructuring.

One potential option is curly braces:

var {key: k, value: v} = MapEntry('key', 123);

I think that looks OK. We would likely want to use the same braces for extractor patterns since those are structural too. (A structural getter pattern without a type is basically an extractor on the same type.) Like:

void printKey(Object obj) {
  switch (obj) {
    case MapEntry {key: k}: print(k);
    case (key: k): print(k);
    case _: throw ArgumentError('Not a MapEntry or record.');
  }
}

I can't say I love that. One of my goals with this feature is to allow familiar algebraic datatype style patterns in Dart with a minimum of ceremony and a maximum of consistency with other languages. This might look strange to users coming from Scala/Rust/Swift:

switch (option) {
  case None {}: print('no value');
  case Some {value: var v}: print('value $v');
}

On the other hand, using Name { ... } for getter-based extractor patterns does free up the Name(...) syntax for user-defined custom extractor protocols. That would let you write code like:

switch (option) {
  case None(): print('no value');
  case Some(var v): print('value $v');
}

Provided that None and Some define the protocol to let them be matched that way.

The bigger problem is that this collides with map patterns, which also naturally want braces to mirror map literals. :(
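
For comparison, a sketch of a map pattern mirroring a map literal (assuming expression keys; the function name is illustrative):

void checkKey(Map<String, Object?> json) {
  switch (json) {
    // Map pattern: matches a map containing a 'key' entry.
    case {'key': var v}: print('key -> $v');
    // A brace-based getter pattern like {key: var v} would occupy essentially
    // the same syntactic space.
    case _: print('no key entry');
  }
}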

Change the meaning based on the static type of the matched value

We could say that if the static type of the value being matched is a supertype of the record types, then a record pattern also first means "check that this is a record with this shape". In other contexts, it remains purely structural.
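
Under this option, the original example would work as written (a sketch of the intended reading, not a concrete proposal):

void printKey(Object obj) {
  switch (obj) {
    case MapEntry(key: k): print(k);
    // With Object as the matched type, this would first check "is obj a
    // record with a `key` field?" and only then call the getter.
    case (key: k): print(k);
    case _: throw ArgumentError('Not a MapEntry or record.');
  }
}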

I'm not in favor of this because I think it's subtle and brittle under refactoring. I've tried very hard to make the pattern syntax not be dependent on the type context to know what it means.

I'm somewhat leaning towards keeping the proposal as it is. I believe it optimizes for the most common cases by giving them the nicest syntax. And in the rare cases where you do want to test if an object is a record, there are a few workarounds to accomplish it. But I'm not thrilled with the current design either.

@lrhn (Member) commented Apr 28, 2022

I'm generally more positive about letting the static type matter. It's actually how I had understood the current proposal.

My take, taken to its extreme, is:

The syntax for property extraction is an extension of the syntax for type matching, so TypeName(propertyMatches) or TypeName<TypeArgPatterns>(propertyMatches). There is no property matching without a type match, no purely structural match, but the type can be inferred instead of being explicit.

Seeing (propertyMatches) by itself, we have to infer a type.
If the expression being matched has static type S, then

  • If S is a record type, then that record type is inferred as the type of the property match.
  • If S is not a supertype of any record type (not Object or a supertype of Object, and not FutureOr<R> or R? where R is a record type or a supertype of a record type), then we infer S as the type of the property match.
  • (If S is R? or FutureOr<R> and R is a record type, we may want to infer R, but only because the union types have no useful properties. Might not generalize if we add union types in general).
  • Otherwise (Object or a supertype of Object, which means the value might be a record, and we have no clue which structure that record might have), we infer the record type corresponding to the structure of (propertyMatches), which has exactly the positional and named members being matched, and with the type of each element being inferred from the matching patterns.

That means:

var p = (length: 2);  // inferred type (length: int)
var l = [1, 2]; // List<int>
var pq = p as (length: int)?;
var fop = p as FutureOr<(length: int)>;
var d = p as dynamic;

var (length: l1) = p; // Inferred (length: int)(length: l1), even if we can't syntactically write that.
var (length: l2) = l; // Inferred List<int>(length: l2)
var (length: l3) = pq; // compile-time error, implicit downcast.
var (length: l5) = d; // Inferred  (length: dynamic)(length: l5) = d as (length: dynamic);

switch (p) {
  case (length: var x): // inferred: (length: int)(length: var x)
}
switch (l) {
  case (length: var x): // inferred: List<int>(length: var x)
}
switch (pq) { // or fop
  case (length: var x): // inferred (length: int)(length: var x)   (guessing you want to match the record type below the `?`).
}
switch (d) {
  case (length: var x): // Inferred (length: int)(length: var x)   from the structure.
}

Now, this all worked because it was the same type.
However, if we have

var p2 = (length: 2, other: "no");  // inferred type (length: int, other: String)
var o2 = p2 as dynamic;

var (length: x) = o2; // Inferred var (length: dynamic)(length: x) = o2 as (length: dynamic);  // downcast fails

switch (o2) {
  case (length: var x): // inferred (length: dynamic)(length: var x), which fails to match (length: int, other: String).
}

That's because there is no structural match, ever. Not even on dynamic (it gets an implicit downcast in assignments, but that's it - there is nothing you can do with dynamic that you couldn't do by downcasting the value to its actual type.)

The idea here is the same as the inference of variable declarations today: if you don't write a type, one is inferred from the type of the initializer expression. Here we just have a fallback in case the initializer expression is also not useful.
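
For comparison, a quick sketch of today's declaration inference:

var n = 1;                            // inferred as int from the initializer
var xs = [1, 2, 3];                   // inferred as List<int>
var entry = (key: 'key', value: 123); // inferred as the record type of the literal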

(This also hinges on my wish to make record types highly optimizable. I don't want to ever be able to access a record member without knowing the structure of the record it's being accessed on. That means that var (foo:, bar:) = dynamic; cannot match a record of any type that has a foo and bar named member, it needs to know which record type it's doing the access on, which is why it must be inferred to mean var (foo:, bar:) = dynamic as (foo: dynamic, bar: dynamic);, which fails if the value is not a record or it has more members.)

@munificent (Member Author)

> The syntax for property extraction is an extension of the syntax for type matching, so TypeName(propertyMatches) or TypeName<TypeArgPatterns>(propertyMatches). There is no property matching without a type match, no purely structural match, but the type can be inferred instead of being explicit.

That's an interpretation, but I don't think it's the simplest one. My mental model is:

  1. A pattern like (subpatterns...) is simply a series of unrelated getter calls on the matched value, one for each subpattern. You can think of it almost like a "pattern cascade" where each field subpattern is just an independent operation on the matched value to call a getter and then match against the result.
  2. In order for the above to be sound, the static type of the matched value must define a getter corresponding to each field. Note that the (subpatterns...) pattern doesn't have any kind of structural type. We just use the type of the matched value.
  3. An extractor pattern like Foo(subpatterns...) amounts to two separate operations: check that the value is a Foo; if so, go to 1 with that as the matched value's type; else fail. If we had and patterns that let you apply two separate patterns to the same value, then an extractor pattern would essentially just be sugar for Foo _ and (subpatterns...), as the sketch after this list shows.
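
Roughly, that model reads like this hand-written equivalent (illustrative only; the function name and the explicit type test are not part of the proposal):

bool matchMapEntry(Object obj) {
  // The `MapEntry` part of `MapEntry(key: k, value: v)`: a plain type test.
  if (obj is! MapEntry) return false;
  // The field subpatterns: independent getter calls on the narrowed value.
  var k = obj.key;
  var v = obj.value;
  print('$k -> $v');
  return true;
}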
> • Otherwise (Object or a supertype of Object, which means the value might be a record, and we have no clue which structure that record might have), we infer the record type corresponding to the structure of (propertyMatches), which has exactly the positional and named members being matched, and with the type of each element being inferred from the matching patterns.

This step really feels like a stretch to me. It means that if you change the type of the matched value, you may change the behavior of a record pattern. Worse, this is a silent change that could simply cause cases that used to match to stop matching. In practice, if you don't have a default case, the exhaustiveness checker will probably tell you something fishy is going on. But it still feels more subtle than I'd like the feature to be.

@lrhn (Member) commented Apr 28, 2022

That's a valid and consistent model too.

The checked object must have a type. If the pattern contains a type Foo(...), then the object is cast to Foo before testing the properties. If not, the type of the matched object is used as-is.

As you then point out, that leaves us without a good way to write a record type in front (or a function type, possibly) without using a typedef, unless we allow case (int, int)(var foo, var bar): .... That is different from case (int foo, int bar), which doesn't cast the object to a record type, and we need that cast to check whether an Object is a record type. It also needs a specific record type.
(Since functions have no interesting properties, not being able to write a function type in front is less of a problem.)

It also makes it unclear (or very clear, but slightly surprising) what happens if the target is typed as dynamic.

Can you do:

switch (o as dynamic) {
  case (foo:, bar:): print("$foo:$bar");
}

If yes, then it's doing dynamic .foo and .bar lookups, but only if the object has those.
If the object has no foo or bar property, the case is rejected without trying to read that property - no triggering noSuchMethod. (Is foo read before bar is checked for?)

That means it's actually possible to check whether an object has a member named n by doing

bool hasN(Object o) {
  try {
    // Under the structural semantics hypothesized above, the pattern fails
    // (without calling the getter) if `o` has no `n` member.
    if (o as dynamic case (n: _)) return true;
  } catch (e) {
    // Reading `n` threw, so the member exists (even if its getter throws).
    return true;
  }
  return false;
}

That method distinguishes an object with an n member, even one which throws when read, from one without an n member, even one with a custom noSuchMethod.

So, maybe just disallow property checks on dynamic?

That's a new feature: Being able to tell whether an object has a member, without necessarily reading it.
(Is it read if the pattern is _? Probably, but not certainly.)

@munificent (Member Author)

> If yes, then it's doing dynamic .foo and .bar lookups, but only if the object has those.
> If the object has no foo or bar property, the case is rejected without trying to read that property - no triggering noSuchMethod. (Is foo read before bar is checked for?)

The current proposal doesn't do that. It just unconditionally attempts to call the getter, which will throw if the getter isn't actually defined. The static semantics say:

> Calculate the type of f's named field subpatterns using the return type of the getter on M with the same name as the field as the matched value type. If M is dynamic, then use dynamic as the matched value type. It is a compile-time error if M is not dynamic and does not have a getter whose name matches the subpattern's field name.

And then the runtime semantics:

> Call the corresponding getter on v to get result r. ... If v has type dynamic, this getter call may throw a NoSuchMethodError, which we allow to propagate instead of treating that as a match failure.
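
So, roughly (a sketch of those semantics, with illustrative names):

void printKeyDynamic(dynamic d) {
  switch (d) {
    // The getter is called unconditionally, so if `d` has no `key` getter this
    // throws a NoSuchMethodError rather than failing the match.
    case (key: var k): print(k);
  }
}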

I think it would be weird if matching on a value of type dynamic exposed a new reflective capability you couldn't get elsewhere.

@lrhn (Member) commented May 3, 2022

It would be weird to have structural checks on objects at type dynamic, but not at any other type.

But that also means that purely structural pattern matching against dynamic is useless since it cannot fail. You need to do the type test and promote.

That is:

  switch (dynamicValue) {
    case (foo: int x, bar: int y): ... // always matches, sometimes throws.
    case ...: unreachable, first match completely exhausts.
  }

That's not a bad thing, it just means you need to have a type in the test pattern:

 switch (dynamicValue) {
   case Foo(foo: int x, bar: int y): ... // only matches Foo's.
   case Bar(...): now reachable
 }

And that's why we need a way to select a record type for matching, even if it's just:

  switch (dynamicValue) {
    case $<(foo: int, bar: int)>(foo: var x, bar: var y): ...
    ...
  }

(with typedef $<T> = T;).

What if we introduced a "type literal" syntax, < type >, which is always interpreted as a type, whether it appears in a type context or as a type literal expression? You don't need the < and > for all types, but for record types and function types in some contexts, you could add the <...> to disambiguate.
Then it would be:

  case <(foo: int, bar: int)>(foo: var x, bar: var y): ...

I don't think < can currently start an expression, and because it can only contain one type, commas won't be able to separate the < and >, and we don't allow chained < and > operators, so it just might be parsable.

@munificent (Member Author)

The revamped patterns proposal addresses this. Record patterns now only match record objects and you have to use an extractor to call getters on arbitrary objects.
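
Roughly, that split reads like this (a sketch; see the revamped proposal for the exact rules):

void printKey(Object obj) {
  switch (obj) {
    // Object (extractor) pattern: type test, then getter calls on MapEntry.
    case MapEntry(key: var k): print(k);
    // Record pattern: matches only a record with a named `key` field.
    case (key: var k): print(k);
    case _: throw ArgumentError('Not a MapEntry or record.');
  }
}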
