Anonymous methods #260

eernstg opened this issue Mar 8, 2019 · 32 comments
Labels
feature Proposed language feature that solves one or more problems

Comments

@eernstg
Member

eernstg commented Mar 8, 2019

In response to #259, this issue is a proposal for adding anonymous methods to Dart.

Edit: We could use the syntax e -> {...} in order to maintain syntactic similarity with the pipe operator proposed as #43, as mentioned here. See #265 for details. Later edit: Changed generic anonymous methods to be a separate extension of this feature, such that it is easier to see how it works without that part.

Examples

The basic syntax of an anonymous method is a term of the form .{ ... } which is added after an expression, and the dynamic semantics is that the value of that expression will be "the receiver" in the block, that is, we can access it using this, and methods can be called implicitly (so foo(42) means this.foo(42), unless there is a foo in the lexical scope).

To set the scene, take a look at the examples in #259, version 1 (original), 2 (unfolded), and 3 (folded). Here is a version of that same example that we could write using anonymous methods:

// Version 4, using anonymous methods.

void beginFrame(Duration timeStamp) {
  // ...
  ui.ParagraphBuilder(
    ui.ParagraphStyle(textDirection: ui.TextDirection.ltr),
  ).{
    addText('Hello, world.');
    build().{
      layout(ui.ParagraphConstraints(width: logicalSize.width));
      canvas.drawParagraph(this, ui.Offset(...));
    };
  };
  ui.SceneBuilder().{
    pushClipRect(physicalBounds);
    addPicture(ui.Offset.zero, picture);
    pop();
    ui.window.render(build());
  };
}

In this version the emphasis is on establishing the complex entities (the objects that we are creating and initializing) first. With the given ParagraphBuilder in mind, we can read the body of the anonymous method (where we add some text to that paragraph builder and build it). We continue to work on the paragraph returned by build, doing a layout on it, and then using it (this) in the invocation of drawParagraph.

Similarly, the second statement gets hold of the scene builder first, and then works on it (with three method invocations on an implicit this, followed by one regular function call to render).

Compared to version 3, this version essentially turns the design inside out, because we put all the entities on the table before we use them, which allows drawParagraph and render to receive some much simpler arguments than in version 3.

Compared to version 2, this version allows for a simple use of statements, without the verbosity and redundancy of using names like paragraphBuilder many times. This means that we can use non-trivial control flow if needed:

ui.SceneBuilder().{
  pushClipRect(physicalBounds);
  for (var picture in pictures) {
    addPicture(randomOffset(), picture);
  }
  pop();
  ui.window.render(build());
};

We can of course also return something from an anonymous method, and we can use that to get to a sequential form rather than the nested one shown above:

ui.ParagraphBuilder(
  ui.ParagraphStyle(textDirection: ui.TextDirection.ltr),
).{
  addText('Hello, world.');
  return build();
}.{
  layout(ui.ParagraphConstraints(width: logicalSize.width));
  canvas.drawParagraph(this, ui.Offset(...));
};

We can choose to give the target object a different name than this:

main() {
  "Hello".(that){ print(that.length); };
}

We can use a declared type for the target object in order to get access to it under a specific type:

main() {
  List<num> xs = <int>[];
  xs.(Iterable<int> this){ add(3); }; // Dynamic check.
}

Note that this implies that there is an implicit check that xs is Iterable<int> (which may be turned into a compile-time error, e.g., by using the command line option --no-implicit-casts). Also, the addition of 3 to xs is subject to a dynamic type check (because xs could be a List<Null>).
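The two run-time checks mentioned above can be observed in today's Dart without any new syntax. The following is a minimal sketch, assuming a null-safe Dart SDK; the helper names addThrows and castToIntListFails are hypothetical, and the sketch casts to List<int> rather than Iterable<int> because add is declared on List.

```dart
// Hypothetical helper: reports whether adding [value] through a List<num>
// reference fails the run-time check on the covariant element type.
bool addThrows(List<num> xs, num value) {
  try {
    xs.add(value); // Checked against the actual element type of the list.
    return false;
  } on TypeError {
    return true;
  }
}

// Hypothetical helper: reports whether the downcast to List<int> fails.
bool castToIntListFails(List<num> xs) {
  try {
    xs as List<int>; // Checked against the run-time type of the list.
    return false;
  } on TypeError {
    return true;
  }
}

void main() {
  List<num> xs = <int>[];
  print(castToIntListFails(xs)); // false: xs really is a List<int>.
  print(addThrows(xs, 3)); // false: 3 is an int.
  print(addThrows(xs, 1.5)); // true: a double cannot go into a List<int>.
  print(castToIntListFails(<double>[])); // true: not a List<int>.
}
```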

As a possible extension of this feature, we can provide access to the actual type arguments of the target object at the specified type:

main() {
  List<num> xs = <int>[];
  xs.(Iterable<var X> this){ // Statically safe.
    print(X); // 'int', not 'num'.
    num x = 3; // Note that `xs.add(x)` requires a dynamic check.
    if (x is X) add(x); // Statically safe, no dynamic checks.
  };
}

We are using type patterns, #170, in order to specify that X must be bound to the actual value of the corresponding type argument for xs. We also specify that X must be a subtype of num, such that the body of the anonymous function can be checked under that assumption, and this is a statically safe requirement because the static type of xs is List<num>.

Finally, if the syntax works out, we could extend this feature to provide a new kind of function literal: an anonymous method used as an alternate syntax for a function literal that accepts a single argument. Such a function literal takes exactly one argument, which is named this, and the body supports implicit member access to this, just like an instance method and an anonymous method:

main() {
  List<String> xs = ['one', 'two'];
  xs.forEach(.{ print(substring(1)); }); // 'ne', 'wo'
}

Proposal

This is a draft feature specification for anonymous methods in Dart.

Syntax

The grammar is modified as follows:

<cascadeSection> ::= // Modified
    '..'
    <cascadeHeadSelector>
    <cascadeTailSelector>*
    (<assignmentOperator> <expressionWithoutCascade>)?

<cascadeHeadSelector> ::= // New alternative added
    <cascadeSelector> <argumentPart>*
  | <anonymousMethod>

<cascadeTailSelector> ::= // New alternative added
    <assignableSelector> <argumentPart>*
  | <anonymousMethodSelector>

<selector> ::= // New alternative added
    <assignableSelector>
  | <anonymousMethodSelector>
  | <argumentPart>

<anonymousMethodSelector> ::= // New
    '.' <anonymousMethod>
  | '?.' <anonymousMethod>

<anonymousMethod> ::= // New
    <typeParameters>? ('(' <normalFormalParameter> ')')? <block>

<simpleFormalParameter> ::= // Modified
    <declaredIdentifierOrThis>
  | 'covariant'? <identifierOrThis>

<identifierOrThis> ::= <identifier> | 'this'

<declaredIdentifierOrThis> ::=
    'covariant'? <finalConstVarOrType> <identifierOrThis>

Static Analysis

It is a compile-time error if a formal parameter is named this, unless it is a parameter of an anonymous method or a function literal.

An anonymous method invocation of the form e.{ <statements> } or e..{ <statements> } is treated as e.(T this){ <statements> } or e..(T this){ <statements> }, respectively, where T is the static type of e.

An anonymous method invocation of the form e?.{ <statements> } is treated as e?.(T this){ <statements> } where T is the static type of e (with non-null types, T is the non-null type corresponding to the static type of e).

The rules specifying that an expression e starting with an identifier id is treated as this.e in the case where id is not declared in the enclosing scopes remain unchanged.

However, with anonymous methods there will be more situations where this can be in scope, and when an anonymous method is nested inside an instance method, the type of this will be the type of the receiver of the anonymous method invocation, not the enclosing class.

In an anonymous method invocation of the form e.(T id){ <statements> } or e..(T id){ <statements> }, it is a compile-time error unless the static type of e is assignable to T. (Note that id can be this.) Moreover, it is a compile-time error if T is dynamic.

It is a compile-time error if a statement of the form return e; occurs such that the immediately enclosing function is an anonymous method of the form e..(T id){ <statements> }. This is because the returned value would be ignored, so the return statement would be misleading.

The static type of an anonymous method invocation of the form e.(T id){ <statements> } is the return type of the function literal (T id){ <statements> }. The static type of e?.(T id){ <statements> } is S?, where S is the return type of the function literal (T id){ <statements> }.

The static type of an anonymous method invocation of the form e..(T id){ <statements> } is the static type of e.

Dynamic Semantics

Evaluation of an expression of the form e.(T id){ <statements> } proceeds as follows: e is evaluated to an object o. It is a dynamic error unless the dynamic type of o is a subtype of T. Otherwise, (T id){ <statements> }(o) is evaluated to an object o2, and o2 is the result of the evaluation.

Evaluation of an expression of the form e?.(T id){ <statements> } proceeds as follows: e is evaluated to an object o. If o is the null object then the null object is the result of the evaluation, otherwise it is a dynamic error unless the dynamic type of o is a subtype of T. Otherwise, (T id){ <statements> }(o) is evaluated to an object o2, and o2 is the result of the evaluation.

Evaluation of an expression of the form e..(T id){ <statements> } proceeds as follows: e is evaluated to an object o. It is a dynamic error unless the dynamic type of o is a subtype of T. Otherwise, (T id){ <statements> }(o) is evaluated to an object o2, and o is the result of the evaluation.
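The three evaluation rules can be approximated in today's Dart with extension methods. This is only a sketch under stated assumptions: the names withReceiver, withReceiverOrNull, and tap are hypothetical, the receiver is passed as an ordinary parameter (current Dart cannot rebind this), and the dynamic subtype check on T is not modeled because the static type of the receiver already guarantees it here.

```dart
// Hypothetical helpers; not part of the proposal or of any Dart library.
extension AnonymousMethodSketch<T extends Object> on T {
  /// Approximates `e.(T id){ ... }`: the body's result is the result.
  R withReceiver<R>(R Function(T it) body) => body(this);

  /// Approximates `e..(T id){ ... }`: the body's result is ignored and
  /// the receiver itself is the result.
  T tap(void Function(T it) body) {
    body(this);
    return this;
  }
}

extension NullAwareAnonymousMethodSketch<T extends Object> on T? {
  /// Approximates `e?.(T id){ ... }`: a null receiver yields null,
  /// otherwise the body runs on the non-null receiver.
  R? withReceiverOrNull<R>(R Function(T it) body) {
    final self = this;
    if (self == null) return null;
    return body(self);
  }
}

String describe(String? name) =>
    name.withReceiverOrNull((it) => 'length ${it.length}') ?? 'no name';

void main() {
  final length = 'Hello'.withReceiver((it) => it.length);
  final buffer = StringBuffer().tap((it) {
    it.write('a');
    it.write('b');
  });
  print(length); // 5
  print(buffer); // ab
  print(describe(null)); // no name
}
```

The explicit it parameter is what the proposed syntax would let us omit: inside e.{ ... } the calls it.write('a') and it.write('b') would be written as write('a') and write('b').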

Discussion

As mentioned up front, we could use -> rather than . to separate the receiver from an associated anonymous method, which would make this construct similar to an application of the pipe operator (#43).

This might be slightly confusing for the conditional variant (where we would use ? -> rather than ?.) and the cascaded variant (where we might use --> rather than .., and ? --> rather than ?.., if we add null-aware cascades).

It might be useful to have an 'else' block for a conditional anonymous method (which would simply amount to adding something like ('else' <block>)? at the end of the <anonymousMethodSelector> rule), but there is a syntactic conflict here: If we use else then we will have the combination of punctuation and keywords (e?.{ ... } else { ... } ). Alternatively, if we use : then we get a consistent use of punctuation, and we get something which is similar to the conditional operator (b ? e1 : e2), but it will surely surprise some readers to have : as a larger-scale separator (in e?.{ ... } : { ... }, the two blocks may be large).

Note that all other null-aware constructs could also have an 'else' part, specifying what to do in the case where the receiver turns out to be null, such that the expression as a whole does not have to evaluate to the null object.

Note that we could easily omit support for this parameters in function literals, or we could extend the support to even more kinds of functions.

We insist that the receiver type for an anonymous method cannot be dynamic. This is because it would be impractical to let every expression starting with an identifier denote a member access on that receiver:

main() {
  dynamic d = 42;
  d.{ print(this); }; // Oops, `42` does not have a `print` method!
}

However, another trade-off which could be considered is to allow a receiver of type dynamic, but give it the type Object in the body.

@kasperpeulen

kasperpeulen commented Mar 9, 2019

Brilliant, applause from me 👏 .

You could also combine this with the cascade operator and have:

object
  ..{ print(this) }
  .methodOnObject()

I wonder if one liner syntax is possible:

string
  .trimLeft()
  .((it) => it + "!")

@lrhn
Member

lrhn commented Mar 11, 2019

This is in some ways similar to #43 (pipe-operator).

You take the result of an expression and use it as the first argument to a following function, e1 -> function(). Here you can omit the arguments of a function and have it be an implicit (this), which is clever, but in the cases where you do specify the argument, it's less powerful than #43, which can accept existing functions and extra arguments to those (foo->print() vs foo.{print(this)}).

If we extend #43 with e1 -> { block where this is bound to value of e1 }, then I think it would be the best of two worlds.

The feature makes the block work as a function body, so 42.{ return this; } evaluates to 42.
Another option is to make it an inline block in the surrounding function where this is bound to something else. Then 42.{return this;} would return 42 from the surrounding function.
In that case e1.{...} is a statement, not an expression, but we also preserve the property that a separate function body is recognizable by its preceding parameters.
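The "function body" reading corresponds to an immediately invoked function literal in today's Dart. The sketch below (with a hypothetical function name, firstNegativeTimesTen) shows that a return inside such a body only leaves the literal, so the enclosing function keeps running; under the "inline block" reading the same return would leave the enclosing function instead.

```dart
// Hypothetical example function, used only to illustrate return semantics.
int firstNegativeTimesTen(List<int> values) {
  // The immediately invoked function literal stands in for
  // `values.{ ... }` under the "function body" reading.
  final found = (List<int> self) {
    for (final v in self) {
      if (v < 0) return v; // Returns from the literal only.
    }
    return 0;
  }(values);
  // Execution continues here after the literal has returned.
  return found * 10;
}

void main() {
  print(firstNegativeTimesTen([3, -2, 5])); // -20
  print(firstNegativeTimesTen([1, 2])); // 0
}
```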

@eernstg
Member Author

eernstg commented Mar 11, 2019

@tatumizer wrote:

For Iterable - I don't understand it.

The point is that e.{ <statements> } is an abbreviation of e.(T this){ <statements> } where T is the statically known type of e. The extension to allow type arguments is needed in order to make such a method similar to an instance method (because instance methods have access to the type arguments of the class), but we would need to specify the names of those type arguments:

class A<X, Y> {}
class B<Z, W> extends A<W, Z> {} // Note the swap!
A<int, String> foo() => B();

main() {
  var x = foo();
  x.{ ... }; // How would we access Z, W, X, Y here?
}

It would surely be too hard to read (and error prone) if we were to say that X and Y are in scope in the body of the anonymous method, just because those are the declared names of the type parameters in the class A.

It would also be quite confusing if we just specify that we wish to access the type variables under the names X and Y, without saying that it's the type variables for A, not B—we would then be able to swap the binding (such that X is the second type parameter of A and Y is the first one) if we change the return type of foo to B<String, int> (noting that B<String, int> <: A<int, String>, and the body already has this type).

So if we want to allow the anonymous method to have access to the actual type arguments of the receiver it must be at a specified type, and those type parameters will have to be declared explicitly. xs.<X>(Iterable<X> this).{ ... } does just that.

For "xs.forEach" - "this" is too long and confusing IMO.

That's a trade-off, of course. My intention was to insist on the association between implicit member access and the reserved word this, such that developers can rely on foo(42) to mean foo(42) (because foo is declared in an enclosing scope), otherwise it means this.foo(42) (because foo is not in the lexical scope, but it is a member of the type of this), and otherwise it's an error.

It is true that we may now need to think a little more in order to understand what this means, but we'd always be able to find it by means of a scan of enclosing {} blocks: If we're inside an anonymous method then this is the receiver for that, otherwise this is the current instance of the enclosing class.

We could use a different name like it, and we could require that it is mentioned every time, but this would detract from the conciseness of the mechanism:

// Version 5, using an explicit `it` to denote the receiver.

void beginFrame(Duration timeStamp) {
  // ...
  ui.ParagraphBuilder(
    ui.ParagraphStyle(textDirection: ui.TextDirection.ltr),
  ).{
    it.addText('Hello, world.');
    it.build().{
      it.layout(ui.ParagraphConstraints(width: logicalSize.width));
      canvas.drawParagraph(it, ui.Offset(...));
    };
  };
  ui.SceneBuilder().{
    it.pushClipRect(physicalBounds);
    it.addPicture(ui.Offset.zero, picture);
    it.pop();
    ui.window.render(it.build());
  };
}

The use of forEach is different, by the way, because it uses the anonymous method syntax to express a function literal: xs.forEach(.{ print(substring(1)); }) is the same thing as xs.forEach((x) { print(x.substring(1)); }).

The point is that we could choose to introduce this kind of function literal, based on the similarity to anonymous methods, and they would of course be all about the implicit use of this (otherwise we'd just consider various other abbreviated forms of function literal, like _.length as an abbreviation of (x) => x.length).
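For comparison, the present-day form of the forEach example is runnable as is. The sketch below collects the results into a list so they can be inspected; the helper name dropFirstChar is hypothetical.

```dart
// Hypothetical helper: applies the body of the example to each element and
// collects the results instead of printing them.
List<String> dropFirstChar(List<String> xs) {
  final out = <String>[];
  // Today's spelling of the proposed `xs.forEach(.{ ... substring(1) ... })`:
  // the parameter x has to be named and mentioned explicitly.
  xs.forEach((x) {
    out.add(x.substring(1));
  });
  return out;
}

void main() {
  print(dropFirstChar(['one', 'two'])); // [ne, wo]
}
```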

@eernstg
Member Author

eernstg commented Mar 11, 2019

@kasperpeulen wrote:

You could also combine this with the cascade operator

That's the intention (I'm going to work on the 'proposal' part later today), and the same goes for e?.{ ... }.

one liner syntax

That's rather tricky, because we'd need to disambiguate the end of the function body:

myString.(it) => it + "!"

Does that parse as (myString.(it) => it) + "!" or as myString.(it) => (it + "!")?

@eernstg
Member Author

eernstg commented Mar 11, 2019

@tatumizer wrote:

we can now write ..

Adding in the semicolon that we will presumably need (until we settle on something in the area of optional semicolons):

var x = 0.{ // receiver can be anything
  // long computation
  return result;
};

Right, this means that we could use anonymous methods to express "block expressions". I'm not sure it would be recommended in the style guide, but it's an interesting corner that I hadn't thought of. ;-)

@eernstg
Member Author

eernstg commented Mar 11, 2019

@lrhn wrote:

.. extend #43 with e1 -> { block where this is bound to value of e1 }

That's definitely an interesting idea! The pipe operator will surely call for the addition of multiple returns (such that we can do f -> g also when g takes more than one argument), and that might be handled by something like this:

(e1, e2) -> (T x, S this){ ... }

which would allow us to declare names for all the incoming objects (and, presumably, we would still be able to use this for one of them, and get the implicit access to members).

We should be able to do this later even in the case where we set out with a more restricted model (exactly one incoming object).

@yjbanov

yjbanov commented Mar 11, 2019

This looks similar to the argument blocks idea, but with a focus on mutable classes. It relies on the property of the implicit this supporting side effects. While such code exists in Flutter and some other areas (e.g. builder in package:built_value), I think it would be 10x more interesting if it could work with immutable objects. For example, can this be made applicable in build() functions?

I think we should be very careful about introducing conveniences for mutables or other code that deals with side-effects. We do not want to discourage immutability even further.

@eernstg
Member Author

eernstg commented Mar 11, 2019

@tatumizer wrote:

we have nested "this" which is very confusing (javascript is famous for that).

In JavaScript there are a bunch of reasons why the meaning of this is confusing (e.g., see this discussion on StackOverflow).

In Dart we have a well-defined notion of lexical scoping, and the meaning of this is simply the current instance of the enclosing class. If we add support for this with implicit member access in anonymous methods then we would still have a lexical criterion: Search up through enclosing {} blocks and this denotes the innermost one that is a class or an anonymous method.

So, granted, it does make this a bit more involved, but I don't think it's a huge cost.

@eernstg
Member Author

eernstg commented Mar 11, 2019

@tatumizer wrote:

Everything looks consistent here

True, it does add up quite nicely!

@lrhn
Member

lrhn commented Mar 11, 2019

The question here is what the goal of the operation is.

If it is to make implicit this-calls possible, as a way to reduce the syntactic overhead of introducing a variable and writing x. before every invocation, then nothing short of re-binding this will solve it.
I'm not sure that's worth it.

If the goal is to be able to introduce something that otherwise works as a method on an object, locally, then I think it's a red herring. A method added on the side can't do anything that a simple function can't, except for the this. shorthand. Looking like a method isn't itself useful.

Having a number of statements in an expression context is also not new, an immediately applied function literal will do that for you.

What would be new and useful is a way to introduce a statement in expression context without entering a new function body, allowing control flow in and out of the expression. That's probably also a tall order implementation-wise (I'm sure the VM assumes that an expression cannot break, continue, return or yield, and allowing that likely won't be worth the effort).

So, all in all, the only thing really useful I see here is a way to locally rebind this, something like with (something) { assert(this == something); } (to use JS syntax). Making it a selector might be useful, but I fear that having a block embedded in another expression which is really a function body, but which doesn't look like it (no parameters), can hurt readability.

@eernstg
Member Author

eernstg commented Mar 11, 2019

@yjbanov wrote:

This looks similar to the argument blocks idea

I proposed similar things already a couple of years ago and it's definitely possible that these ideas blend in with each other over time. I actually thought that the named argument blocks would be intended to allow for something like passing named arguments (using a sequence of assignment-ish constructs) rather than being regular blocks (containing full-fledged statements), but it's certainly not hard to see the similarities.

We do not want to discourage immutability even further

I want to encourage immutability, and support developers in knowing and enforcing it. We would presumably do this by adding support for value classes, and I hope we'll get around to adding that soon.

However, I don't think my proposal here is particularly strongly tied to mutable state.

It is true that it is concerned with the provision of a scope where statements can be executed, and statements are frequently used to specify computations with side effects. But nothing prevents you from using an anonymous method with a receiver which is immutable. It just means that the available methods on that object won't mutate it. For instance, you might want to use the methods in order to extract various parts of the immutable state of the receiver and pass them on to functions.

can this be made applicable in build() functions?

I would expect this to be an obvious use case for anonymous methods (and they would presumably fit in quite well with the ...Builder objects being used in there, because they will build an object in an imperative style, i.e., based on several method invocations on the builder).


But I guess the main point you are making is that we should have at least similarly good support for the creation of immutable objects as the support that we have for building an object graph of mutable objects (using anonymous methods, or whatever we have).

I think this would mainly be concerned with the ability to pass complex arguments in an argument list that would be used to build a new immutable object, and the ability for developers to read that kind of code.

We do have Flutter's widespread usage of named arguments and a trailing comma based formatting for that, but I do recognize that we might want improvements here as well.

@eernstg
Member Author

eernstg commented Mar 11, 2019

@lrhn wrote:

The question here is what the goal of the operation is.

If e.{ ... } is concise and readable because it combines implicit member access with full-fledged statements in an expression context then I think it's OK to consider the mechanism based on the combination of properties that it actually offers, and I'm not really convinced that we would need to designate a single property as the "goal" of this mechanism.

Of course, we could use (T this){ ... }(e) and then conclude that e.{ ... } is unnecessary, or we could use (T it){ ... }(e) and edit the body to mention it whenever needed, but you could use a similar type of argument to get rid of lots and lots of other language constructs as well.

What would be new and useful is a way to introduce a statement in expression context without entering a new function body,

That would indeed be useful, because it would allow us to use return to return from the enclosing function/method rather than from the anonymous method, and use await relative to the enclosing function/method, etc.

But, as you mention, that might also have some more complex implications for the implementation. The anonymous methods that I proposed here should be rather easy to implement (the feature spec already hints that they could be mostly syntactic sugar).

@lrhn
Member

lrhn commented Mar 12, 2019

Indeed my rule for whether a specialized construct is worth it when a general construct already enables the functionality is that:

  • The specialized functionality is significantly "better" (e.g., more readable, concise, and less error prone).
  • The situation where the specialized functionality applies occurs "often enough".
  • It's a good language construct that fits in the language.

This comparison is also applied against potential features, not just existing ones. If we can add another good feature which is more general, and makes the improvement of this feature insignificant, then we should probably do that instead.

For the foo.{ this bound here } construct, my main objections are that I don't think it happens often enough (mainly because we already have cascades, which handle the simple cases) and that it's not readable enough (because it's not clear that this is a separate function body with its own returns).

Going to foo.<T>(T x) { .... } makes it a non-significant improvement over <T>(T x) {...}(foo). The chaining is nice, but the pipe operator is more generally useful and would allow foo->(<T>(T x){...})() if you really need it.

Making the nested block look like a method does not give any significant advantage, it's the localized change of this which is the main feature, because it allows implicit this calls to be shorter than explicit it.something() calls. All other differences are not significant improvements, or are better handled by a pipe operator (and so we should introduce that instead, I wouldn't want to have both).

Since the "method" has access to local variables in a different function, it's not really a method anyway, it's a different syntax for a function literal (including the inference of return type) which is immediately invoked, and with the option of naming the parameter this.

Another option is to allow any function to denote one of its positional parameters as this, not just this construct. That would be more generally useful, a static helper function could meaningfully treat one argument as the main operand.
Then, with a pipe operator, you can do foo->(this){addSomething();addSomethingElse();}(). Not as clean, but still usable, and again the improvement of the proposed feature over this is not significant enough.

@eernstg
Member Author

eernstg commented Mar 13, 2019

I've explored the syntactic implications of allowing Kotlin-ish abbreviated function literals (like { print(it); } where we get an implicitly declared parameter list of the form (it), and type inference may provide the parameter type): #265. Looks like we can easily do that.

@eernstg
Member Author

eernstg commented Mar 13, 2019

@lrhn wrote:

Indeed my rule for whether a specialized construct is worth it when a general construct already enables the functionality is that:

  • The specialized functionality is significantly "better" (e.g., more readable, concise, and less error prone).
  • The situation where the specialized functionality applies occurs "often enough".
  • It's a good language construct that fits in the language.

I think it's useful to allow for the syntactic reorganization whereby a complex expression (say, a constructor invocation with many arguments) can be pulled up in front of a portion of code where the resulting object can be accessed by name. Anonymous methods will do that, and so will abbreviated function literals (#265) along with the pipe operator.

I also think it's useful to let such a locally-prominent object be accessed implicitly by binding this in a scope and adding this to expressions in the usual manner. I think it's useful for comprehension and readability and we can use the concept of a "method" to indicate that this has a binding and is used implicitly in the same way as it is in an instance method. This matches up with the properties of anonymous methods. For the abbreviated function literal + pipe operator combination, I couldn't immediately see how we would make the choice to bind this or not, and how we would justify it conceptually.

@lrhn, do you have a good solution for that? And how about cascades of anonymous methods, do we need a cascade pipe? I'm not totally convinced that the following two forms will be considered equally convenient:

void foo() { ... } // Not an instance method.

main() {
  // 1.
  var x1 = SomeClass(); // with `add..` methods.
  x1 -> (this){ addSomething(); foo(); addSomethingElse(); }();

  // 2.
  var x2 = SomeClass()..{ addSomething(); foo(); addSomethingElse(); };
}

I suspect that the situations where we will want to work on an anonymous object will arise "often enough", taking the clue that Kotlin uses exactly this combination (binding of this in a function whose type is "with receiver") to achieve that highly visible feature that they call type safe builders.

This is the primary reason why I chose the Flutter example where some non-trivial object building is being done. Of course, there would be other cases as well, but it seems obvious that we might want to avoid creating a lot of names for objects that we are just touching briefly, because they are being configured and then added to an object graph that we are currently building.

Finally, I tend to think that anonymous methods do fit into the language. YMMV. ;-)

Of course, I already proposed to make the provision of a binding of this available elsewhere in the language, that just falls out. You might want to be able to use a different name than this for the variable that supports implicit member access (but I suspect that this would be bad for readability, so I did not propose that).

I don't actually see much of a problem in defining e.{...} to mean ((T this) {...}(e)) or (e -> (T this) {...}) (note that we do need the parentheses). Even though that is again a very thin layer of syntactic sugar, we do have other pieces of sugar that are similarly thin, and I think the anonymous method form may actually be convenient in practice.

Since the "method" has access to local variables in a different function, it's not really a method anyway

I don't really buy that. The point is that implicit access to members of the type of this can be understood in terms of being an instance method, and lexically enclosing declarations can be seen on screen. I think this combination will make it relatively easy for developers to get a good intuition about how it works.

In contrast, I suspect that the ability to introduce implicit member access for an arbitrary parameter of an arbitrary function would make the code considerably harder to read, because there is no simple mental model that developers can rely on, like saying "this is like a method on object o".

So we should probably not be extremely permissive when we decide how to support binding of this, but I do think that we should exploit the concept of "a method" to make this mechanism comprehensible.

@eernstg eernstg added breaking-change feature Proposed language feature that solves one or more problems and removed breaking-change labels Mar 13, 2019
@eernstg
Member Author

eernstg commented Mar 14, 2019

In #267, I've explored the foundations of the binding of this and the associated implicit member access: It is of course just an application of a more general underlying mechanism which adds names to a given name space. If we wish to be very orthogonal then we should introduce that more general mechanism (local imports), and specify the semantics of anonymous methods in terms of a desugaring which uses local imports.

@eernstg
Member Author

eernstg commented Dec 27, 2019

@tatumizer wrote:

re-interpret "anonymous methods" as "anonymous extension methods"

Interesting idea, thanks! I wouldn't necessarily want to prevent receivers of type dynamic, though. I want to empower developers who want to avoid dynamic operations to do so, but I don't want to gratuitously worsen the support for writing code where dynamic operations are used because that's the appropriate solution in a given software development situation.

@eernstg
Member Author

eernstg commented Jan 4, 2020

@tatumizer wrote:

the ceremony of casting dynamicVar as Object? with the goal of applying
the extension method doesn't seem well-justified

Right. @lrhn already described the conceptual reason why we decided to make implicit extension method invocations on a dynamic receiver impossible: Instance methods should dominate extension methods, and the dynamic receiver is assumed to have all instance members.

It's tempting to ask why all those receivers are dynamic in the first place, but I suspect that it would not work well enough to change all type annotations to avoid dynamic (so it would be Map<String, Object?> rather than Map<String, dynamic> in this context). One obvious conflict is that we would need an extension method for operator [], applicable to Object?, and that would pollute every piece of code where the relevant extensions are in scope ("everything" would have an operator [], but we may only want to add it to near-typeless data structures used to model json values).

But that's basically because this is a job for 'extension types' (#42), that I prefer to call 'views' or 'view types' because they are concerned with viewing a specific loosely typed data structure as having a strictly specified typing structure, and enforcing that it is used accordingly. There is no run-time representation of the view, so it's a zero-cost abstraction, not unlike a Haskell newtype.

The idea is that a loosely typed data structure with root v (say, modeling json values, consisting of objects of type Map<String, Object?>, String, num, int, bool, or Null) is used according to a specified view type. So v is the value of a variable whose type is a view type V, and this means that it is only allowed to access v using the methods declared in V. V could have a getter foo returning num and a getter bar returning V2 which is another view type. As long as v is a data structure which is actually structured as described in the view types the invocations of view methods are safe, but if v has some other structure (say, if it maps foo to true which is not of type num) then we get a dynamic type error.
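As a rough illustration of the view idea, the sketch below uses the extension type declarations that shipped in Dart 3.3. The names Outer, Inner, and fooIsWellTyped are hypothetical and only mirror the V, V2, foo, and bar of the description above. It assumes the views are implemented with plain casts, which is one possible way to get the dynamic type error mentioned.

```dart
// Requires Dart 3.3 or later (extension types).
// `Inner` corresponds to the view type V2 in the text.
extension type Inner(Map<String, Object?> _json) {
  String get name => _json['name'] as String;
}

// `Outer` corresponds to the view type V in the text.
extension type Outer(Map<String, Object?> _json) {
  num get foo => _json['foo'] as num;
  Inner get bar => Inner(_json['bar'] as Map<String, Object?>);
}

// Returns false if the underlying data does not match the view.
bool fooIsWellTyped(Outer v) {
  try {
    v.foo;
    return true;
  } on TypeError {
    return false;
  }
}

void main() {
  final good = Outer({'foo': 42, 'bar': {'name': 'inner'}});
  print(good.foo); // 42
  print(good.bar.name); // inner
  print(fooIsWellTyped(good)); // true
  print(fooIsWellTyped(Outer({'foo': true}))); // false
}
```

The view has no run-time representation of its own: good is just the map, and the getters are resolved statically.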

In summary, I think it's more promising to address this whole area by means of view types, and then the special case of invoking static extension methods on receivers of type dynamic can be ignored: It seems likely that there is no good and consistent ruleset which allows extension methods to be invoked on receivers of type dynamic, and even if we were to allow such things it would probably be an incomplete and inconvenient solution.

@eernstg
Member Author

eernstg commented Jul 28, 2023

Here is an example showing that anonymous methods allow us to use a style which is rather similar to the one used in Kotlin with "UI as code".

In the example, we're building a tree using anonymous methods. A class instance extension member is used in order to allow the addition of a tree node to its parent in a concise way, again similar to the way it's done in Kotlin (we're using operator ~ where Kotlin uses the unary + operator, because Dart doesn't have unary +):

class Tree {
  final String value;
  final List<Tree> children = [];

  Tree(this.value);

  String toString() =>
      '$value(${[for (var c in children) c.toString()].join(', ')})';
  void (Tree t).operator ~() => children.add(t);
}

Tree build({required bool third}) {
  return Tree('n1').{
    ~Tree('n11');
    (~Tree('n12')).{
      ~Tree('n121');
      ~Tree('n122');
    };
    if (third) ~Tree('n13');
  };
}

void main() {
  print(build(third: true)); // 'n1(n11(), n12(n121(), n122()), n13())'.
}

The main point is that we're building a tree simply by "mentioning" the subtrees (with the ~ in front) in a {...} block which is associated with the parent node (and it is an anonymous method, of course).

This is possible because the ~ invokes the code that adds the syntactic receiver (in ~e, the syntactic receiver is e) to the list of children of the this of the anonymous method, which is the parent.

The parentheses around ~Tree('n12') are not so convenient, but we could give the anonymous method construct a lower precedence in the grammar (presumably they would be just below unaryExpression). This would allow prefix operators on the receiver without the extra parentheses, it would allow await e.{/*code working on the result, not the future*/}, and several other things.

We could also use a getter. This time we'll return the syntactic receiver, just to see how it looks:

class Tree {
  ...
  Tree get (Tree t).add {
    children.add(t);
    return t;
  }
}

Tree build({required bool third}) {
  return Tree('n1').{
    Tree('n11').add;
    Tree('n12').add.{
      Tree('n121').add;
      Tree('n122').add;
    };
    if (third) Tree('n13').add;
  };
}

We could also avoid the class instance extension members entirely, and use a regular add method. However, this approach introduces some parentheses, and some of them would contain a potentially rather large amount of code:

class Tree {
  ...
  void add(Tree t) => children.add(t);
}

Tree build({required bool third}) {
  return Tree('n1').{
    add(Tree('n11'));
    add(Tree('n12')..{
      add(Tree('n121'));
      add(Tree('n122'));
    }); // This `)` is far away from its `(`.
    if (third) add(Tree('n13'));
  };
}

Note that we use Tree('n12')..{/*code*/} for the anonymous method on the tree named 'n12', because we need to add the node itself to the parent. We could also have used Tree('n12').{ ... return this; }, but a cascade is the standard way to hold on to the receiver in Dart, so we use that. Finally, we could have used (add(Tree('n12'))).{} if add had returned its argument, but we might well need to return something else from add for other reasons.
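For comparison, here is a sketch of how close current Dart gets with a regular add method and ordinary cascades. It is not part of the proposal. It shows that the conditional subtree cannot be written inside the cascade, so the root needs a local name.

```dart
class Tree {
  final String value;
  final List<Tree> children = [];

  Tree(this.value);

  void add(Tree t) => children.add(t);

  @override
  String toString() => '$value(${children.join(', ')})';
}

Tree build({required bool third}) {
  // Cascades keep hold of the receiver, so nested nodes can be added inline.
  final root = Tree('n1')
    ..add(Tree('n11'))
    ..add(Tree('n12')
      ..add(Tree('n121'))
      ..add(Tree('n122')));
  // A statement such as `if` cannot appear in a cascade section.
  if (third) root.add(Tree('n13'));
  return root;
}

void main() {
  print(build(third: true)); // n1(n11(), n12(n121(), n122()), n13())
}
```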

There are many options, but the use of a prefix operator seems to be rather pragmatic, especially if the grammar is adjusted such that anonymous methods have a lower precedence (just below unary operators, presumably).

@eernstg
Member Author

eernstg commented Jan 9, 2025

Here's an example that came up recently:

We may want to give a name to the result of a computation such that we can use it more than once, but we want to express this in a concise and self-contained manner (in particular, it's assumed that we don't need to use that result in the following lines of code). Here is the original example:

// Version 1.

if (await nativeDriver.sdkVersion case var version when version < 23) {
  fail('Requires SDK >= 23, got $version');
}

In this example, an if-case statement is used to introduce the name version for the value of await nativeDriver.sdkVersion, and it is then used twice: In the when clause and in the body. In general, we might want to use it any number of times, using different control structures or other language features, but this is a particularly perfect match because we can perform a test in the when clause, and then we can perform the computation in the body. Just consider the variant where we want an 'else' branch as well:

// Version 1.1.

if (await nativeDriver.sdkVersion case var version when version < 23) {
  fail('Requires SDK >= 23, got $version');
} else {
  print('This is an even SDK: ${version.isEven}.'); // Oops, `version` is not in scope here!
}

You could say that it is just a more complex way to do the following:

// Version 2.

var version = await nativeDriver.sdkVersion;
if (version < 23) {
  fail('Requires SDK >= 23, got $version');
}

This is in a sense more general because we can write arbitrary code using version in the lines following the declaration of version. For example:

// Version 2.1

var version = await nativeDriver.sdkVersion;
if (version < 23) {
  fail('Requires SDK >= 23, got $version');
} else {
  print('This is an even SDK: ${version.isEven}.'); // OK!
}

However, this introduces a slightly higher cost on the enclosing body of code because it introduces a new name whose scope isn't immediately visible. So we can't know for sure that we do not need to remember everything about version when we continue to read the following lines of code. We could enclose the whole thing in a block, but that's two more lines, and indentation, so that probably won't happen.

If we're doing the same thing using an anonymous method then the value of version in the previous examples will be available as this in the body of the anonymous method:

// Version 3.

await nativeDriver.sdkVersion..{
  if (this < 23) {
    fail('Requires SDK >= 23, got $this');
  }
};

// Version 3.1

await nativeDriver.sdkVersion..{
  if (this < 23) {
    fail('Requires SDK >= 23, got $this');
  } else {
    print('This is an even SDK: $isEven.'); // `this` can be omitted as usual.
  }
};

This form is more powerful than versions 1 and 1.1 because it's an expression rather than a statement. This means that we can use it in contexts where an expression is allowed (that is, "everywhere"). It also means that we can provide results to the context. With the form shown above, the result is the value of await nativeDriver.sdkVersion because it's a cascade, but we could also use (await nativeDriver.sdkVersion).{ ... return someOtherResult; ... } to deliver some other result to the context. Also, versions 3 and 3.1 do not pollute the local namespace with a new name, as opposed to versions 2 and 2.1.

Finally, it should be noted that the anonymous method in version 3.1 can be compiled into code that works exactly like version 2.1, which means that there is no added run-time cost:

// Good compilation strategy for version 3.1; `freshName` is generated by the compiler.

{
  var freshName = await nativeDriver.sdkVersion;
  if (freshName < 23) {
    fail('Requires SDK >= 23, got $freshName');
  } else {
    print('This is an even SDK: ${freshName.isEven}.');
  }
}

The general compilation strategy is that e.{ S } is replaced by ((T freshName) { S1 })(e) where T is the static type of e and S1 is like the statement list S except that implicit usages of this are made explicit, and then every occurrence of this is replaced by freshName, but the translation in the style of version 2.1 yields a better run-time performance, and it can be used whenever it is semantics preserving.
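The general strategy can be tried out in current Dart. The sketch below applies it by hand to a variant of version 3.1 that returns a message instead of calling fail or print. FakeDriver and check are hypothetical stand-ins, since the real nativeDriver is not shown in this thread.

```dart
// Hypothetical stand-in for the `nativeDriver` used in the examples above.
class FakeDriver {
  final int _version;
  FakeDriver(this._version);
  Future<int> get sdkVersion async => _version;
}

// Hand-written desugaring of `(await nativeDriver.sdkVersion).{ ... }`:
// T is `int`, and the implicit `isEven` becomes `freshName.isEven`.
Future<String> check(FakeDriver nativeDriver) async {
  return ((int freshName) {
    if (freshName < 23) {
      return 'Requires SDK >= 23, got $freshName';
    } else {
      return 'This is an even SDK: ${freshName.isEven}.';
    }
  })(await nativeDriver.sdkVersion);
}

Future<void> main() async {
  print(await check(FakeDriver(21))); // Requires SDK >= 23, got 21
  print(await check(FakeDriver(24))); // This is an even SDK: true.
}
```

A compiler would be free to inline the function literal into a block with a fresh variable, as in the 2.1-style translation, whenever that preserves the semantics.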

@Levi-Lesches

Levi-Lesches commented Jan 9, 2025

We could enclose the whole thing in a block, but that's two more lines, and indentation, so that probably won't happen.

I couldn't help but notice that version 3.1 is one more line, and indentation ;)

I would actually favor version 2.1 in general:

  • it's flatter
  • it uses a plain old variable
  • it's a relatively short line with no new/special grammar

I'm not sure introducing -- and having to read -- a new syntax construct is actually an improvement over introducing a new local variable. In other words, the cognitive overhead of "I am in a new scope now, indicated by this special grammar I would never use elsewhere" is more demanding (to me) than "just another local variable added to the list". When writing functions I tend to find myself defining many local variables as they're sort of "throw-away" values after you're done reading the function anyway.

Besides, if you need the variable in scope for the "if", then suddenly find yourself needing it in the "else"... it's not too far-fetched to imagine you might need it after the else, too. If one really wants to give that value its own scope, I would suggest using a block, defining a new function/method, or creating an anonymous function.

@eernstg
Member Author

eernstg commented Jan 10, 2025

I couldn't help but notice that version 3.1 is one more line, and indentation ;)

😁 Granted, 3.1 has indentation, but here's version 2.1 with the block that protects the subsequent code from name pollution, which makes it 8 lines (one more than 3.1):

// Version 2.1 in block.

{
  var version = await nativeDriver.sdkVersion;
  if (version < 23) {
    fail('Requires SDK >= 23, got $version');
  } else {
    print('This is an even SDK: ${version.isEven}.'); // OK!
  }
}

I think name pollution is easily overlooked as a source of complexity (just like non-final variables that are effectively final ;-). The point is that the code is easier to read (and write, because that's interleaved with reading) if it has a smaller amount of complexity, but that's a cost which is sneaking up on us as we're thinking about other things. We may never notice the difference, but I'm convinced that the overall effectiveness of development work can be enhanced by avoiding accidental complexity everywhere. We may not even be aware of the reason why the code is easier to work with, we just enjoy the fact that there are fewer moving parts, and it's clearer what is going on.

That's a matter of having good anti-complexity habits, and using confined constructs rather than free floating declared names is one such habit.

Another matter is that the anonymous method constructs are expressions, which means that they can encapsulate computations in a lot of situations where you can't write a statement. For example:

// Using current Dart.

abstract class Cons {
  final int i;
  Cons get next;
  const Cons._(this.i);
  factory Cons(int i, Cons next) = _ConsRegular;
}

class _ConsRegular extends Cons {
  final Cons next;
  const _ConsRegular(super.i, this.next) : super._();
}

class ConsLate extends Cons {
  late final Cons next;
  ConsLate(super.i) : super._();
}

void main() {
  // Create a circular list.
  var myCircle = ConsLate(0);
  Cons cons = myCircle;
  for (var i = 9; i > 0; --i) {
    cons = Cons(i, cons);
  }
  myCircle.next = cons;

  // Use it.
  int counter = 0;
  Cons current = myCircle;
  while (++counter < 15) {
    print(current.i);
    current = current.next;
  }
}

This works, but it adds some complexity to the body of main because it may not be obvious where and how the variables myCircle and cons are intended to be used. Now use an anonymous method:

// Current Dart, again.

void main() {
  var myCircle = ConsLate(0);
  Cons cons = myCircle;
  for (var i = 9; i > 0; --i) {
    cons = Cons(i, cons);
  }
  myCircle.next = cons;
  ...
}

// Using an anonymous method.

void main() {
  Cons myCircle = ConsLate(0)..{
    Cons cons = this;
    for (var i = 9; i > 0; --i) {
      cons = Cons(i, cons);
    }
    next = cons;
  };
  ...
}

This will encapsulate the variable cons (clearly, we don't need to think about that one after the end of the anonymous method), but it also encapsulates the type of myCircle in the sense that we are handling all the ConsLate related stuff inside the anonymous method, as part of the initializing expression for myCircle, and then we can let myCircle have the simpler type Cons, which is all we need during the rest of this function body. Of course, we can also just use var and preserve the inferred type ConsLate, if that's the preferred style.

@mateusfccp
Contributor

mateusfccp commented Jan 10, 2025

I'm also not sold on the benefits of this. It is something that I would definitely use if we had it, but I don't think its value is greater than the cost to implement it and the syntax overload.

@ghost

ghost commented Jan 10, 2025

The same can be expressed today by replacing .. with ;:

void main() {
  Cons myCircle = ConsLate(0); {
    Cons cons = myCircle;
    for (var i = 9; i > 0; --i) {
      cons = Cons(i, cons);
    }
    next = cons;
  }
  ...
}

😄

@TekExplorer

I would like to point out that you can't actually set the late next since you never interact with the type that has the one-time setter.

Frankly... Why wouldn't you just make a function?

That example is also something of a code smell - I don't like how you're effectively mutating what is supposed to be a final late value "freely" outside of the class.

It feels... Brittle.

Tldr: that's not a good example.

@Levi-Lesches

Levi-Lesches commented Jan 10, 2025

I agree that this would be better off as a function or method with a real, meaningful name as well. If it's a performance thing, I'd understand, but I would never write code like this unless I had measurable proof that using a plain method is harming my performance.

Cons initialize10Cons() {
  final result = ConsLate(0);
  Cons cons = result; // Typed as Cons: the loop assigns Cons values, which are not ConsLate.
  for (var i = 9; i > 0; --i) {
    cons = Cons(i, cons);
  }
  result.next = cons;
  return result;
}

void main() {
  final myCircle = initialize10Cons();
  // ...
}

By moving the code away into a separate function/method, we get the same benefits: cons is not in scope, myCircle has type Cons and not ConsLate, and even better, the logic to initialize the cons is outside of main, since the implementation details are not relevant to main.

@ghost

ghost commented Jan 12, 2025

Upon reflection, the idea may become more useful after a bit of generalization.
Let's define a general block-expression in the form .{ code }.
We can write

var x = .{
   //...
   if (something) return somethingElse;
};

This is the basis for generalization. The syntax with the leading dot points to the possibility of adding the forms with the receiver:

var x = expression.{code};
var x = expression..{code};
var x = expression?.{code}; // see issue #131
var x = expression?.{code}?.etc;
var x = expression?..{code};

(ref. #131)

There's an option to treat the expression .{code} with no explicit receiver as equivalent to this.{code}; then this inside the code will still mean the same this as outside. In the (rare) cases when this doesn't exist, e.g. in the global context, the reference to this inside the code is an error. We may also use the explicit syntax _.{code} to indicate the absence of the receiver.

@TekExplorer

TekExplorer commented Jan 13, 2025

That makes me think of:

extension<T> on T {
  R map<R>(R Function(T) cb) => cb(this);
}

I do think that there is value to the proposed syntax

It's actually the raw block expression that I kinda like more. It's much better than self-calling closures.

I sometimes end up making sort of micro functions or extensions only ever used in one place, or otherwise making code messier, so it could be worth considering indeed.

Though since we can make that extension, I'm not sure how valuable it actually is.
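For reference, a sketch of how such an extension could be applied to the SDK version example from earlier in the thread. The member is named let here (a hypothetical name) rather than map, because an instance member such as Iterable.map would take priority over an extension member with the same name. checkSdk is also hypothetical.

```dart
extension Let<T> on T {
  // Passes the receiver to `cb` and returns whatever `cb` returns.
  R let<R>(R Function(T) cb) => cb(this);
}

String checkSdk(int sdkVersion) {
  return sdkVersion.let((version) {
    if (version < 23) return 'Requires SDK >= 23, got $version';
    return 'This is an even SDK: ${version.isEven}.';
  });
}

void main() {
  print(checkSdk(21)); // Requires SDK >= 23, got 21
  print(checkSdk(24)); // This is an even SDK: true.
}
```

The difference from the proposal is that the receiver must be named (version) and its members are not implicitly in scope.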

@ghost

ghost commented Jan 13, 2025

@TekExplorer: our earlier discussions of block expressions hit the wall because no one could suggest the syntax of returning a value. The opinion (which I don't share, but it's immaterial) was that "return x" in a block expression signifies "return from the containing function" (as opposed to the expression itself). If so, "return a value from block expression" would need a different syntax, and no one has any idea what this syntax could be. This is an intractable problem that goes away as soon as we reclassify "block expression" as "anonymous method". Then, being a method, it obviously returns via "return", as is typical for any method.

@TekExplorer

An interesting conundrum. We could go with an expression without a semicolon, like Rust does - though I'm not especially a fan of implicit returns, so I'm not sure.

We could also look to see if there's a more closure-like syntax so that return makes more sense. Not sure.
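As a point of reference for the return question, current Dart already gives return a closure-local meaning in an immediately invoked function literal. The sketch below uses a hypothetical classify function: the return statements inside the literal end only the literal, and execution continues in the enclosing function.

```dart
int classify(int n) {
  // Each `return` here leaves the function literal, not `classify`.
  final sign = (() {
    if (n < 0) return -1;
    if (n > 0) return 1;
    return 0;
  })();
  // Still running inside `classify` after the literal has returned.
  return sign * 10;
}

void main() {
  print(classify(-5)); // -10
  print(classify(0)); // 0
  print(classify(3)); // 10
}
```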

@eernstg
Member Author

eernstg commented Jan 17, 2025

@tatumizer wrote:

The same can be expressed today by replacing .. with ;:

It does look similar. ;-)

But your code is not an expression, it's a local variable declaration followed by a block. So it will only fit into a statement list, not into all those other positions where you can have an expression.

Also, the type of the target object is Cons in the block, which means that you can't access the setter next (only ConsLate has that member).

Finally, you can't access members of the target object myCircle inside the block implicitly, so you'd have to use myCircle.next = cons, except that this also won't work because myCircle has type Cons.

So it's not exactly the same.

@TekExplorer wrote:

That example is also something of a code smell - I don't like how you're effectively mutating what is supposed to be a final late value "freely" outside of the class.

Good point!

However, we're building a cyclic structure and this means that there's no way we can build the entire cycle using final variables. So one of the links is a late final variable. It needs to be initialized somewhere.

It could be initialized in a constructor of ConsLate (otherwise it would certainly also be done "freely"), but it is not obvious that you'd want to have as many constructors as you'll have object graphs. You could use a callback in order to enable construction of structures that weren't foreseen by the authors of the ConsLate constructors:

abstract class Cons {
  final int i;
  Cons get next;
  const Cons._(this.i);
  factory Cons(int i, Cons next) = _ConsRegular;
}

class _ConsRegular extends Cons {
  final Cons next;
  const _ConsRegular(super.i, this.next) : super._();
}

class ConsLate extends Cons {
  late final Cons next;
  ConsLate(super.i, Cons builder(ConsLate self)) : super._() {
    next = builder(this);
  }
}

void main() {
  // Create a circular list.
  var myCircle = ConsLate(0, (self) {
    Cons cons = self;
    for (var i = 9; i > 0; --i) {
      cons = Cons(i, cons);
    }
    return cons;
  });

  // Use it.
  int counter = 0;
  Cons current = myCircle;
  while (++counter < 15) {
    print(current.i);
    current = current.next;
  }
}

We did manage to move the assignment to next in the ConsLate into the constructor, but I think it's going to be hard to write constructors which are sufficiently flexible to allow for this kind of callback based construction. The set of cyclic graphs is quite large, and you'd need to have multiple instances of ConsLate if you're going to build a graph that has more than one cycle.

@tatumizer wrote:

the idea may become more useful after a bit of generalization

It would be possible to use an anonymous method as a block expression: If you want to execute the statement S and return the expression e at a location where an expression is expected then you can use this.{ S; return e; }, and .{ S; return e; } could certainly have the same meaning.

However, we could also consider using .{ code } as a static access shorthand: It could mean T.{ code } where T is the declaration which is denoted by the given context type schema, which would allow the body to use static members of T by name without the need to have T every time this is done.

@TekExplorer wrote:

We could go with an expression without a semicolon, like Rust does

It would probably have to be the very last term in the block (and every other return must be explicit), but that is a quite attractive idea.

@ghost

ghost commented Jan 18, 2025

Another (maybe better) option is not to define .{code} at all, and stick instead to the syntax x.{code} with the explicit receiver.
The point is that you can almost always find a good candidate for the receiver anyway, e.g. if the code naturally revolves around some x, use it as the receiver. In the rare case when such a thing doesn't exist, use _.{code}, which is equivalent to this.{code}: because we don't have an override for this, the old "this" remains in effect.

The main attraction of extension methods is that they make it explicit that we are not talking about "block-expressions" (the latter would indeed require the support of return from the outer function, break from an outer labelled loop, new (confusing) second color of return and other complications).
