Consider running macros in rounds instead of phases #3858

davidmorgan · 2024-05-31T08:30:34Z

Investigation for #3706 suggests an alternative way to structure how macros run that we should consider.

The problem is how macro output affects other macros: what can macros see, and when?

I do not yet know the answer :) ... it is hard to make progress on this problem because in most realistic cases it doesn't matter: macros are often naturally independent, and where they overlap the user will probably choose one or the other. So it is hard to motivate an answer.

Roughly, the phases approach focuses on either making macro applications independent or establishing an ordering between them, so that it is well-defined what each macro can see. In the declarations phase this includes building an ordering at runtime when one macro introspects on the output of another, and throwing if there is a cycle.
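As a rough illustration of "build an ordering, throw on a cycle", here is a minimal Python sketch. It is not the macro implementation: it assumes the "introspects on" relation is already known as a plain mapping (in the real declarations phase it is discovered at runtime), and `order_applications` and `FieldsFromConstructorMacro` are hypothetical names.

```python
from graphlib import CycleError, TopologicalSorter


def order_applications(introspects_on):
    """Return a run order in which each macro application comes after every
    application whose output it introspects on.

    `introspects_on` maps an application name to the set of names it depends
    on (an assumed, simplified representation). Raises ValueError on a cycle.
    """
    try:
        return list(TopologicalSorter(introspects_on).static_order())
    except CycleError as error:
        # graphlib reports the offending cycle as the second exception argument.
        cycle = " -> ".join(error.args[1])
        raise ValueError(f"macro applications form a cycle: {cycle}") from error


# ToStringMacro needs to see the fields that the other macro generates.
order = order_applications({
    "ToStringMacro": {"FieldsFromConstructorMacro"},
    "FieldsFromConstructorMacro": set(),
})
print(order)
```

With an ordering like this, each application has a well-defined view of the world; the cost is that two applications that introspect on each other cannot run at all.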

The alternative, that I'll call rounds, is to have no ordering between macros. (I believe "rounds" is the term the Java annotation processor uses, and there may be similarities, but I haven't looked at the Java details.)

Instead of an ordering: all macros see the same world state in round 1, then produce augmentations; then if needed they rerun (and any newly triggered macros run) against the round 2 state, and produce more/different augmentations; and so on, until no changes to augmentations are produced. Some rule is needed to force termination if it looks like no progress is being made.
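The loop described here can be sketched as a fixed-point iteration. This is a simplified Python model under stated assumptions, not the exploratory code: world state is modelled as a set of declaration names, each macro is a plain function from that state to the declarations it wants to add, and the termination rule is a hypothetical fixed cap of `max_rounds`.

```python
def run_rounds(source, macros, max_rounds=10):
    """Run all macros against the same state each round until their
    augmentations stop changing.

    `source` is the set of user-written declaration names; `macros` maps a
    macro name to a function from the visible state to a set of names.
    Returns (augmentations, rounds_used).
    """
    augmentations = {name: frozenset() for name in macros}
    for round_number in range(1, max_rounds + 1):
        # Every macro sees the identical state: source plus last round's output.
        state = frozenset(source).union(*augmentations.values())
        produced = {name: frozenset(run(state)) for name, run in macros.items()}
        if produced == augmentations:
            return augmentations, round_number  # nothing changed: done
        augmentations = produced
    # Assumed termination rule: give up after a fixed number of rounds.
    raise RuntimeError(f"no fixed point after {max_rounds} rounds")


def fields_macro(state):
    # Adds fields once it sees the (user-written) constructor.
    return {"Point.x", "Point.y"} if "Point()" in state else set()


def to_string_macro(state):
    # Generates a toString that mentions whichever fields are visible.
    fields = sorted(d for d in state if d in ("Point.x", "Point.y"))
    return {"Point.toString(" + ", ".join(fields) + ")"}


result, rounds_used = run_rounds(
    {"Point()"}, {"fields": fields_macro, "toString": to_string_macro})
print(rounds_used, sorted(result["toString"]))
```

In this model the `toString` macro produces an incomplete result in round 1, corrects it in round 2 once the generated fields are visible, and round 3 confirms that nothing changes. A macro whose output flips back and forth between rounds never reaches a fixed point and hits the cap.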

Is this any better? As I said, I don't know :) but we now have an implementation that we can do experiments on to compare. I suggest we try to come up with motivating examples in order to make progress. We can also compare performance and how much work the macro author has to do.

@jakemac53 @scheglov @johnniwinther

rrousselGit · 2024-05-31T10:00:54Z

On that topic, why do we need phases to begin with? Any reason why a macro can't output everything it wants in a single phase, and the next one picks it up based on order?

jakemac53 · 2024-05-31T14:35:09Z

Rounds are certainly another answer. Originally we chose not to take that approach because it potentially means re-running macros within the same library often, and you also have to deal with infinite loops. Macros will also more often be running in an invalid/incomplete state, which means we have to specify exactly how that works as opposed to letting the tools decide what is best (CFE can just throw if you attempt to resolve a type and it can't find it).

For the data model approach, rounds might make more sense, I am not sure.

> On that topic, why do we need phases to begin with? Any reason why a macro can't output everything it wants in a single phase, and the next one picks it up based on order?

The phase approach allows macros to be composed together better, without users having to carefully order macros.

Consider a macro which generates fields based on a constructor signature, and then other macros that need to know all the fields of a class in order to generate the body of some method (maybe toString, toJson, fromJson, etc). Phases allow that in a totally safe and coherent way (as long as you don't need to see the fields to generate your API; there is still a tricky edge there where ordering matters).
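A toy Python model of that composition, as a sketch only: it collapses the phases to two, lets declaration macros see just the user-written fields, and uses hypothetical helper names (`run_in_phases`, `fields_from_constructor`). It does not reproduce the real macro API.

```python
def run_in_phases(user_fields, declaration_macros, definition_macros):
    """Two-phase model: fields are added first, method bodies are built after."""
    # Declarations phase: each macro sees only the user-written fields, so no
    # declaration macro depends on another declaration macro's output.
    generated = []
    for macro in declaration_macros:
        generated.extend(macro(tuple(user_fields)))
    all_fields = tuple(user_fields) + tuple(generated)
    # Definitions phase: every macro sees the complete, final field list.
    return {name: macro(all_fields) for name, macro in definition_macros.items()}


def fields_from_constructor(params):
    # Hypothetical macro: one field per constructor parameter not yet declared.
    return lambda existing: [p for p in params if p not in existing]


def to_string(fields):
    return "Point(" + ", ".join(f"{f}: ${f}" for f in fields) + ")"


def to_json(fields):
    return "{" + ", ".join(f"'{f}': {f}" for f in fields) + "}"


bodies = run_in_phases(
    user_fields=[],
    declaration_macros=[fields_from_constructor(["x", "y"])],
    definition_macros={"toString": to_string, "toJson": to_json},
)
print(bodies["toString"])
print(bodies["toJson"])
```

Because the field list is frozen before any body is generated, `toString` and `toJson` both see `x` and `y` regardless of the order in which the user applied the macros.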

When it comes to resolving and understanding types, the phases provide a near perfect guarantee that you see the correct types (modulo shadowing via non-type declarations).

rrousselGit · 2024-05-31T14:40:34Z

Thanks! Then, would a "round" approach enable generating a new class per field in the annotated class?

I understand how phases help ensure that types are resolved when manipulating them. But I think that this limitation around generating new classes or augmenting a class with new mixins is going to be limiting.

davidmorgan · 2024-05-31T14:51:38Z

Thanks Jake.

@rrousselGit yes, in theory the "rounds" approach allows creating classes based on fields; generally it's "anything goes and reanalyze".

Which is why work is needed to determine the downsides: for example if we can't make it produce deterministic output, that would be a problem :)

davidmorgan · 2024-05-31T15:05:13Z

I updated the exploratory code to talk explicitly about rounds; each macro gets one query result and can send one batch of augmentations per round. They are then written to disk, reanalysis happens, and macros run again if needed:

1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.dart)
Requeried in 3452ms.
Entered round 2, sending to 3 watches.
  EqualsMacro augments 1 uri(s), 30288 char(s), round 2.
  HashCodeMacro augments 1 uri(s), 15697 char(s), round 2.
  ToStringMacro augments 1 uri(s), 11800 char(s), round 2.
Write: /tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart
1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart)
Requeried in 2898ms.
No changes relevant to macros, not starting new round.

Then, if I change a file, there is a further round: generation, followed by reanalysis and termination.

1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.dart)
Requeried in 3338ms.
Entered round 3, sending to 3 watches.
  EqualsMacro augments 1 uri(s), 30259 char(s), round 3.
  HashCodeMacro augments 1 uri(s), 15682 char(s), round 3.
  ToStringMacro augments 1 uri(s), 11789 char(s), round 3.
Write: /tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart
1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart)
Requeried in 2893ms.
No changes relevant to macros, not starting new round.

Of course, it's the interactions between macros that are more interesting, but I need to add some nontrivial macros first.
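The "macros run again if needed" decision in the logs above could be modelled as follows. This Python sketch is a guess at the shape of the logic, not taken from the exploratory code: `Host`, `on_files_changed`, and the idea of caching one query result per watch are all assumptions.

```python
_UNSET = object()  # marks a watch that has never produced a result


class Host:
    """Starts a new round only when some watch's query result changed."""

    def __init__(self, watches):
        # watches: macro name -> query function from the analyzed model
        # to a hashable result (assumed representation).
        self.watches = watches
        self.last_results = {}
        self.round = 1

    def on_files_changed(self, model):
        """Requery every watch; return (round, changed names) or None."""
        changed = []
        for name, query in self.watches.items():
            result = query(model)
            if self.last_results.get(name, _UNSET) != result:
                changed.append(name)
            self.last_results[name] = result
        if not changed:
            return None  # "No changes relevant to macros"
        self.round += 1
        return self.round, sorted(changed)


host = Host({"ToStringMacro": lambda model: model.get("Point")})
first = host.on_files_changed({"Point": ("x", "y")})
second = host.on_files_changed({"Point": ("x", "y")})  # e.g. only a0.a.dart changed
third = host.on_files_changed({"Point": ("x", "y", "z")})
print(first, second, third)
```

In this model, writing the augmentation file triggers a requery, but the query results are unchanged, so no new round starts; editing the source file changes a result and starts the next round.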

davidmorgan · 2024-05-31T15:54:21Z

One thought: the amount of time needed per round is critical, and it relates to "reanalysis time".

It is possible that the phase limitations are exactly, or close to, what is needed to arrange that the host can do less "reanalysis" work and so the rounds can proceed quicker.

ghost · 2024-05-31T16:37:58Z

Here's one example of a problem we encountered in another thread. (The problem is not necessarily relevant in real life, but it might be good as a motivating example).

Given a class like Colors, the macro has to produce another class named, say, MaterialColorSet (in fact, it was supposed to be an extension, but let it be a class for simplicity) that contains only the copies of constants assignable to MaterialColor (as you can see, not all colors in Colors are MaterialColors).
Is it possible to write such a macro:

- with the current macro design?
- with the "rounds" design?

jakemac53 · 2024-06-28T17:13:16Z

Note that @scheglov's investigation into the performance/cost of running macros leads me to believe that the rounds approach would potentially exacerbate the issue, by making us re-parse augmentation outputs an unbounded number of times, and the parsing/merging itself is a big portion of the overall cost?

rrousselGit · 2024-06-28T18:05:29Z

Could the parsing cost be offset by having a macro API where the macros would list new definitions?
Regardless of the exact syntax, say we used something like code_builder to emit augmentations. Wouldn't we be able to know if a new field is added in a class, without having to reparse anything?
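One way to picture that suggestion, as a Python sketch under assumptions: each augmentation carries a structured list of the declarations it introduces next to its (opaque) code, so the host can see what is new by comparing metadata. `Declaration`, `Augmentation`, and `new_declarations` are hypothetical names, and nothing here is code_builder's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    parent: str  # enclosing class, e.g. "Point"
    kind: str    # "field", "method", ...
    name: str


@dataclass(frozen=True)
class Augmentation:
    declares: tuple  # Declarations this augmentation introduces
    code: str        # generated source, treated as opaque text by the host


def new_declarations(known, batch):
    """Return declarations in `batch` that are not already known.

    Only the structured metadata is inspected; `code` is never parsed.
    """
    declared = {d for augmentation in batch for d in augmentation.declares}
    return declared - set(known)


known = {Declaration("Point", "field", "x")}
batch = [
    Augmentation(
        declares=(
            Declaration("Point", "field", "x"),
            Declaration("Point", "method", "toString"),
        ),
        code="<generated augmentation text>",
    )
]
print(new_declarations(known, batch))
```

Whether this saves real work depends on whether the host can trust the metadata; if it still has to parse the code to validate it, the cost is not avoided.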

davidmorgan · 2024-07-01T08:11:55Z

Yeah, could be there's no benefit: this is tagged for the "breaking changes" milestone to mean "either do it as part of that or decide that we won't do it".

davidmorgan added the static-metaprogramming label (Issues related to static metaprogramming) May 31, 2024

davidmorgan added this to Static Metaprogramming - design/prototype May 31, 2024

github-project-automation bot moved this to Todo in Static Metaprogramming - design/prototype May 31, 2024

davidmorgan moved this from Todo to In Progress in Static Metaprogramming - design/prototype May 31, 2024

davidmorgan mentioned this issue May 31, 2024

[macros] Split generation explicitly into numbered rounds. #3861

Merged

davidmorgan self-assigned this May 31, 2024

davidmorgan moved this from In Progress to Todo in Static Metaprogramming - design/prototype Jun 10, 2024

davidmorgan closed this as not planned Feb 3, 2025

github-project-automation bot moved this from Todo to Done in Static Metaprogramming - design/prototype Feb 3, 2025
