Consider running macros in rounds instead of phases #3858

Closed
davidmorgan opened this issue May 31, 2024 · 10 comments
Labels
static-metaprogramming Issues related to static metaprogramming

Comments

@davidmorgan
Contributor

Investigation for #3706 suggests an alternative way to structure how macros run that we should consider.

The problem is how macro output affects other macros: what can macros see, and when?

I do not yet know the answer :) ... it is hard to make progress on this problem because in most realistic cases it doesn't matter: macros are often naturally independent, and where they overlap the user will probably choose one or the other. So it is hard to motivate an answer.

Roughly, the phases approach focuses on either making macro applications independent or establishing an ordering between them, so that it is well-defined what each macro can see. In the declarations phase this includes building an ordering at runtime when one macro introspects on the output of another, and throwing if there is a cycle.
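As a rough illustration of "build an ordering, throw on a cycle", here is a minimal sketch in Python rather than Dart. It assumes the introspection dependencies are already known as a map, whereas the description above has them discovered at runtime; `order_macros` and the macro names are hypothetical.

```python
from graphlib import TopologicalSorter


def order_macros(introspects_on):
    """Return an order in which each macro runs after the macros it introspects on.

    `introspects_on` maps a macro name to the set of macro names whose output
    it needs to see. Raises graphlib.CycleError (a ValueError) on a cycle.
    """
    return list(TopologicalSorter(introspects_on).static_order())


# Hypothetical example: ToString needs to see the fields that Fields generates.
print(order_macros({"ToString": {"Fields"}, "Fields": set()}))
```

A cyclic input such as `{"A": {"B"}, "B": {"A"}}` raises instead of returning an order, which corresponds to the "throwing if there is a cycle" behaviour.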

The alternative, that I'll call rounds, is to have no ordering between macros. (I believe "rounds" is the term the Java annotation processor uses, and there may be similarities, but I haven't looked at the Java details.)

Instead of an ordering: all macros see the same world state in round 1, then produce augmentations; then if needed they rerun (and any newly triggered macros run) against the round 2 state, and produce more/different augmentations; and so on, until no changes to augmentations are produced. Some rule is needed to force termination if it looks like no progress is being made.
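The round loop described above can be sketched as a fixed-point iteration. This is only an illustration in Python under simplifying assumptions: the world is modelled as a set of declaration strings, each macro is a function from the world to the declarations it wants to add, and the termination rule is a plain round limit. None of these names come from the actual prototype.

```python
from typing import Callable, Dict, FrozenSet

World = FrozenSet[str]                      # hypothetical model of "world state"
Macro = Callable[[World], FrozenSet[str]]   # world in, augmentations out


def run_rounds(source: World, macros: Dict[str, Macro], max_rounds: int = 10):
    """Rerun every macro against the same state until the output stops changing."""
    augmentations = {name: frozenset() for name in macros}
    for round_number in range(1, max_rounds + 1):
        # All macros see the same world: the source plus last round's output.
        world = source.union(*augmentations.values())
        new_augmentations = {name: macro(world) for name, macro in macros.items()}
        if new_augmentations == augmentations:
            return world, round_number      # nothing changed: done
        augmentations = new_augmentations
    # Assumed termination rule: give up after a fixed number of rounds.
    raise RuntimeError(f"no fixed point after {max_rounds} rounds")


def add_field(world: World) -> FrozenSet[str]:
    # Adds a field to class A if the class exists.
    return frozenset({"A.x"}) if "class A" in world else frozenset()


def to_string(world: World) -> FrozenSet[str]:
    # Generates toString from whatever fields are visible in this round.
    fields = sorted(d for d in world
                    if d.startswith("A.") and not d.startswith("A.toString"))
    return frozenset({"A.toString(" + ",".join(fields) + ")"})


def flip(world: World) -> FrozenSet[str]:
    # Undoes its own output every round, so it never reaches a fixed point.
    return frozenset() if "flip" in world else frozenset({"flip"})
```

With `add_field` and `to_string` together, the first round produces a `toString` that sees no fields, the second round regenerates it with `A.x`, and the third round confirms nothing changed. `flip` shows why a termination rule is needed.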

Is this any better? As I said, I don't know :) but we now have an implementation that we can run experiments on to compare. I suggest we try to come up with motivating examples in order to make progress. We can also compare performance and how much work the macro author has to do.

@jakemac53 @scheglov @johnniwinther

@rrousselGit

On that topic, why do we need phases to begin with? Any reason why a macro can't output everything it wants in a single phase, and the next one picks it up based on order?

@jakemac53
Contributor

Rounds are certainly another answer. Originally we chose not to take that approach because it potentially means re-running macros within the same library often, and you also have to deal with infinite loops. Macros will also more often be running against an invalid/incomplete state, which means we have to specify exactly how that works, as opposed to letting the tools decide what is best (the CFE can just throw if you attempt to resolve a type and it can't find it).

For the data model approach, rounds might make more sense, I am not sure.

> On that topic, why do we need phases to begin with? Any reason why a macro can't output everything it wants in a single phase, and the next one picks it up based on order?

The phase approach allows macros to be composed together better, without users having to carefully order macros.

Consider a macro which generates fields based on a constructor signature, and then other macros that need to know all the fields of a class in order to generate the body of some method (maybe toString, toJson, fromJson, etc). Phases allow that in a totally safe and coherent way (as long as you don't need to see the fields to generate your API, there is still a tricky edge there where ordering matters).
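A small sketch of that composition argument, in Python rather than Dart. The phase names (`types`, `declarations`, `definitions`) are an assumption here, since this thread only names the declarations phase, and the dict-based class model and both macros are hypothetical.

```python
PHASES = ("types", "declarations", "definitions")  # assumed phase names


def fields_from_constructor(cls):
    # Declarations phase: add one field per constructor parameter.
    cls["fields"].extend(cls["constructor_params"])


def to_string_body(cls):
    # Definitions phase: every field has been declared by now.
    parts = ", ".join(f"{f}: ${f}" for f in cls["fields"])
    cls["methods"]["toString"] = f"{cls['name']}({parts})"


FIELDS_MACRO = {"declarations": fields_from_constructor}
TO_STRING_MACRO = {"definitions": to_string_body}


def run_phases(cls, macros):
    """Run each phase to completion across all macros before starting the next."""
    for phase in PHASES:
        for macro in macros:
            step = macro.get(phase)
            if step is not None:
                step(cls)
    return cls


def new_point():
    return {"name": "Point", "constructor_params": ["x", "y"],
            "fields": [], "methods": {}}
```

Because the definitions phase only starts after every macro has finished the declarations phase, `run_phases(new_point(), [TO_STRING_MACRO, FIELDS_MACRO])` and the reverse application order produce the same class.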

When it comes to resolving and understanding types, the phases provide a near perfect guarantee that you see the correct types (modulo shadowing via non-type declarations).

@rrousselGit

Thanks! Then, would a "round" approach enable generating a new class per field in the annotated class?

I understand how phases help ensure that types are resolved when manipulating them. But I think the restriction on generating new classes or augmenting a class with new mixins is going to be limiting.

@davidmorgan
Contributor Author

Thanks Jake.

@rrousselGit yes, in theory the "rounds" approach allows creating classes based on fields; generally it's "anything goes and reanalyze".

Which is why work is needed to determine the downsides: for example if we can't make it produce deterministic output, that would be a problem :)

@davidmorgan
Contributor Author

I updated the exploratory code to talk explicitly about rounds: each macro gets one query result and can send one batch of augmentations per round. The augmentations are then written to disk, reanalysis happens, and macros run again if needed:

1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.dart)
Requeried in 3452ms.
Entered round 2, sending to 3 watches.
  EqualsMacro augments 1 uri(s), 30288 char(s), round 2.
  HashCodeMacro augments 1 uri(s), 15697 char(s), round 2.
  ToStringMacro augments 1 uri(s), 11800 char(s), round 2.
Write: /tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart
1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart)
Requeried in 2898ms.
No changes relevant to macros, not starting new round.

Then, if I change a file, there is a further round: generation, followed by reanalysis and termination.

1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.dart)
Requeried in 3338ms.
Entered round 3, sending to 3 watches.
  EqualsMacro augments 1 uri(s), 30259 char(s), round 3.
  HashCodeMacro augments 1 uri(s), 15682 char(s), round 3.
  ToStringMacro augments 1 uri(s), 11789 char(s), round 3.
Write: /tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart
1 file(s) changed: (/tmp/dart_model_benchmark/dartModel/package_under_test/lib/a0.a.dart)
Requeried in 2893ms.
No changes relevant to macros, not starting new round.

Of course it's the interactions between macros that are more interesting, but I need to add some nontrivial macros first.

@davidmorgan davidmorgan self-assigned this May 31, 2024
@davidmorgan
Contributor Author

One thought: the amount of time needed per round is critical, and it relates to "reanalysis time".

It is possible that the phase limitations are exactly, or close to, what is needed to arrange that the host can do less "reanalysis" work and so the rounds can proceed quicker.

@ghost

ghost commented May 31, 2024

Here's one example of a problem we encountered in another thread. (The problem is not necessarily relevant in real life, but it might be good as a motivating example).

Given a class like Colors, the macro has to produce another class named, say, MaterialColorSet (in fact, it was supposed to be an extension, but let it be a class for simplicity) that contains only the copies of constants assignable to MaterialColor (as you can see, not all colors in Colors are MaterialColors).
Is it possible to write such a macro:

  • with the current macro design?
  • with the "rounds" design?

@davidmorgan davidmorgan moved this from In Progress to Todo in Static Metaprogramming - design/prototype Jun 10, 2024
@jakemac53
Contributor

Note that @scheglov's investigation into the performance/cost of running macros leads me to believe that the rounds approach would potentially exacerbate the issue: it would make us re-parse augmentation outputs an unbounded number of times, and the parsing/merging itself appears to be a big portion of the overall cost.

@rrousselGit

Could the parsing cost be offset by having a macro API where the macros would list new definitions?
Regardless of the exact syntax, say we used something like code_builder to emit augmentations. Wouldn't we be able to know if a new field is added in a class, without having to reparse anything?
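To make the idea concrete, here is a minimal sketch in Python rather than Dart. It assumes a hypothetical API in which each macro returns structured records next to its generated code, so the host can update a member index without parsing. `NewDeclaration` and `changed_members` are made-up names, not part of any existing macro API or of code_builder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewDeclaration:
    owner: str   # class being augmented
    kind: str    # "field", "method", ...
    name: str
    source: str  # generated code; parsing can be deferred until it is needed


def changed_members(index, declarations):
    """Merge declarations into `index` (owner -> set of (kind, name)).

    Returns only the declarations that were not already known, which is what
    a host would need in order to decide whether other macros must rerun.
    """
    new = []
    for decl in declarations:
        members = index.setdefault(decl.owner, set())
        key = (decl.kind, decl.name)
        if key not in members:
            members.add(key)
            new.append(decl)
    return new
```

If a macro reports the same declarations again in a later round, `changed_members` returns an empty list, so under this assumed design the host could tell that nothing new was added without reparsing.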

@davidmorgan
Contributor Author

Yeah, could be there's no benefit: this is tagged for the "breaking changes" milestone to mean "either do it as part of that or decide that we won't do it".

@davidmorgan davidmorgan closed this as not planned Feb 3, 2025