Consider running macros in rounds instead of phases #3858
Comments
On that topic, why do we need phases to begin with? Is there any reason why a macro can't output everything it wants in a single phase, with the next macro picking it up based on order?
Rounds are certainly another answer. Originally we chose not to take that approach because it potentially means re-running macros within the same library often, and you also have to deal with infinite loops. Macros will also more often run against an invalid or incomplete state, which means we have to specify exactly how that works rather than letting the tools decide what is best (the CFE can just throw if you attempt to resolve a type and it can't find it). For the data model approach, rounds might make more sense; I am not sure.
The phase approach allows macros to be composed together better, without users having to carefully order them. Consider a macro that generates fields based on a constructor signature, and other macros that need to know all the fields of a class in order to generate the body of some method (maybe toString, toJson, fromJson, etc.). Phases allow that in a totally safe and coherent way (as long as you don't need to see the fields to generate your API; there is still a tricky edge there where ordering matters). When it comes to resolving and understanding types, the phases provide a near-perfect guarantee that you see the correct types (modulo shadowing via non-type declarations).
Thanks! Then, would a "rounds" approach enable generating a new class per field in the annotated class? I understand how phases help ensure that types are resolved when manipulating them, but I think the restriction on generating new classes or augmenting a class with new mixins is going to be limiting.
Thanks Jake. @rrousselGit yes, in theory the "rounds" approach allows creating classes based on fields; generally it's "anything goes and reanalyze". That is why work is needed to determine the downsides: for example, if we can't make it produce deterministic output, that would be a problem :)
I updated the exploratory code to talk explicitly about rounds; each macro gets one query result and can send one batch of augmentations per round, they're then written to disk, reanalysis happens, and macros run again if needed:
and then if I change a file there is a further round with generation followed by reanalyze and terminate.
Of course, it's the interactions between macros that are more interesting, but I need to add some nontrivial macros first.
One thought: the amount of time needed per round is critical, and it relates to "reanalysis time". It is possible that the phase limitations are exactly, or close to, what is needed to arrange that the host can do less "reanalysis" work, so that the rounds can proceed quicker.
Here's one example of a problem we encountered in another thread. (The problem is not necessarily relevant in real life, but it might be good as a motivating example). Given a class like Colors, the macro has to produce another class named, say,
Given @scheglov's investigation into the performance cost of running macros, I believe the rounds approach could exacerbate the issue by making us re-parse augmentation outputs an unbounded number of times. Isn't the parsing/merging itself a big portion of the overall cost?
Could the parsing cost be offset by having a macro API where the macros would list new definitions?
Yeah, could be there's no benefit: this is tagged for the "breaking changes" milestone to mean "either do it as part of that or decide that we won't do it".
Investigation for #3706 suggests an alternative way to structure how macros run that we should consider.
The problem is how macro output affects other macros: what can macros see, and when?
I do not yet know the answer :) ... it is hard to make progress on this problem because in most realistic cases it doesn't matter: macros are often naturally independent, and where they overlap the user will probably choose one or the other. So it is hard to motivate an answer.
Roughly, the phases approach focuses on either making macro applications independent or establishing an ordering between them, so that it is well-defined what each macro can see. In the declarations phase this includes building an ordering at runtime when one macro introspects on the output of another, and throwing if there is a cycle.
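To make the ordering idea concrete, here is a minimal sketch in Python (not Dart, and not the actual macro implementation). It assumes the "who introspects on whom" relation is already known as a dictionary, whereas the description above has it discovered at runtime; the function name `order_macro_applications` and the macro names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def order_macro_applications(introspects):
    """Return an order in which every macro runs after the ones it reads.

    `introspects` maps each (hypothetical) macro application to the set of
    applications whose output it introspects on.
    """
    try:
        # static_order() yields each node after all of its predecessors.
        return list(TopologicalSorter(introspects).static_order())
    except CycleError as error:
        # args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"macro introspection cycle: {error.args[1]}") from error
```

For instance, a `toJson` application that reads the output of a `fields` application is ordered after it, while two applications that introspect on each other are rejected.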
The alternative, that I'll call rounds, is to have no ordering between macros. (I believe "rounds" is the term the Java annotation processor uses, and there may be similarities, but I haven't looked at the Java details.)
Instead of an ordering: all macros see the same world state in round 1, then produce augmentations; then if needed they rerun (and any newly triggered macros run) against the round 2 state, and produce more/different augmentations; and so on, until no changes to augmentations are produced. Some rule is needed to force termination if it looks like no progress is being made.
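The loop described above can be sketched as a fixed-point iteration. This is an illustrative Python model under loud assumptions, not the exploratory implementation: the "world" is just a set of declaration names, a macro is a function from the world to the declarations it wants to add, and the termination rule is a plain round limit (`max_rounds`). All names here are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Assumed model: a world is a set of declaration names.
World = FrozenSet[str]
Macro = Callable[[World], FrozenSet[str]]


def run_rounds(
    source: World, macros: Dict[str, Macro], max_rounds: int = 10
) -> Tuple[World, int]:
    """Rerun all macros until their augmentations stop changing."""
    outputs: Dict[str, FrozenSet[str]] = {name: frozenset() for name in macros}
    for round_number in range(1, max_rounds + 1):
        # Every macro sees the same state: source plus last round's output.
        world = source.union(*outputs.values())
        new_outputs = {name: macro(world) for name, macro in macros.items()}
        if new_outputs == outputs:
            return world, round_number  # No changes: fixed point reached.
        outputs = new_outputs
    # Assumed termination rule: give up after a fixed number of rounds.
    raise RuntimeError(f"no fixed point after {max_rounds} rounds")


def fields_macro(world: World) -> FrozenSet[str]:
    # Hypothetical macro: adds fields when it sees the constructor.
    if "Point.new" in world:
        return frozenset({"Point.x", "Point.y"})
    return frozenset()


def to_string_macro(world: World) -> FrozenSet[str]:
    # Hypothetical macro: generates toString from whatever fields it can see.
    fields = sorted(name for name in world if name in ("Point.x", "Point.y"))
    return frozenset({"Point.toString(" + ",".join(fields) + ")"})
```

In this model, `to_string_macro` first emits a `toString` with no fields, then replaces it once the fields from `fields_macro` become visible in the next round, without any ordering declared between the two. A macro whose output flips back and forth between rounds never settles and hits the round limit, which is the kind of case a termination rule has to catch.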
Is this any better? As I said, I don't know :) but we now have an implementation that we can run experiments on to compare. I suggest we try to come up with motivating examples in order to make progress. We can also compare performance and how much work the macro author has to do.
@jakemac53 @scheglov @johnniwinther