| 1 | +- Feature Name: N/A |
| 2 | +- Start Date: 2015-07-06 |
| 3 | +- RFC PR: (leave this empty) |
| 4 | +- Rust Issue: (leave this empty) |
| 5 | + |
| 6 | + |
| 7 | +# Summary |
| 8 | + |
| 9 | +Add a high-level intermediate representation (HIR) to the compiler. This is |
| 10 | +basically a new (and additional) AST more suited for use by the compiler. |
| 11 | + |
| 12 | +This is purely an implementation detail of the compiler. It has no effect on the |
| 13 | +language. |
| 14 | + |
| 15 | +Note that adding a HIR does not preclude adding a MIR or LIR in the future. |
| 16 | + |
| 17 | + |
| 18 | +# Motivation |
| 19 | + |
| 20 | +Currently the AST is used by libsyntax for syntactic operations, by the compiler |
| 21 | +for pretty much everything, and in syntax extensions. I propose splitting the |
| 22 | +AST into a libsyntax version that is specialised for syntactic operation and |
| 23 | +will eventually be stabilised for use by syntax extensions and tools, and the |
| 24 | +HIR which is entirely internal to the compiler. |
| 25 | + |
| 26 | +The benefit of this split is that each AST can be specialised to its task and we |
| 27 | +can separate the interface to the compiler (the AST) from its implementation |
| 28 | +(the HIR). Specific changes I see that could happen are more ids and spans in |
| 29 | +the AST, the AST adhering more closely to the surface syntax, the HIR becoming |
| 30 | +more abstract (e.g., combining structs and enums), and using resolved names in |
| 31 | +the HIR (i.e., performing name resolution as part of the AST->HIR lowering). |
| 32 | + |
| 33 | +Not using the AST in the compiler means we can work to stabilise it for syntax |
| 34 | +extensions and tools: it will become part of the interface to the compiler. |
| 35 | + |
| 36 | +I also envisage all syntactic expansion of language constructs (e.g., `for` |
| 37 | +loops, `if let`) moving to the lowering step from AST to HIR, rather than being |
| 38 | +AST manipulations. That should make both error messages and tool support better |
| 39 | +for such constructs. It would be nice to move lifetime elision to the lowering |
| 40 | +step too, in order to make the HIR as explicit as possible. |
| 41 | + |
| 42 | + |
| 43 | +# Detailed design |
| 44 | + |
| 45 | +Initially, the HIR will be an (almost) identical copy of the AST and the |
| 46 | +lowering step will simply be a copy operation. Since some constructs (macros, |
| 47 | +`for` loops, etc.) are expanded away in libsyntax, these will not be part of the |
| 48 | +HIR. Tools such as the AST visitor will need to be duplicated. |
| 49 | + |
| 50 | +The compiler will be changed to use the HIR throughout (this should mostly be a |
| 51 | +matter of changing the imports). Incrementally, I expect to move expansion of |
| 52 | +language constructs to the lowering step. Further in the future, the HIR should |
| 53 | +get more abstract and compact, and the AST should get closer to the surface |
| 54 | +syntax. |
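
As a rough illustration of the lowering step described above, the following sketch copies most nodes unchanged and expands one piece of surface sugar during lowering. Everything here is hypothetical: the `ast` and `hir` modules, their `Expr` variants, and `lower_expr` are simplified stand-ins, not the real libsyntax or compiler types. A `while` loop stands in for the `for` and `if let` constructs the RFC mentions, purely to keep the example small.

```rust
// Hypothetical, heavily simplified node types; not the real compiler definitions.
pub mod ast {
    #[derive(Debug, Clone, PartialEq)]
    pub enum Expr {
        Lit(i64),
        Var(String),
        Block(Vec<Expr>),
        // Surface-level sugar that exists only in the AST.
        While { cond: Box<Expr>, body: Box<Expr> },
    }
}

pub mod hir {
    #[derive(Debug, Clone, PartialEq)]
    pub enum Expr {
        Lit(i64),
        Var(String),
        Block(Vec<Expr>),
        // Core constructs that the sugar is lowered into.
        Loop(Box<Expr>),
        If { cond: Box<Expr>, then: Box<Expr>, els: Box<Expr> },
        Break,
    }
}

/// Lower one AST expression to HIR (assumed name, for illustration only).
pub fn lower_expr(e: &ast::Expr) -> hir::Expr {
    match e {
        // Most nodes are a plain structural copy.
        ast::Expr::Lit(n) => hir::Expr::Lit(*n),
        ast::Expr::Var(name) => hir::Expr::Var(name.clone()),
        ast::Expr::Block(es) => hir::Expr::Block(es.iter().map(lower_expr).collect()),
        // `while c { b }` becomes `loop { if c { b } else { break } }`.
        ast::Expr::While { cond, body } => hir::Expr::Loop(Box::new(hir::Expr::If {
            cond: Box::new(lower_expr(cond)),
            then: Box::new(lower_expr(body)),
            els: Box::new(hir::Expr::Break),
        })),
    }
}
```

With a design along these lines, the AST keeps the `While` node exactly as written in the source, so tools and error messages can refer to the original construct, while later compiler passes only see `Loop`, `If`, and `Break`.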
| 55 | + |
| 56 | + |
| 57 | +# Drawbacks |
| 58 | + |
| 59 | +Potentially slower compilations and higher memory use. However, this should be |
| 60 | +offset in the long run by making improvements to the compiler easier by having a |
| 61 | +more appropriate data structure. |
| 62 | + |
| 63 | + |
| 64 | +# Alternatives |
| 65 | + |
| 66 | +Leave things as they are. |
| 67 | + |
| 68 | +Skip the HIR and lower straight to a MIR later in compilation. This has |
| 69 | +advantages which adding a HIR does not have; however, it is a far more complex |
| 70 | +refactoring and also misses some benefits of the HIR, notably being able to |
| 71 | +stabilise the AST for tools and syntax extensions without locking in the |
| 72 | +compiler. |
| 73 | + |
| 74 | + |
| 75 | +# Unresolved questions |
| 76 | + |
| 77 | +How to deal with spans and source code. We could keep the AST around and |
| 78 | +reference back to it from the HIR. Or we could copy span information to the HIR |
| 79 | +(I plan on doing this initially). A third option is to keep the span info in a |
| 80 | +side table (note that we need less span info in the compiler than |
| 81 | +we do in libsyntax, which is in turn less than tools want). |
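
To make the side-table option concrete, here is a minimal sketch, assuming HIR nodes carry only an id and spans are looked up separately. The names `NodeId`, `Span`, `SpanTable`, `HirExpr`, and `HirExprKind` are illustrative assumptions and do not describe an agreed design.

```rust
use std::collections::HashMap;

// Hypothetical identifier attached to each HIR node.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

// Hypothetical span: a byte range in the source text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub lo: u32,
    pub hi: u32,
}

// Side table mapping node ids to spans, kept outside the HIR itself.
#[derive(Debug, Default)]
pub struct SpanTable {
    spans: HashMap<NodeId, Span>,
}

impl SpanTable {
    /// Record the span for a node, typically during lowering.
    pub fn record(&mut self, id: NodeId, span: Span) {
        self.spans.insert(id, span);
    }

    /// Look up a node's span; `None` if no span was recorded for it.
    pub fn span_of(&self, id: NodeId) -> Option<Span> {
        self.spans.get(&id).copied()
    }
}

// HIR nodes carry only an id, not a span.
#[derive(Debug)]
pub struct HirExpr {
    pub id: NodeId,
    pub kind: HirExprKind,
}

#[derive(Debug)]
pub enum HirExprKind {
    Lit(i64),
    Neg(Box<HirExpr>),
}
```

One consequence of this layout is that spans can be recorded only for the nodes the compiler actually needs to report on, which fits the observation that the compiler needs less span information than libsyntax does.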