@@ -2,13 +2,12 @@ Region inference
2
2
3
3
# Terminology
4
4
5
- Note that we use the terms region and lifetime interchangeably,
6
- though the term ` lifetime ` is preferred.
5
+ Note that we use the terms region and lifetime interchangeably.
7
6
8
7
# Introduction
9
8
10
9
Region inference uses a somewhat more involved algorithm than type
11
- inference. It is not the most efficient thing ever written though it
10
+ inference. It is not the most efficient thing ever written though it
12
11
seems to work well enough in practice (famous last words). The reason
13
12
that we use a different algorithm is because, unlike with types, it is
14
13
impractical to hand-annotate with regions (in some cases, there aren't
@@ -25,22 +24,42 @@ once.
25
24
26
25
The constraints are always of one of three possible forms:
27
26
28
- - ConstrainVarSubVar(R_i, R_j) states that region variable R_i
29
- must be a subregion of R_j
30
- - ConstrainRegSubVar(R, R_i) states that the concrete region R
31
- (which must not be a variable) must be a subregion of the variable R_i
32
- - ConstrainVarSubReg(R_i, R) is the inverse
27
+ - ` ConstrainVarSubVar(Ri, Rj) ` states that region variable Ri must be
28
+ a subregion of Rj
29
+ - ` ConstrainRegSubVar(R, Ri) ` states that the concrete region R (which
30
+ must not be a variable) must be a subregion of the variable Ri
31
+ - ` ConstrainVarSubReg(Ri, R) ` states the variable Ri shoudl be less
32
+ than the concrete region R. This is kind of deprecated and ought to
33
+ be replaced with a verify (they essentially play the same role).
34
+
35
+ In addition to constraints, we also gather up a set of "verifys"
36
+ (what, you don't think Verify is a noun? Get used to it my
37
+ friend!). These represent relations that must hold but which don't
38
+ influence inference proper. These take the form of:
39
+
40
+ - ` VerifyRegSubReg(Ri, Rj) ` indicates that Ri <= Rj must hold,
41
+ where Rj is not an inference variable (and Ri may or may not contain
42
+ one). This doesn't influence inference because we will already have
43
+ inferred Ri to be as small as possible, so then we just test whether
44
+ that result was less than Rj or not.
45
+ - ` VerifyGenericBound(R, Vb) ` is a more complex expression which tests
46
+ that the region R must satisfy the bound ` Vb ` . The bounds themselves
47
+ may have structure like "must outlive one of the following regions"
48
+ or "must outlive ALL of the following regions. These bounds arise
49
+ from constraints like ` T: 'a ` -- if we know that ` T: 'b ` and ` T: 'c `
50
+ (say, from where clauses), then we can conclude that ` T: 'a ` if `'b:
51
+ 'a` *or* ` 'c: 'a`.
33
52
34
53
# Building up the constraints
35
54
36
55
Variables and constraints are created using the following methods:
37
56
38
57
- ` new_region_var() ` creates a new, unconstrained region variable;
39
- - ` make_subregion(R_i, R_j ) ` states that R_i is a subregion of R_j
40
- - ` lub_regions(R_i, R_j ) -> R_k ` returns a region R_k which is
41
- the smallest region that is greater than both R_i and R_j
42
- - ` glb_regions(R_i, R_j ) -> R_k ` returns a region R_k which is
43
- the greatest region that is smaller than both R_i and R_j
58
+ - ` make_subregion(Ri, Rj ) ` states that Ri is a subregion of Rj
59
+ - ` lub_regions(Ri, Rj ) -> Rk ` returns a region Rk which is
60
+ the smallest region that is greater than both Ri and Rj
61
+ - ` glb_regions(Ri, Rj ) -> Rk ` returns a region Rk which is
62
+ the greatest region that is smaller than both Ri and Rj
44
63
45
64
The actual region resolution algorithm is not entirely
46
65
obvious, though it is also not overly complex.
@@ -54,14 +73,6 @@ Alternatively, you can call `commit()` which ends all snapshots.
54
73
Snapshots can be recursive---so you can start a snapshot when another
55
74
is in progress, but only the root snapshot can "commit".
56
75
57
- # Resolving constraints
58
-
59
- The constraint resolution algorithm is not super complex but also not
60
- entirely obvious. Here I describe the problem somewhat abstractly,
61
- then describe how the current code works. There may be other, smarter
62
- ways of doing this with which I am unfamiliar and can't be bothered to
63
- research at the moment. - NDM
64
-
65
76
## The problem
66
77
67
78
Basically our input is a directed graph where nodes can be divided
@@ -83,31 +94,20 @@ Before resolution begins, we build up the constraints in a hashmap
83
94
that maps ` Constraint ` keys to spans. During resolution, we construct
84
95
the actual ` Graph ` structure that we describe here.
85
96
86
- ## Our current algorithm
87
-
88
- We divide region variables into two groups: Expanding and Contracting.
89
- Expanding region variables are those that have a concrete region
90
- predecessor (direct or indirect). Contracting region variables are
91
- all others.
92
-
93
- We first resolve the values of Expanding region variables and then
94
- process Contracting ones. We currently use an iterative, fixed-point
95
- procedure (but read on, I believe this could be replaced with a linear
96
- walk). Basically we iterate over the edges in the graph, ensuring
97
- that, if the source of the edge has a value, then this value is a
98
- subregion of the target value. If the target does not yet have a
99
- value, it takes the value from the source. If the target already had
100
- a value, then the resulting value is Least Upper Bound of the old and
101
- new values. When we are done, each Expanding node will have the
102
- smallest region that it could possibly have and still satisfy the
103
- constraints.
104
-
105
- We next process the Contracting nodes. Here we again iterate over the
106
- edges, only this time we move values from target to source (if the
107
- source is a Contracting node). For each contracting node, we compute
108
- its value as the GLB of all its successors. Basically contracting
109
- nodes ensure that there is overlap between their successors; we will
110
- ultimately infer the largest overlap possible.
97
+ ## Computing the values for region variables
98
+
99
+ The algorithm is a simple dataflow algorithm. Each region variable
100
+ begins as empty. We iterate over the constraints, and for each constraint
101
+ we grow the relevant region variable to be as big as it must be to meet all the
102
+ constraints. This means the region variables can grow to be ` 'static ` if
103
+ necessary.
104
+
105
+ ## Verification
106
+
107
+ After all constraints are fully propoagated, we do a "verification"
108
+ step where we walk over the verify bounds and check that they are
109
+ satisfied. These bounds represent the "maximal" values that a region
110
+ variable can take on, basically.
111
111
112
112
# The Region Hierarchy
113
113
0 commit comments