@@ -38,14 +38,31 @@ A consumer may load the following info for a commit from the graph:
Values 1-4 satisfy the requirements of parse_commit_gently().

- Define the "generation number" of a commit recursively as follows:
+ There are two definitions of generation number:
+ 1. Corrected committer dates (generation number v2)
+ 2. Topological levels (generation number v1)

- * A commit with no parents (a root commit) has generation number one.
+ Define "corrected committer date" of a commit recursively as follows:

- * A commit with at least one parent has generation number one more than
- the largest generation number among its parents.
+ * A commit with no parents (a root commit) has corrected committer date
+ equal to its committer date.

- Equivalently, the generation number of a commit A is one more than the
+ * A commit with at least one parent has corrected committer date equal to
+ the maximum of its committer date and one more than the largest corrected
+ committer date among its parents.
+
+ * As a special case, a root commit with timestamp zero has corrected commit
+ date of 1, to be able to distinguish it from GENERATION_NUMBER_ZERO
+ (that is, an uncomputed corrected commit date).
+
+ Define the "topological level" of a commit recursively as follows:
+
+ * A commit with no parents (a root commit) has topological level of one.
+
+ * A commit with at least one parent has topological level one more than
+ the largest topological level among its parents.
+
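The two recursive definitions above can be sketched in a few lines of Python. This is only an illustration on a hypothetical toy history, not Git's implementation: the `compute_generations` helper, the commit names, and the dictionary layout are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy history: commit name -> (committer date, parent names).
# Parents are listed before their children, so a single pass in insertion
# order is enough for this sketch.
Commit = Tuple[int, List[str]]


def compute_generations(commits: Dict[str, Commit]) -> Dict[str, Tuple[int, int]]:
    """Return {name: (corrected committer date, topological level)}."""
    result: Dict[str, Tuple[int, int]] = {}
    for name, (date, parents) in commits.items():
        if not parents:
            # Root commit: corrected date is its own committer date, except
            # that timestamp zero becomes 1 so it differs from the
            # "uncomputed" value 0.
            corrected = date if date != 0 else 1
            level = 1
        else:
            max_parent_date = max(result[p][0] for p in parents)
            max_parent_level = max(result[p][1] for p in parents)
            corrected = max(date, max_parent_date + 1)
            level = max_parent_level + 1
        result[name] = (corrected, level)
    return result


history = {
    "A": (100, []),          # root commit
    "B": (90, ["A"]),        # clock skew: dated earlier than its parent
    "C": (150, ["A"]),
    "D": (120, ["B", "C"]),  # merge commit
    "Z": (0, []),            # root commit with timestamp zero
}
generations = compute_generations(history)
```

In this toy history, commit B carries a committer date earlier than its parent A, yet its corrected committer date ends up strictly larger than A's, which is what makes the value usable for reachability checks.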
+ Equivalently, the topological level of a commit A is one more than the
length of a longest path from A to a root commit. The recursive definition
is easier to use for computation and observing the following property:
@@ -60,14 +77,19 @@ is easier to use for computation and observing the following property:
generation numbers, then we always expand the boundary commit with highest
generation number and can easily detect the stopping condition.

+ The property applies to both versions of generation number, that is, both
+ corrected committer dates and topological levels.
+
This property can be used to significantly reduce the time it takes to
walk commits and determine topological relationships. Without generation
numbers, the general heuristic is the following:

If A and B are commits with commit time X and Y, respectively, and
X < Y, then A _probably_ cannot reach B.

- This heuristic is currently used whenever the computation is allowed to
+ In the absence of corrected commit dates (for example, old versions of Git or
+ mixed generation graph chains),
+ this heuristic is currently used whenever the computation is allowed to
violate topological relationships due to clock skew (such as "git log"
with default order), but is not used when the topological order is
required (such as merge base calculations, "git log --graph").
@@ -77,7 +99,7 @@ in the commit graph. We can treat these commits as having "infinite"
generation number and walk until reaching commits with known generation
number.

- We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not
+ We use the macro GENERATION_NUMBER_INFINITY to mark commits not
in the commit-graph file. If a commit-graph file was written by a version
of Git that did not compute generation numbers, then those commits will
have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
@@ -93,12 +115,12 @@ fully-computed generation numbers. Using strict inequality may result in
walking a few extra commits, but the simplicity in dealing with commits
with generation number *_INFINITY or *_ZERO is valuable.

- We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose
- generation numbers are computed to be at least this value. We limit at
- this value since it is the largest value that can be stored in the
- commit-graph file using the 30 bits available to generation numbers. This
- presents another case where a commit can have generation number equal to
- that of a parent.
+ We use the macro GENERATION_NUMBER_V1_MAX = 0x3FFFFFFF for commits whose
+ topological levels (generation number v1) are computed to be at least
+ this value. We limit at this value since it is the largest value that
+ can be stored in the commit-graph file using the 30 bits available
+ to topological levels. This presents another case where a commit can
+ have generation number equal to that of a parent.
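The saturation described above can be sketched as follows. This is a minimal illustration in Python rather than Git's C code; the helper name `clamped_topological_level` is hypothetical, and only the constant's value is taken from the text.

```python
GENERATION_NUMBER_V1_MAX = 0x3FFFFFFF  # largest value that fits in 30 bits


def clamped_topological_level(parent_levels):
    """Topological level of a commit, saturating at GENERATION_NUMBER_V1_MAX."""
    # A root commit has no parents, so its level is 0 + 1 = 1.
    level = max(parent_levels, default=0) + 1
    return min(level, GENERATION_NUMBER_V1_MAX)
```

Once a parent has reached the maximum, the child is clamped to the same value, which is the case where a commit and its parent share a generation number.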

Design Details
--------------
@@ -267,6 +289,35 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
number of commits) could be extracted into config settings for full
flexibility.

+ ## Handling Mixed Generation Number Chains
+
+ With the introduction of generation number v2 and the generation data chunk, the
+ following scenario is possible:
+
+ 1. "New" Git writes a commit-graph with the corrected commit dates.
+ 2. "Old" Git writes a split commit-graph on top without corrected commit dates.
+
+ A naive approach of using the newest available generation number from
+ each layer would lead to violated expectations: the lower layer would
+ use corrected commit dates which are much larger than the topological
+ levels of the higher layer. For this reason, Git inspects the topmost
+ layer to see if the layer is missing corrected commit dates. In such a case
+ Git only uses topological level for generation numbers.
+
+ When writing a new layer in a split commit-graph, we write corrected commit
+ dates if the topmost layer has corrected commit dates written. This
+ guarantees that if a layer has corrected commit dates, all lower layers
+ must have corrected commit dates as well.
+
+ When merging layers, we do not consider whether the merged layers had corrected
+ commit dates. Instead, the new layer will have corrected commit dates if the
+ layer below the new layer has corrected commit dates.
+
+ While writing or merging layers, if the new layer is the only layer, it will
+ have corrected commit dates when written by compatible versions of Git. Thus,
+ rewriting a split commit-graph as a single file (`--split=replace`) creates a
+ single layer with corrected commit dates.
+
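The rules in this section can be summarized with a small Python sketch. It models a chain as a list of booleans (one per layer, base layer first, `True` meaning the layer has corrected commit dates). Both function names and this representation are assumptions made for illustration, not Git's actual data structures.

```python
from typing import List


def reader_uses_corrected_dates(layers_have_dates: List[bool]) -> bool:
    """Decide which generation number a reader uses for the whole chain.

    layers_have_dates is ordered from the base layer to the topmost layer.
    """
    # Only the topmost layer is inspected; if it lacks corrected commit
    # dates, topological levels are used everywhere.
    # Assumption: an empty chain is treated as "no corrected dates".
    return bool(layers_have_dates) and layers_have_dates[-1]


def new_layer_writes_corrected_dates(layers_below: List[bool]) -> bool:
    """Decide for a new or merged layer written by a compatible Git."""
    # An only layer always gets corrected commit dates; otherwise the new
    # layer follows the layer directly below it.
    return not layers_below or layers_below[-1]
```

For example, a chain `[True, False]` (a "new" base layer with an "old" layer on top) makes the reader fall back to topological levels, and a further layer written on top of it would also omit corrected commit dates, while a full rewrite with no layers below yields a single layer that has them.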
## Deleting graph-{hash} files

After a new tip file is written, some `graph-{hash}` files may no longer