Commit d638de2

Merge branch 'ds/path-walk-2' into jch

"git pack-objects" learns to find delta bases from blobs at the same
path, using the --path-walk API.

* ds/path-walk-2:
  pack-objects: allow --shallow and --path-walk
  path-walk: add new 'edge_aggressive' option
  pack-objects: thread the path-based compression
  pack-objects: refactor path-walk delta phase
  scalar: enable path-walk during push via config
  pack-objects: enable --path-walk via config
  repack: add --path-walk option
  t5538: add tests to confirm deltas in shallow pushes
  pack-objects: introduce GIT_TEST_PACK_PATH_WALK
  p5313: add performance tests for --path-walk
  pack-objects: update usage to match docs
  pack-objects: add --path-walk option
  pack-objects: extract should_attempt_deltas()
2 parents a75d91d + fd152fe commit d638de2

26 files changed, 605 insertions(+), 67 deletions(-)

Documentation/config/feature.adoc

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ walking fewer objects.
 +
 * `pack.allowPackReuse=multi` may improve the time it takes to create a pack by
 reusing objects from multiple packs instead of just one.
++
+* `pack.usePathWalk` may speed up packfile creation and make the packfiles be
+significantly smaller in the presence of certain filename collisions with Git's
+default name-hash.
 
 feature.manyFiles::
 Enable config options that optimize for repos with many files in the

Documentation/config/pack.adoc

Lines changed: 8 additions & 0 deletions
@@ -155,6 +155,14 @@ pack.useSparse::
 commits contain certain types of direct renames. Default is
 `true`.
 
+pack.usePathWalk::
+When true, git will default to using the '--path-walk' option in
+'git pack-objects' when the '--revs' option is present. This
+algorithm groups objects by path to maximize the ability to
+compute delta chains across historical versions of the same
+object. This may disable other options, such as using bitmaps to
+enumerate objects.
+
 pack.preferBitmapTips::
 When selecting which commits will receive bitmaps, prefer a
 commit at the tip of any reference that is a suffix of any value

Documentation/git-pack-objects.adoc

Lines changed: 19 additions & 7 deletions
@@ -10,13 +10,13 @@ SYNOPSIS
 --------
 [verse]
 'git pack-objects' [-q | --progress | --all-progress] [--all-progress-implied]
-[--no-reuse-delta] [--delta-base-offset] [--non-empty]
-[--local] [--incremental] [--window=<n>] [--depth=<n>]
-[--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
-[--cruft] [--cruft-expiration=<time>]
-[--stdout [--filter=<filter-spec>] | <base-name>]
-[--shallow] [--keep-true-parents] [--[no-]sparse]
-[--name-hash-version=<n>] < <object-list>
+[--no-reuse-delta] [--delta-base-offset] [--non-empty]
+[--local] [--incremental] [--window=<n>] [--depth=<n>]
+[--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
+[--cruft] [--cruft-expiration=<time>]
+[--stdout [--filter=<filter-spec>] | <base-name>]
+[--shallow] [--keep-true-parents] [--[no-]sparse]
+[--name-hash-version=<n>] [--path-walk] < <object-list>
 
 
 DESCRIPTION
@@ -375,6 +375,18 @@ many different directories. At the moment, this version is not allowed
 when writing reachability bitmap files with `--write-bitmap-index` and it
 will be automatically changed to version `1`.
 
+--path-walk::
+By default, `git pack-objects` walks objects in an order that
+presents trees and blobs in an order unrelated to the path they
+appear relative to a commit's root tree. The `--path-walk` option
+enables a different walking algorithm that organizes trees and
+blobs by path. This has the potential to improve delta compression
+especially in the presence of filenames that cause collisions in
+Git's default name-hash algorithm. Due to changing how the objects
+are walked, this option is not compatible with `--delta-islands`,
+`--shallow`, or `--filter`. The `--use-bitmap-index` option will
+be ignored in the presence of `--path-walk`.
+
 
 DELTA ISLANDS
 -------------
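
To make the collision problem described for `--path-walk` concrete, the following Python sketch groups a handful of blob versions two ways: by a name hash and by full path. The `name_hash` function is an assumption, modelled on a "last characters count most" scheme of the kind Git's default name-hash is usually described as using; it is not taken from this commit. The paths and object names are hypothetical. This is a toy illustration of why grouping by path can keep unrelated files apart, not a description of how `git pack-objects` is implemented.

```python
from collections import defaultdict


def name_hash(path: str) -> int:
    """Toy 32-bit name hash (an assumed scheme, not Git's source code)."""
    h = 0
    for byte in path.encode("utf-8"):
        if byte in b" \t\n\r\v\f":
            continue  # whitespace does not contribute to the hash
        # The newest byte lands in the top bits; older bytes decay by 2 bits.
        h = ((h >> 2) + (byte << 24)) & 0xFFFFFFFF
    return h


# Hypothetical history: two versions each of two files whose paths differ
# only in a directory name far from the end of the path.
objects = [
    ("docs/en/CHANGELOG-summary.json", "blob-en-v1"),
    ("docs/fr/CHANGELOG-summary.json", "blob-fr-v1"),
    ("docs/en/CHANGELOG-summary.json", "blob-en-v2"),
    ("docs/fr/CHANGELOG-summary.json", "blob-fr-v2"),
]

by_hash = defaultdict(list)  # what a hash-sorted delta search would see
by_path = defaultdict(list)  # what a path-grouped walk would see
for path, oid in objects:
    by_hash[name_hash(path)].append(oid)
    by_path[path].append(oid)

print(len(by_hash), "hash bucket(s):", dict(by_hash))
print(len(by_path), "path group(s):", dict(by_path))
```

With these inputs both paths fall into a single hash bucket, so versions of the two files would compete for the same delta window, while grouping by path yields one group per file.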

Documentation/git-repack.adoc

Lines changed: 13 additions & 1 deletion
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]
 [--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]
-[--write-midx] [--name-hash-version=<n>]
+[--write-midx] [--name-hash-version=<n>] [--path-walk]
 
 DESCRIPTION
 -----------
@@ -258,6 +258,18 @@ linkgit:git-multi-pack-index[1]).
 Provide this argument to the underlying `git pack-objects` process.
 See linkgit:git-pack-objects[1] for full details.
 
+--path-walk::
+This option passes the `--path-walk` option to the underlying
+`git pack-objects` process (see linkgit:git-pack-objects[1]).
+By default, `git pack-objects` walks objects in an order that
+presents trees and blobs in an order unrelated to the path they
+appear relative to a commit's root tree. The `--path-walk` option
+enables a different walking algorithm that organizes trees and
+blobs by path. This has the potential to improve delta compression
+especially in the presence of filenames that cause collisions in
+Git's default name-hash algorithm. Due to changing how the objects
+are walked, this option is not compatible with `--delta-islands`
+or `--filter`.
 
 CONFIGURATION
 -------------

Documentation/technical/api-path-walk.adoc

Lines changed: 9 additions & 0 deletions
@@ -56,6 +56,14 @@ better off using the revision walk API instead.
 the revision walk so that the walk emits commits marked with the
 `UNINTERESTING` flag.
 
+`edge_aggressive`::
+For performance reasons, usually only the boundary commits are
+explored to find UNINTERESTING objects. However, in the case of
+shallow clones it can be helpful to mark all trees and blobs
+reachable from UNINTERESTING tip commits as UNINTERESTING. This
+matches the behavior of `--objects-edge-aggressive` in the
+revision API.
+
 `pl`::
 This pattern list pointer allows focusing the path-walk search to
 a set of patterns, only emitting paths that match the given
@@ -69,4 +77,5 @@ Examples
 
 See example usages in:
 `t/helper/test-path-walk.c`,
+`builtin/pack-objects.c`,
 `builtin/backfill.c`
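
The marking that `edge_aggressive` describes can be pictured with a small Python model. Everything here is hypothetical: the dictionaries standing in for trees, the object names, and the `mark_reachable` helper are for illustration only and do not correspond to the structures used by the path-walk API. The sketch marks every tree and blob reachable from an UNINTERESTING tip's root tree, then lists which objects of a newer commit would still need to be sent.

```python
def mark_reachable(tree_id, trees, marked):
    """Add tree_id and every tree and blob below it to the marked set."""
    if tree_id in marked:
        return
    marked.add(tree_id)
    for kind, child in trees[tree_id]:
        if kind == "tree":
            mark_reachable(child, trees, marked)
        else:
            marked.add(child)


# Hypothetical object graph: tree id -> list of (kind, child id).
trees = {
    "T_old": [("blob", "B_readme"), ("tree", "T_src")],
    "T_src": [("blob", "B_main")],
    "T_new": [("blob", "B_readme"), ("tree", "T_src2")],
    "T_src2": [("blob", "B_main"), ("blob", "B_util")],
}
root_of = {"old_tip": "T_old", "new_tip": "T_new"}

# Aggressive marking: everything under the UNINTERESTING tip is excluded.
uninteresting = set()
mark_reachable(root_of["old_tip"], trees, uninteresting)

# Objects reachable from the new tip that the other side does not have.
wanted = set()
mark_reachable(root_of["new_tip"], trees, wanted)
to_send = sorted(wanted - uninteresting)
print(to_send)
```

In this model the unchanged blobs `B_readme` and `B_main` are excluded because they were marked from the old tip, leaving only the new trees and the new blob.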
