Skip to content

Commit 4dd71ee

Browse files
dschomjcheetham
authored andcommitted
pack-objects: allow --path-walk with --shallow (#699)
This pull request aims to correct a pretty big issue when dealing with UNINTERESTING objects in the path-walk API. They somehow were only exposed when trying to perform a push from a shallow clone. This will require rewriting the upstream version so this is avoided from the start, but we can do a forward fix for now. The key issue is that the path-walk API was not walking UNINTERESTING trees at the right time, and the way it was being done was more complicated than it needed to be. This changes some of the way the path-walk API works in the presence of UNINTERSTING commits, but these are good changes to make. I had briefly attempted to remove the use of the `edge_aggressive` option in `struct path_walk_info` in favor of using the `--objects-edge-aggressive` option in the revision struct. When I started down that road, though, I somehow got myself into a bind of things not working correctly. I backed out to this version that is working with our test cases. I tested this using the thin and big pack tests in `p5313` which had the same performance as before this change. The new change is that in a shallow clone we can get the same `git push` improvements. I was hung up on testing this for a long time as I wasn't getting the same results in my shallow clone as in my regular clones. It turns out that I had forgotten to use `--no-reuse-delta` in my test command, so it was picking the deltas that were given by the initial clone instead of picking new ones per the algorithm. 🤦🏻
2 parents 91d1242 + 5a56826 commit 4dd71ee

File tree

5 files changed

+56
-6
lines changed

5 files changed

+56
-6
lines changed

Documentation/technical/api-path-walk.adoc

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,14 @@ better off using the revision walk API instead.
5656
the revision walk so that the walk emits commits marked with the
5757
`UNINTERESTING` flag.
5858

59+
`edge_aggressive`::
60+
For performance reasons, usually only the boundary commits are
61+
explored to find UNINTERESTING objects. However, in the case of
62+
shallow clones it can be helpful to mark all trees and blobs
63+
reachable from UNINTERESTING tip commits as UNINTERESTING. This
64+
matches the behavior of `--objects-edge-aggressive` in the
65+
revision API.
66+
5967
`pl`::
6068
This pattern list pointer allows focusing the path-walk search to
6169
a set of patterns, only emitting paths that match the given

builtin/pack-objects.c

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -203,6 +203,7 @@ static int keep_unreachable, unpack_unreachable, include_tag;
203203
static timestamp_t unpack_unreachable_expiration;
204204
static int pack_loose_unreachable;
205205
static int cruft;
206+
static int shallow = 0;
206207
static timestamp_t cruft_expiration;
207208
static int local;
208209
static int have_non_local_packs;
@@ -4477,6 +4478,7 @@ static void get_object_list_path_walk(struct rev_info *revs)
44774478
* base objects.
44784479
*/
44794480
info.prune_all_uninteresting = sparse;
4481+
info.edge_aggressive = shallow;
44804482

44814483
if (walk_objects_by_path(&info))
44824484
die(_("failed to pack objects via path-walk"));
@@ -4678,7 +4680,6 @@ int cmd_pack_objects(int argc,
46784680
struct repository *repo UNUSED)
46794681
{
46804682
int use_internal_rev_list = 0;
4681-
int shallow = 0;
46824683
int all_progress_implied = 0;
46834684
struct strvec rp = STRVEC_INIT;
46844685
int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
@@ -4866,10 +4867,6 @@ int cmd_pack_objects(int argc,
48664867
warning(_("cannot use delta islands with --path-walk"));
48674868
path_walk = 0;
48684869
}
4869-
if (path_walk && shallow) {
4870-
warning(_("cannot use --shallow with --path-walk"));
4871-
path_walk = 0;
4872-
}
48734870
if (path_walk) {
48744871
strvec_push(&rp, "--boundary");
48754872
/*

path-walk.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -503,7 +503,11 @@ int walk_objects_by_path(struct path_walk_info *info)
503503
if (prepare_revision_walk(info->revs))
504504
die(_("failed to setup revision walk"));
505505

506-
/* Walk trees to mark them as UNINTERESTING. */
506+
/*
507+
* Walk trees to mark them as UNINTERESTING.
508+
* This is particularly important when 'edge_aggressive' is set.
509+
*/
510+
info->revs->edge_hint_aggressive = info->edge_aggressive;
507511
edge_repo = info->revs->repo;
508512
edge_tree_list = root_tree_list;
509513
mark_edges_uninteresting(info->revs, show_edge,

path-walk.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,13 @@ struct path_walk_info {
5050
*/
5151
int prune_all_uninteresting;
5252

53+
/**
54+
* When 'edge_aggressive' is set, then the revision walk will use
55+
* the '--object-edge-aggressive' option to mark even more objects
56+
* as uninteresting.
57+
*/
58+
int edge_aggressive;
59+
5360
/**
5461
* Specify a sparse-checkout definition to match our paths to. Do not
5562
* walk outside of this sparse definition. If the patterns are in

t/t5538-push-shallow.sh

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -123,4 +123,38 @@ EOF
123123
git cat-file blob $(echo 1|git hash-object --stdin) >/dev/null
124124
)
125125
'
126+
127+
test_expect_success 'push new commit from shallow clone has correct object count' '
128+
git init origin &&
129+
test_commit -C origin a &&
130+
test_commit -C origin b &&
131+
132+
git clone --depth=1 "file://$(pwd)/origin" client &&
133+
git -C client checkout -b topic &&
134+
git -C client commit --allow-empty -m "empty" &&
135+
GIT_PROGRESS_DELAY=0 git -C client push --progress origin topic 2>err &&
136+
test_grep "Enumerating objects: 1, done." err
137+
'
138+
139+
test_expect_success 'push new commit from shallow clone has good deltas' '
140+
git init base &&
141+
test_seq 1 999 >base/a &&
142+
test_commit -C base initial &&
143+
git -C base add a &&
144+
git -C base commit -m "big a" &&
145+
146+
git clone --depth=1 "file://$(pwd)/base" deltas &&
147+
git -C deltas checkout -b deltas &&
148+
test_seq 1 1000 >deltas/a &&
149+
git -C deltas commit -a -m "bigger a" &&
150+
GIT_TRACE2_PERF="$(pwd)/trace.txt" \
151+
GIT_PROGRESS_DELAY=0 git -C deltas push --progress origin deltas 2>err &&
152+
153+
test_grep "Enumerating objects: 5, done" err &&
154+
155+
# If the delta base is found, then this message uses "bytes".
156+
# If the delta base is not found, then this message uses "KiB".
157+
test_grep "Writing objects: .* bytes" err
158+
'
159+
126160
test_done

0 commit comments

Comments
 (0)