Rebase to v2.20.1 #95

Merged
derrickstolee merged 93 commits into microsoft:vfs-2.20.1 from dscho:vfs-2.20.1
Dec 17, 2018

@dscho
Member

@dscho dscho commented Dec 15, 2018

This is the relatively straightforward outcome of `git rebase -i --rebase-merges=rebase-cousins v2.20.1.windows.1`.

Ben Peart and others added 30 commits December 15, 2018 19:36
This is just a bug fix to git so that the pager won't close stdin/out
before other atexit functions run. The easy way to repro the bug is to
turn on GIT_TRACE_PERFORMANCE and run a command that runs the pager. Then
notice you don't get your performance data at the end. With this fix, you
do actually get the performance trace data.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
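
The ordering issue behind this fix can be illustrated with a small sketch. C's atexit() runs handlers in reverse registration order, and Python's atexit module behaves the same way, so it is used here as a stand-in. The two handler names and their registration order are assumptions for illustration only, not Git's actual handlers.

```python
import subprocess
import sys

# Child program: registers two exit handlers. The names are hypothetical
# stand-ins for "write trace data" and "pager closes output".
CHILD = r'''
import atexit
atexit.register(lambda: print("trace: performance data"))
atexit.register(lambda: print("pager: closing output"))
'''


def exit_handler_order():
    """Run the child and return the lines its exit handlers printed, in order."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # The handler registered last runs first.
    print(exit_handler_order())
```

In this model the handler registered last runs first, so a late-registered handler that closes the output stream would run before an earlier-registered handler that still wants to write to it.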
When using the reset --stdin feature on Windows, an added path may have
a \r at the end that was not being removed, so the path did not match
the path in the index and was not reset.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Saeed Noursalehi <sanoursa@microsoft.com>
Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
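
As a rough illustration of the trimming described above (not the C code in the patch), the sketch below strips a trailing carriage return from each path read from a line-oriented input. The function name `read_paths` is hypothetical.

```python
def read_paths(lines):
    """Return paths from an iterable of lines, dropping the line ending.

    A trailing "\r" (left over from CRLF input) is removed so the path
    can match an index entry exactly.
    """
    paths = []
    for line in lines:
        path = line.rstrip("\n")
        if path.endswith("\r"):
            path = path[:-1]
        if path:  # skip blank lines
            paths.append(path)
    return paths


if __name__ == "__main__":
    print(read_paths(["a.txt\r\n", "dir/b.txt\n"]))
```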
This header file will accumulate GVFS-specific definitions.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
This does not do anything yet. The next patches will add various values
for that config setting that correspond to the various features
offered/required by GVFS.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
This takes a substantial amount of time, and if the user is reasonably
sure that the files' integrity is not compromised, that time can be saved.

Git no longer verifies the SHA-1 by default, anyway.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Prevent the sparse checkout from deleting files that were marked with
the skip-worktree bit and are not in the sparse-checkout file.

This is because everything with the skip-worktree bit turned on is being
virtualized and will be removed with the change of HEAD.

Only one test failed when running with these changes: the one checking
that the worktree narrows on checkout. That failure was expected, since
we would no longer be narrowing the worktree.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
While performing a fetch with a virtual file system we know that there
will be missing objects and we don't want to download them just because
of the reachability of the commits.  We also don't want to download a
pack file with commits, trees, and blobs since these will be downloaded
on demand.

This flag will skip the first connectivity check and by returning zero
will skip the upload pack. It will also skip the second connectivity
check but continue to update the branches to the latest commit ids.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
The two existing members of the run_hook*() family, run_hook_ve() and
run_hook_le(), are good for callers that know the precise number of
parameters already. Let's introduce a new sibling that takes an argv
array for callers that want to pass a variable number of parameters.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Ensure all filters and EOL conversions are blocked when running under
GVFS so that our projected file sizes will match the actual file size
when it is hydrated on the local machine.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
The idea is to allow blob objects to be missing from the local repository,
and to load them lazily on demand.

After discussing this idea on the mailing list, we will rename the feature
to "lazy clone" and work more on this.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
This adds a hard-coded call to GVFS.hooks.exe before and after each Git
command runs.

To make sure that this is only called on repositories cloned with GVFS, we
test for the tell-tale .gvfs.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Hydrate missing loose objects in check_and_freshen() when running
virtualized. Add test cases to verify read-object hook works when
running virtualized.

This hook is called in check_and_freshen() rather than
check_and_freshen_local() to make the hook work also with alternates.

Helped-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
The use case here is to allow usage statistics to be gathered by
running hooks before and after every hook, and to make that
configurable via hooks.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…ng objects

This commit converts the existing read_object hook proc model for
downloading missing blobs to use a background process that is started
the first time git encounters a missing blob and stays running until git
exits.  Git and the read-object process communicate via stdin/stdout and
a versioned, capability negotiated interface as documented in
Documentation/technical/read-object-protocol.txt.  The advantage of this
over the previous hook proc is that it saves the overhead of spawning a
new hook process for every missing blob.

The model for the background process was refactored from the recent git
LFS work.  I refactored that code into a shared module (sub-process.c/h)
and then updated convert.c to consume the new library.  I then used the
same sub-process module when implementing the read-object background
process.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
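
The benefit of a long-running helper over one process per missing blob can be sketched as follows. This is only a model: the line-based "ok <id>" exchange is invented for illustration and is not the protocol documented in Documentation/technical/read-object-protocol.txt, and the `ObjectFetcher` class is hypothetical.

```python
import subprocess
import sys

# Stand-in helper: answers each requested id with "ok <id>" until stdin closes.
CHILD = r'''
import sys
for line in sys.stdin:
    oid = line.strip()
    if not oid:
        break
    sys.stdout.write("ok " + oid + "\n")
    sys.stdout.flush()
'''


class ObjectFetcher:
    """Starts the helper on first use and reuses it for later requests."""

    def __init__(self):
        self.proc = None
        self.spawn_count = 0

    def _ensure_started(self):
        if self.proc is None:
            self.proc = subprocess.Popen(
                [sys.executable, "-c", CHILD],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
            )
            self.spawn_count += 1

    def fetch(self, oid):
        self._ensure_started()
        self.proc.stdin.write(oid + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        if self.proc is not None:
            self.proc.stdin.close()  # end of input tells the helper to exit
            self.proc.wait()
            self.proc.stdout.close()
            self.proc = None
```

However many objects are requested, `spawn_count` stays at 1, which is the saving the commit message describes.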
GVFS Git introduced pre-command and post-command hooks, to gather usage
statistics and to be able to adjust the worktree if necessary.

As run_hooks() implicitly calls setup_git_directory(), and that
function does surprising things to the global state (sometimes even
changing the current working directory), it cannot be used here.

This commit introduces the pre-command/post-command hooks, based on
the previous patches that culminate in support for running hooks early,
i.e. before setup_git_directory() was called.

Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
If we are going to write an object there is no use in calling
the read object hook to get an object from a potentially remote
source.  We would rather just write out the object and avoid the
potential round trip for an object that doesn't exist.

This change adds a flag to the check_and_freshen() and
freshen_loose_object() functions' signatures so that the hook
is bypassed when the functions are called before writing loose
objects. The check for a local object is still performed so we
don't overwrite something that has already been written to one
of the objects directories.

Based on a patch by Kevin Willford.

Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
Suggested by Ben Peart.

Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
When using the sparse-checkout feature, the file might not be on disk
because the skip-worktree bit is on.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Alejandro Pauly <alpauly@microsoft.com>
When using the sparse-checkout feature git should not write to the working
directory for files with the skip-worktree bit on.  With the skip-worktree
bit on the file may or may not be in the working directory and if it is
not we don't want or need to create it by calling checkout_entry.

There are two callers of checkout_target, both of which check that the
file does not exist before calling it: load_current, which makes a call
to lstat right before calling checkout_target, and check_preimage, which
will only run checkout_target if stat_ret is less than zero.  It sets
stat_ret to zero, and only if !stat->cached will it lstat the file and
set stat_ret to something other than zero.

This patch checks if skip-worktree bit is on in checkout_target and just
returns so that the entry does not end up in the working directory.
This is so that apply will not create a file in the working directory,
then update the index but not keep the working directory up to date with
the changes that happened in the index.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
We need to respect that config setting even if we already know that we
have a repository, but have not yet read the config.

The regression test was written by Alejandro Pauly.

Signed-off-by: Johannes Schindelin <johasc@microsoft.com>
When using the sparse checkout feature the git reset command will add
entries to the index that will have the skip-worktree bit off but will
leave the working directory empty.  File data is lost because the index
version of the files has been changed but there is nothing that is in
the working directory.  This will cause the next status call to show
either deleted for files that were modified or deleted, or nothing for
files that were added.  The added files should be shown as untracked and
modified files should be shown as modified.

To fix this, while the reset is running, if there is no file in the
working directory and it will be missing with the new index entry or
was not missing in the previous version, we create the previous index
version of the file in the working directory so that status will report
correctly and the files will be available for the user to deal with.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
jeffhostetler and others added 26 commits December 15, 2018 19:36
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 events when reading and writing the index.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 region and data events describing attempts to deserialize
status data using a status cache.

A category:status, label:deserialize region is pushed around the
deserialize code.

Deserialization results when reading from a file are:
    category:status, path   = <path>
    category:status, polled = <number_of_attempts>
    category:status, result = "ok" | "reject"

Results when reading from STDIN are:
    category:status, path   = "STDIN"
    category:status, result = "ok" | "reject"

Status will fall back and run a normal status scan when a "reject"
is reported (unless "--deserialize-wait=fail").  If "ok" is reported,
status was able to use the status cache and avoid scanning the workdir.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
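
The decision described above can be summarized in a small sketch. The function name and return labels are hypothetical, and what status does in the "--deserialize-wait=fail" case is not spelled out in the message, so the "no-fallback" label is an assumption that only records that the normal scan is not attempted.

```python
def next_step(result, deserialize_wait=None):
    """Map a deserialization result to what status does next (a model only)."""
    if result == "ok":
        return "use-cache"      # status cache used, workdir scan avoided
    if result == "reject":
        if deserialize_wait == "fail":
            return "no-fallback"  # assumption: no normal scan in this mode
        return "normal-scan"    # fall back to a regular status scan
    raise ValueError("unexpected result: %r" % (result,))
```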
Add check to see if a directory is included in the virtualfilesystem
before checking the directory hashmap.  This allows a directory entry
like foo/ to find all untracked files in subdirectories.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
…uded

Add check to see if a directory is included in the virtualfilesystem
before checking the directory hashmap. This allows a directory entry
like foo/ to find all untracked files in subdirectories.
When studying the performance of 'git push' we would like to know
how much time is spent at various parts of the command. One area
that could cause performance trouble is 'git pack-objects'.

Add trace2 regions around the three main actions taken in this
command:

1. Enumerate objects.
2. Prepare pack.
3. Write pack-file.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
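
The idea of wrapping each phase in a timed region can be sketched as below. This is not the trace2 API; the `region` helper, the event tuples, and the region labels are hypothetical names chosen to mirror the three actions listed above.

```python
import time
from contextlib import contextmanager


@contextmanager
def region(log, category, label):
    """Record enter/leave events, with elapsed time, around a block of work."""
    start = time.perf_counter()
    log.append(("region_enter", category, label))
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        log.append(("region_leave", category, label, elapsed))


def pack_objects_sketch(log):
    """Stand-in for the three phases; the real work is omitted."""
    with region(log, "pack-objects", "enumerate-objects"):
        pass
    with region(log, "pack-objects", "prepare-pack"):
        pass
    with region(log, "pack-objects", "write-pack-file"):
        pass
```

Reading the leave events in order gives one elapsed time per phase, which is the breakdown the commit is after.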
(Experimental) Trace2 base plus GVFS extensions
William Baker reported that the non-built-in rebase and stash fail to
run the post-command hook (which is important for VFS for Git, though).

The reason is that an `exec()` will replace the current process by the
newly-exec'ed one (our Windows-specific emulation cannot do that, and
does not even try, so this is only an issue on Linux/macOS). As a
consequence, not even the atexit() handlers are run, including the
one running the post-command hook.

To work around that, let's spawn the legacy rebase/stash and exit with
the reported exit code.
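
The workaround pattern (spawn, wait, then exit with the child's code) can be sketched as follows. The helper name `run_child` is hypothetical; the point is that the parent process keeps running after the child finishes, so its own exit handlers still get a chance to run, whereas an exec-style call replaces the process image.

```python
import subprocess
import sys


def run_child(argv):
    """Spawn a child, wait for it, and return its exit code.

    Unlike an exec-style replacement of the current process, the parent
    survives and can run its exit handlers before exiting itself.
    """
    return subprocess.run(argv).returncode


if __name__ == "__main__":
    # Propagate the child's exit code as our own.
    sys.exit(run_child([sys.executable, "-c", "import sys; sys.exit(3)"]))
```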
We want to make `git push` faster, but we need to know where the time is going!

There are likely four places where the time is going:

1. The info/refs call and force-update checking at the beginning.
2. The `git pack-objects` call that creates a pack-file to send to the server.
3. Sending the data to the server.
4. Waiting for the server to verify the pack-file.

This PR adds `trace2_region_` calls inside `git pack-objects` so we can track the time in item (2). The rest could be interpreted from the start and end time of the entire command after we know this region. The server-side verification is something we can track using server telemetry.
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.

Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
…tash`

William Baker reported that the non-built-in rebase and stash fail to
run the post-command hook (which is important for VFS for Git, though).

The reason is that an `exec()` will replace the current process by the
newly-exec'ed one (our Windows-specific emulation cannot do that, and
does not even try, so this is only an issue on Linux/macOS). As a
consequence, not even the atexit() handlers are run, including the
one running the post-command hook.

To work around that, let's spawn the legacy rebase/stash and exit with
the reported exit code.
This includes commits that fixup!-revert all the midx-related commits from our GVFS branch and replaces them with the exact commits that are being merged upstream. This should automatically remove the commits during our next version rebase-and-merge action.

Changes upstream:
- The builtin is called 'git multi-pack-index'.
- The command-line takes a 'write' verb and an '--object-dir' parameter.
- We no longer have a 'midx-head' or '*.midx' files.
- Instead, we have a 'multi-pack-index' file in the pack-dir.
- It no longer makes sense to specify '--update-head'
…en GVFS_MISSING_OK set

send-pack: do not check for sha1 file when GVFS_MISSING_OK set
The vfs does not correctly handle the case when there is a file
that begins with the same prefix as a directory. For example, the
following setup would encounter this issue:

    A directory contains a file named `dir1.sln` and a directory
    named `dir1/`.

    The directory `dir1` contains other files.

    The directory `dir1` is in the virtual file system list

The contents of `dir1` should be in the virtual file system, but
it is not. The contents of this directory do not have the skip
worktree bit cleared as expected. The problem is in the
`apply_virtualfilesystem(...)` function where it does not include
the trailing slash of the directory name when looking up the
position in the index to start clearing the skip worktree bit.

The fix is to include the trailing slash when finding the first
index entry from `index_name_pos(...)`.
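
A small model shows why the trailing slash matters. It assumes the index is a list of paths sorted bytewise and that the lookup returns the insertion position of the key, after which entries are taken while they start with `dir1/`. This is a sketch of the symptom, not the code in `apply_virtualfilesystem(...)`; the function name and flag are hypothetical.

```python
from bisect import bisect_left


def entries_under(index_paths, directory, include_trailing_slash):
    """Collect index entries below `directory`, starting at the lookup position."""
    prefix = directory + "/"
    key = prefix if include_trailing_slash else directory
    pos = bisect_left(index_paths, key)
    found = []
    # Walk forward only while entries are inside the directory.
    while pos < len(index_paths) and index_paths[pos].startswith(prefix):
        found.append(index_paths[pos])
        pos += 1
    return found


# "." sorts before "/", so dir1.sln comes before everything in dir1/.
INDEX = sorted(["dir1.sln", "dir1/a.txt", "dir1/b.txt", "dir2/c.txt"])
```

Looking up `dir1` lands on `dir1.sln`, which is not inside the directory, so the walk stops immediately. Looking up `dir1/` lands on the first entry inside the directory.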
…he post-indexchanged hook

In the reset --mixed code path, the index is created from scratch from the
given commit by the call to read_from_tree().  Since this is the code that
actually modifies the index, make sure we set the the_index.updated_skipworktree
flag which is passed to the post-indexchanged hook.

Updated the post-index-changed-hook test script to pass the --quiet flag
so that we can prevent future regressions.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
…irectories

virtualfilesystem: fix case where directories not handled correctly
The following commands and options are not currently supported when working
in a GVFS repo.  Add code to detect and block these commands from executing.

1) fsck
2) gc
4) prune
5) repack
6) submodule
8) update-index --split-index
9) update-index --index-version (other than 4)
10) update-index --[no-]skip-worktree
11) worktree

Signed-off-by: Ben Peart <benpeart@microsoft.com>
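
The list above can be read as a simple predicate over the command line. The sketch below is one hypothetical way to express it and is not the detection code added by the commit; `is_blocked` and `BLOCKED_COMMANDS` are illustrative names, and `argv` is assumed to hold the Git subcommand followed by its arguments.

```python
# Commands blocked outright, per the list above.
BLOCKED_COMMANDS = {"fsck", "gc", "prune", "repack", "submodule", "worktree"}

# update-index options blocked regardless of their arguments.
BLOCKED_UPDATE_INDEX_FLAGS = {"--split-index", "--skip-worktree", "--no-skip-worktree"}


def is_blocked(argv):
    """Return True if the command (subcommand plus args) is on the blocked list."""
    if not argv:
        return False
    command, args = argv[0], argv[1:]
    if command in BLOCKED_COMMANDS:
        return True
    if command != "update-index":
        return False
    for i, arg in enumerate(args):
        if arg in BLOCKED_UPDATE_INDEX_FLAGS:
            return True
        if arg == "--index-version":
            # Only index version 4 is allowed.
            version = args[i + 1] if i + 1 < len(args) else None
            if version != "4":
                return True
    return False
```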
…nged

update the reset --quiet path codepath to pass the correct flags to the post-indexchanged hook
gvfs: block unsupported commands when running in a GVFS repo
@dscho
Member Author

dscho commented Dec 15, 2018

Range-diff relative to vfs-2.20.0:

 1:  c7989fe22da6 =  1:  3f588065b018 pager: fix order of atexit() calls
 2:  d18382fd16f5 =  2:  ac04954b7e92 reset --stdin: trim carriage return from the paths
 3:  962c4867291b !  3:  1166b213a26b gvfs: start by adding the -gvfs suffix to the version
    @@ -11,8 +11,8 @@
      #!/bin/sh
      
      GVF=GIT-VERSION-FILE
    --DEF_VER=v2.20.0
    -+DEF_VER=v2.20.0.vfs.1.1
    +-DEF_VER=v2.20.1
    ++DEF_VER=v2.20.1.vfs.1.1
      
      LF='
      '
 8:  ca75317c60fb =  4:  26eb0e399f03 gvfs: ensure that the version is based on a GVFS tag
 9:  186c9daf9835 =  5:  1aade282063a gvfs: add a GVFS-specific header file
10:  a6e94ddaabc8 =  6:  788870a5cb1d gvfs: add the core.gvfs config setting
11:  4ac26d8a8c73 =  7:  44122be121d2 gvfs: add the feature to skip writing the index' SHA-1
12:  09be583ef9b6 =  8:  8383f5c4f450 gvfs: add the feature that blobs may be missing
13:  ac7b2450daad =  9:  11da98050f2b gvfs: prevent files to be deleted outside the sparse checkout
14:  7b0b2e098c97 = 10:  332ed29c07ba gvfs: optionally skip reachability checks/upload pack during fetch
15:  333e48fbc2b0 = 11:  627bb8b010b2 gvfs: ensure all filters and EOL conversions are blocked
16:  6a0e1ba1eb14 = 12:  3dd70448a3a2 Add a new run_hook_argv() function
17:  940d6b0fe25e = 13:  9cd4a6d00d1e gvfs: allow "virtualizing" objects
18:  ec72be572dc4 = 14:  d6eaacef8eb5 Hydrate missing loose objects in check_and_freshen()
19:  ce8f580f9a8c = 15:  1e9f9ed75523 Add support for read-object as a background process to retrieve missing objects
20:  56d891cfec8d = 16:  cec3c33636bd sha1_file: when writing objects, skip the read_object_hook
21:  e2d9dadf6af5 = 17:  4356dd15dcfa gvfs: add global command pre and post hook procs
22:  28d6fcf48752 = 18:  688c31c2802e Allow hooks to be run before setup_git_directory()
23:  88245cca7cd7 = 19:  bc483c5775b2 gvfs: introduce pre/post command hooks
24:  c60a6f5b2aa1 = 20:  cb57e9f7a0d6 t0400: verify that the hook is called correctly from a subdirectory
 4:  dbe47e5e568a = 21:  d28bbdcee00d sparse-checkout: update files with a modify/delete conflict
25:  9dafe0790755 = 22:  d3c7dc7ffd77 Pass PID of git process to hooks.
 5:  2f031327e556 = 23:  a42b69aa25db sparse-checkout: avoid writing entries with the skip-worktree bit
26:  5ef0cf2d6238 = 24:  1e08be2d1e64 pre-command: always respect core.hooksPath
 6:  f7203f827e97 = 25:  5d2a5fc58adb Fix reset when using the sparse-checkout feature.
27:  fee970c73e6f = 26:  b30acf9736f6 Do not remove files outside the sparse-checkout
28:  c84dffd078af = 27:  662357421d07 gvfs: refactor loading the core.gvfs config value
29:  060001ed6acf = 28:  092882f322ea cache-tree: remove use of strbuf_addf in update_one
30:  bba9922caf9d = 29:  fae0da404ac1 status: add status serialization mechanism
31:  0814630182fb = 30:  0f9a27e205f1 status: add status.aheadbehind setting
32:  5c2879ff0e4e = 31:  9b162e190720 Teach ahead-behind and serialized status to play nicely together
33:  2d8923380e93 = 32:  dfb20164c834 status: add warning when a/b calculation takes too long for long/normal format
34:  ed7391f1392a = 33:  ffc0fb8ebfec status: ignore status.aheadbehind in porcelain formats
35:  e0e928f1e4c9 = 34:  a74a553794ea status: serialize to path
36:  9680d7f8c45a = 35:  2ab3909560df status: reject deserialize in V2 and conflicts
37:  e6b90e1dabb6 = 36:  933f6b028474 fetch: Add --[no-]show-forced-updates argument
38:  32d774f2762f = 37:  62dea1a397aa fetch: Warn about forced updates after branch list
39:  c452ae433233 = 38:  bbc5e9e76cb2 push: add --[no-]show-forced-updates passthrough to fetch
40:  c6feda7c2d0e = 39:  3e24efdd7c4b fetch: add documentation for --[no-]show-forced-updates
41:  c2ea8925a4ff = 40:  322557ac7686 Add virtual file system settings and hook proc
42:  df015a59b50f = 41:  2a5ff6be12f8 Update the virtualfilesystem support
43:  fa16bc0d11b0 = 42:  5de80715d569 commit: add generation to pop_most_recent_commit()
44:  52c31f58a233 = 43:  d994cd9ff8d1 status: fix rename reporting when using serialization cache
45:  8a9cfa1a45ca = 44:  c3ee8f99eb03 status: add comments for ahead_behind_flags in serialization
46:  35fae6efbbe1 = 45:  c483b6d52224 serialize-status: serialize global and repo-local exclude file metadata
47:  b8a09063abb8 = 46:  2aa6acc349a3 status: deserialization wait
48:  c6ed96ee972b = 47:  dfe5be1a53a6 virtualfilesystem: don't run the virtual file system hook if the index has been redirected
49:  578ca434c834 = 48:  c3e50d87c177 virtualfilesystem: fix bug with symlinks being ignored
50:  769724e81e10 = 49:  1354d7301ec3 trace2: create new combined trace facility
51:  a9f708e6a335 = 50:  00bac7b6f678 trace2: add trace2 tracing of major regions in wt-status
52:  e45bc60429a7 = 51:  95597bd7fc7d trace2: classify some child processes
53:  f142a158156e = 52:  d83778f2ee71 trace2: add child classification for transport processes
54:  3af982fe700b = 53:  918f51b86667 trace2: instrument reading and writing the index
55:  99ada13b304c = 54:  c7f9cb3355cb gvfs:trace2: add region/data events for status deserialization
56:  f10c0d72cbd5 = 55:  ff3b38819ea9 virtualfilesystem: check if directory is included
57:  3a564faa4c88 = 56:  e15d3e9795a2 gvfs:trace2: add trace2 tracing around read_object_process
58:  d8b62846e732 = 57:  d5c1e3b80b88 pack-objects: add trace2 regions
59:  7709e5ec9a74 = 58:  81f397c01669 rebase/stash: make post-command hook work again
 7:  7f99db266846 = 59:  39250a8760e2 fsck: use ERROR_MULTI_PACK_INDEX
60:  f7f0dbc485e4 = 60:  04cd4198b9f7 read-cache: add post-indexchanged hook
61:  6f11aded8dd2 = 61:  7454daddc4f3 read-cache: post-indexchanged hook add skip-worktree bit changing support
62:  984275e58bb7 = 62:  5e370406b955 read-cache: add test for post-indexchanged hook
63:  4e8852fe68e6 = 63:  c9ceb5a5a5eb send-pack: do not check for sha1 file when GVFS_MISSING_OK set
64:  886eddb6ce80 = 64:  66887e1c0359 Add documentation for the post-indexchanged hook
65:  d5866d6f8ba9 = 65:  a7cfd2af8ebf vfs: fix case where directories not handled correctly
66:  4a22502a318a <  -:  ------------ .gitattributes: ensure t/oid-info/* has eol=lf
69:  ef8f6c654885 = 66:  c566104f6e1f update the reset --quiet path codepath to pass the correct flags to the post-indexchanged hook
67:  319b60d4f7b9 = 67:  1d8453eec4f9 gvfs: block unsupported commands when running in a GVFS repo
68:  4275b8a5812b <  -:  ------------ t4256: mark support files as LF-only
70:  e61d6952c702 <  -:  ------------ rebase: teach `reset_head()` to optionally skip the worktree
71:  6c0d9e5685b8 <  -:  ------------ fixup! builtin rebase: call `git am` directly

(The dropped commits are expected: two made it into v2.20.1, and the other two made it into v2.20.1.windows.1)
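
For readers who want to pull the dropped commits out of a summary like the one above, here is a minimal parsing sketch. It assumes the usual range-diff status characters (`=` unchanged, `!` changed, `<` only in the old range, `>` only in the new range); the function names are hypothetical, and indented diff lines are simply skipped.

```python
import re

# One summary line: "<old-pos>: <old-id> <status> <new-pos>: <new-id> <subject>"
SUMMARY_LINE = re.compile(
    r"^\s*(\d+|-):\s+([0-9a-f]+|-+)\s+([=!<>])\s+(\d+|-):\s+([0-9a-f]+|-+)\s+(.*)$"
)


def parse_summary_line(line):
    """Return a dict for a range-diff summary line, or None for other lines."""
    match = SUMMARY_LINE.match(line)
    if match is None:
        return None
    old_pos, old_id, status, new_pos, new_id, subject = match.groups()
    return {
        "old_pos": old_pos,
        "old_id": old_id,
        "status": status,
        "new_pos": new_pos,
        "new_id": new_id,
        "subject": subject,
    }


def dropped_subjects(lines):
    """Subjects of commits present only in the old range (status '<')."""
    entries = (parse_summary_line(line) for line in lines)
    return [e["subject"] for e in entries if e and e["status"] == "<"]
```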


@derrickstolee derrickstolee left a comment


Sorry that I didn't see this before I merged #88. I'll replay that PR against this new branch.

Thanks, @dscho!

@derrickstolee derrickstolee merged commit d5f2450 into microsoft:vfs-2.20.1 Dec 17, 2018
@dscho dscho deleted the vfs-2.20.1 branch February 8, 2019 15:36

9 participants