forked from git-for-windows/git
[DO NOT MERGE] Tentative vfs-2.25.1 branch #248
Merged
When testing the sparse-checkout feature, we need to compare the contents of the working directory against some expected output. Using here-docs was useful in the beginning, but became repetitive as the test script grew. Create a check_files helper to make the tests simpler and easier to extend. It also reduces instances of bad here-doc whitespace. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
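As a rough illustration of the idea, a helper along these lines could replace a here-doc per assertion. This is a minimal sketch, not the helper from the commit: the body, the use of `ls`, and the plain string comparison are assumptions made so the example runs outside Git's test harness.

```shell
# Hypothetical stand-in for a check_files-style helper.
# Usage: check_files <dir> <expected-name>...
check_files () {
    dir=$1
    shift
    # One expected name per line, in the order given by the caller.
    expect=$(printf '%s\n' "$@")
    # ls prints one sorted name per line when not writing to a terminal,
    # and it skips dotfiles such as .git.
    actual=$(ls "$dir")
    test "$expect" = "$actual"
}
```

A test could then assert the directory contents in one line, for example `check_files repo a folder1 folder2`, with the names listed in sorted order.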
t1091-sparse-checkout-builtin.sh uses here-docs to populate the expected contents of the sparse-checkout file. These do not use shell interpolation, so use "-\EOF" instead of "-EOF". Also use proper tabbing. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
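The effect of quoting the here-doc delimiter can be seen in a small sketch. The variable name and pattern value are made up for illustration. The leading dash in "-\EOF" additionally strips leading tabs from the body, which this example leaves out so that it does not depend on tab characters surviving copy and paste.

```shell
pattern='/*'

# Unquoted delimiter: the shell expands $pattern inside the body.
expanded=$(cat <<EOF
$pattern
EOF
)

# Quoted delimiter: the body is taken literally, so the text
# "$pattern" reaches cat exactly as typed.
literal=$(cat <<\EOF
$pattern
EOF
)

printf '%s\n' "$expanded" "$literal"
```

Since sparse-checkout patterns contain characters such as `*` and `!` but never need variable expansion, the quoted form is the safer default for expected-output files.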
The 'git init' command creates the ".git/info" directory and fills it with some default files. However, 'git worktree add' does not create the info directory for that worktree. This causes a problem when running "git sparse-checkout init" inside a worktree. While care was taken to allow the sparse-checkout config to be specific to a worktree, this initialization was untested. Safely create the leading directories for the sparse-checkout file. This is the safest thing to do even without worktrees, as a user could delete their ".git/info" directory and expect Git to recover safely. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
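The same defensive step can be sketched in shell: create the parent directories before writing the file, so a missing "info" directory is not an error. The helper name and the example path are hypothetical; the actual fix lives in Git's C code.

```shell
# Hypothetical helper: write pattern lines to a file, creating any
# missing leading directories first.
# Usage: write_patterns <file> <pattern>...
write_patterns () {
    file=$1
    shift
    mkdir -p "$(dirname "$file")" &&
    printf '%s\n' "$@" >"$file"
}
```

Because `mkdir -p` succeeds when the directory already exists, the helper behaves the same in a fresh worktree and in a repository whose ".git/info" directory was deleted by hand.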
The --sparse option was added to the clone builtin in d89f09c (clone: add --sparse mode, 2019-11-21) and was tested with a local path clone in t1091-sparse-checkout-builtin.sh. However, due to a difference in how local paths are handled versus URLs, this mechanism does not work with URLs. Modify the test to use a "file://" URL, which would output this error before the code change: Cloning into 'clone'... fatal: cannot change to 'file://.../repo': No such file or directory error: failed to initialize sparse-checkout These errors are due to using a "-C <path>" option to call 'git -C <path> sparse-checkout init' but the URL is being given instead of the target directory. Update that target directory to evaluate this correctly. I have also manually tested that https:// URLs are handled correctly as well. Acked-by: Taylor Blau <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
Signed-off-by: Jeff King <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set' command creates a restricted set of possible patterns that are used by a custom algorithm to quickly match those patterns. If a user manually edits the sparse-checkout file, then they could create patterns that do not match these expectations. The cone-mode matching algorithm can return incorrect results. The solution is to detect these incorrect patterns, warn that we do not recognize them, and revert to the standard algorithm. Check each pattern for the "**" substring, and revert to the old logic if seen. While technically a "/<dir>/**" pattern matches the meaning of "/<dir>/", it is not one that would be written by the sparse-checkout builtin in cone mode. Attempting to accept that pattern change complicates the logic and instead we punt and do not accept any instance of "**". Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
In cone mode, the shortest pattern the sparse-checkout command will write into the sparse-checkout file is "/*". This is handled carefully in add_pattern_to_hashsets(), so warn if any other pattern is this short. This will assist future pattern checks by allowing us to assume there are at least three characters in the pattern. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
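The two checks described so far (reject any "**", and reject any pattern other than "/*" that is shorter than three characters) can be sketched as a shell predicate. The function name is hypothetical and the sketch deliberately ignores every other rule that cone mode enforces; it only shows the shape of these two tests.

```shell
# Hypothetical predicate: succeed if a pattern passes the "**" check
# and the minimum-length check, fail otherwise.
is_cone_candidate () {
    pat=$1
    # "/*" is the one short pattern that cone mode itself writes.
    test "$pat" = '/*' && return 0
    # Any other pattern shorter than three characters is rejected.
    test "${#pat}" -ge 3 || return 1
    # A "**" anywhere means the file was not written by cone mode.
    case $pat in
    *'**'*) return 1 ;;
    esac
    return 0
}
```

For example, `is_cone_candidate '/deep/'` succeeds, while `is_cone_candidate '/deep/**'` and `is_cone_candidate 'a'` fail, which is where the real code would warn and fall back to the standard matching algorithm.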
In cone mode, the sparse-checkout command will write patterns that allow faster pattern matching. This matching only works if the patterns in the sparse-checkout file are those written by that command. Users can edit the sparse-checkout file and create patterns that cause the cone mode matching to fail. The cone mode patterns may end in "/*" but otherwise an un-escaped asterisk or other glob character is invalid. Add checks to disable cone mode when seeing these values. A later change will properly handle escaped globs. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
In cone mode, the sparse-checkout feature uses hashset containment queries to match paths. Make this algorithm respect escaped asterisk (*) and backslash (\) characters. Create dup_and_filter_pattern() method to convert a pattern by removing escape characters and dropping an optional "/*" at the end. This method is available in dir.h as we will use it in builtin/sparse-checkout.c in a later change. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
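The transformation described for dup_and_filter_pattern() (drop an optional trailing "/*", then remove escape characters) can be approximated with sed. This is a sketch of the described behavior under those two rules only, not a port of the C function, and the shell function name is made up.

```shell
# Hypothetical shell analogue of the filtering step.
# Usage: filter_pattern <pattern>
filter_pattern () {
    # First expression: drop a literal "/*" at the end of the pattern.
    # Second expression: replace each backslash-escaped character with
    # the character itself.
    printf '%s\n' "$1" | sed -e 's|/\*$||' -e 's/\\\(.\)/\1/g'
}
```

With this sketch, `filter_pattern '/deep/*'` prints `/deep`, and `filter_pattern '/a\*b/'` prints `/a*b/`, which is the unescaped form that would be stored for hashset lookups.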
If a user somehow creates a directory with an asterisk (*) or backslash (\), then the "git sparse-checkout set" command will struggle to provide the correct pattern in the sparse-checkout file. When not in cone mode, the provided pattern is written directly into the sparse-checkout file. However, in cone mode we expect a list of paths to directories and then we convert those into patterns. However, there is some care needed for the timing of these escapes. The in-memory pattern list is used to update the working directory before writing the patterns to disk. Thus, we need the command to have the unescaped names in the hashsets for the cone comparisons, then escape the patterns later. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
If a user somehow creates a directory with an asterisk (*) or backslash (\), then the "git sparse-checkout set" command will struggle to provide the correct pattern in the sparse-checkout file. When not in cone mode, the provided pattern is written directly into the sparse-checkout file. However, in cone mode we expect a list of paths to directories and then we convert those into patterns. Even more specifically, the goal is to always allow the following from the root of a repo: git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin The ls-tree command provides directory names with an unescaped asterisk. It also quotes the directories that contain an escaped backslash. We must remove these quotes, then keep the escaped backslashes. Use unquote_c_style() when parsing lines from stdin. Command-line arguments will be parsed as-is, assuming the user can do the correct level of escaping from their environment to match the exact directory names. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When in cone mode, the 'git sparse-checkout list' subcommand lists the directories included in the sparse cone. When these directories contain odd characters, such as a backslash, then we need to use C-style quotes similar to 'git ls-tree'. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
The sparse-checkout patterns allow special globs according to fnmatch(3). When writing cone-mode patterns for paths containing these characters, they must be escaped. Use is_glob_special() to check which characters must be escaped this way, and add a path to the tests that contains all glob characters at once. Note that ']' is not special, since the initial bracket '[' is escaped. Reported-by: Jeff King <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
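The escaping direction can be sketched the same way. The set of characters below (backslash, asterisk, question mark, opening bracket) is an assumption about what is_glob_special() covers, chosen to agree with the note that ']' needs no escape; the function name is hypothetical.

```shell
# Hypothetical escaper: prefix each assumed glob-special character
# (\ * ? [) with a backslash. The closing bracket is left alone.
# Usage: escape_glob <directory-name>
escape_glob () {
    printf '%s\n' "$1" | sed 's/[\\*?[]/\\&/g'
}
```

For instance, `escape_glob 'x?[y]\z'` prints `x\?\[y]\\z`: the opening bracket is escaped, so the closing one can no longer end a character class.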
The existing documentation does not clarify how the 'set' subcommand changes when core.sparseCheckoutCone is enabled. Correct this by changing some language around the "A/B/C" example. Also include a description of the input format matching the output of 'git ls-tree --name-only'. Helped-by: Jeff King <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
The intention of the special "cone mode" in the sparse-checkout feature is to always match the same patterns that are matched by the same sparse-checkout file as when cone mode is disabled. When a file path is given to "git sparse-checkout set" in cone mode, then the cone mode improperly matches the file as a recursive path. When setting the skip-worktree bits, files were not expecting the MATCHED_RECURSIVE response, and hence these were left out of the matched cone. Fix this bug by checking for MATCHED_RECURSIVE in addition to MATCHED and add a test that prevents regression. Reported-by: Finn Bryant <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
In anticipation of extending the sparse-checkout builtin with "add" and "remove" subcommands, extract the code that fills a pattern list based on the input values. The input changes depending on the presence of "--stdin" or the value of core.sparseCheckoutCone. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
In anticipation of adding "add" and "remove" subcommands to the sparse-checkout builtin, extract a modify_pattern_list() method from the sparse_checkout_set() method. This command will read input from the command-line or stdin to construct a set of patterns, then modify the existing sparse-checkout patterns after a successful update of the working directory. Currently, the only way to modify the patterns is to replace all of the patterns. This will be extended in a later update. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When using the sparse-checkout feature, a user may want to incrementally grow their sparse-checkout pattern set. Allow adding patterns using a new 'add' subcommand. This is not much different from the 'set' subcommand, because we still want to allow the '--stdin' option and interpret inputs as directories when in cone mode and patterns otherwise. When in cone mode, we are growing the cone. This may actually reduce the set of patterns when adding directory A when A/B is already a directory in the cone. Test the different cases: siblings, parents, ancestors. When not in cone mode, we can only assume the patterns should be appended to the sparse-checkout file. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When using Windows, a user may run 'git sparse-checkout set A\B\C' to add the Unix-style path A/B/C to their sparse-checkout patterns. Normalizing the input path converts the backslashes to slashes before we add the string 'A/B/C' to the recursive hashset. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
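The normalization itself is a character-for-character substitution, as this small sketch shows. The function name is hypothetical, and the sketch does nothing beyond swapping separators.

```shell
# Hypothetical helper: turn Windows-style separators into slashes.
# Usage: normalize_path <path>
normalize_path () {
    printf '%s\n' "$1" | tr '\\' '/'
}
```

Here `normalize_path 'A\B\C'` prints `A/B/C`, the form that is then added to the recursive hashset.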
While using the reset --stdin feature on Windows, a path that was added could have a \r at the end that was not being removed, so it did not match the path in the index and was not reset. Signed-off-by: Kevin Willford <[email protected]>
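The kind of cleanup involved can be sketched as a filter that removes one trailing carriage return from each line read on stdin. The function name is hypothetical, and this is only an illustration of the idea in shell, not the C change itself.

```shell
# Hypothetical filter: strip a single trailing CR from every input line
# so that CRLF-terminated paths compare equal to index paths.
strip_cr () {
    cr=$(printf '\r')
    while IFS= read -r line
    do
        printf '%s\n' "${line%"$cr"}"
    done
}
```

Piping `printf 'a/b\r\nc\n'` through `strip_cr` yields the two lines `a/b` and `c`, with no stray `\r` left on the first path.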
Signed-off-by: Saeed Noursalehi <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
This header file will accumulate GVFS-specific definitions. Signed-off-by: Kevin Willford <[email protected]>
This does not do anything yet. The next patches will add various values for that config setting that correspond to the various features offered/required by GVFS. Signed-off-by: Kevin Willford <[email protected]>
This takes a substantial amount of time, and if the user is reasonably sure that the files' integrity is not compromised, that time can be saved. Git no longer verifies the SHA-1 by default, anyway. Signed-off-by: Kevin Willford <[email protected]>
Signed-off-by: Kevin Willford <[email protected]>
Prevent the sparse checkout from deleting files that were marked with the skip-worktree bit and are not in the sparse-checkout file. This is because everything with the skip-worktree bit turned on is being virtualized and will be removed with the change of HEAD. There was only one failing test when running with these changes: the one that checks that the worktree narrows on checkout. That failure was expected, since we would no longer be narrowing the worktree. Signed-off-by: Kevin Willford <[email protected]>
While performing a fetch with a virtual file system we know that there will be missing objects and we don't want to download them just because of the reachability of the commits. We also don't want to download a pack file with commits, trees, and blobs since these will be downloaded on demand. This flag will skip the first connectivity check and by returning zero will skip the upload pack. It will also skip the second connectivity check but continue to update the branches to the latest commit ids. Signed-off-by: Kevin Willford <[email protected]>
The two existing members of the run_hook*() family, run_hook_ve() and run_hook_le(), are good for callers that know the precise number of parameters already. Let's introduce a new sibling that takes an argv array for callers that want to pass a variable number of parameters. Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Jeff Hostetler <[email protected]>
The fsmonitor script that can be used for running all the git tests using watchman was causing some of the tests to fail because it wrote to stderr and created some files for debugging purposes. Add a new debug script to use with debugging and modify the other script to remove the code that would cause tests to fail. Signed-off-by: Kevin Willford <[email protected]>
test-gvfs-protocol, t5799: tests for gvfs-helper
Signed-off-by: Derrick Stolee <[email protected]>
fsmonitor updates for improved performance
Signed-off-by: Derrick Stolee <[email protected]>
The gvfs-helper allows us to download prefetch packs using a simple subprocess call. The gvfs-helper-client.h method will automatically compute the timestamp if passing 0, and passing NULL for the number of downloaded packs is valid. Signed-off-by: Derrick Stolee <[email protected]>
This replaces #223. There was a strangely-subtle issue about reading the trailing hash from the downloaded packs that caused issues when reading from the origin remote. Add `gvfs-helper prefetch` command line option and `objects.prefetch` mode in `gvfs-helper server`. Sorry, but this contains a major refactor of the packfile and loose file handling to let me share it with the prefetch code. As a side benefit, I collapsed the tempfile creation before the request goes out and merged the install_ code after the result is returned. I also changed packfile code to use the packfile-checksum rather than a timestamp so that we look more like normal Git. More details are in the commit message.
Teach gvfs-helper to better support the concurrent fetching of the same packfile by multiple instances. If 2 instances of gvfs-helper did a POST and requested the same set of OIDs, they might receive the exact same packfile (same checksum SHA). Both processes would then race to install their copy of the .pack and .idx files into the ODB/pack directory. This is not a problem on Unix (because of filesystem semantics). On Windows, this can cause an EBUSY/EPERM problem for the loser while the winner is holding a handle to the target files. (The existing packfile code already handled the simple existence and/or replacement case.) The solution presented here is to silently let the loser claim victory if and only if the .pack and .idx are already present in the ODB. (We can't check this in advance because we don't know the packfile SHA checksum until after we receive it and run index-pack.) We avoid using a per-packfile lockfile (or a single lockfile for the `vfs-` prefix) to avoid the usual issues with stale lockfiles. Signed-off-by: Jeff Hostetler <[email protected]>
This is a follow-up to #227. 1. When a new flag is added to our Git config, we can run `gvfs-helper prefetch` inside of our `git fetch` calls. This will help ensure we have updated commits and trees even if the background prefetches have fallen behind (or are not running). 2. With a new `--no-update-remote-refs` we can avoid updating the `refs/remotes` namespace. This will allow us to run `git fetch --all --no-update-remote-refs +refs/heads/*:refs/hidden/*` and we will get the new refs into a local folder (that doesn't appear anywhere). The most important thing is that users will still see when their remote refs update.
When we create temp files for downloading packs, we use a name based on the current timestamp. There is no randomness in the name, so we can have collisions in the same second. Retry the temp pack names using a new "-<retry>" suffix to the name before the ".temp". Signed-off-by: Derrick Stolee <[email protected]>
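The retry scheme can be sketched in shell as follows. Everything concrete here is an assumption for illustration: the function name, the `vfs-<timestamp>` base name, the retry limit of 5, and the use of the shell's noclobber option to make creation fail when the name is taken. The commit itself only states that a "-<retry>" suffix is added before ".temp".

```shell
# Hypothetical sketch: create a uniquely named temp file for a pack
# download, retrying with a "-<retry>" suffix on collision.
# Usage: create_temp_pack <dir> <timestamp>
create_temp_pack () {
    dir=$1
    stamp=$2
    name="$dir/vfs-$stamp.temp"
    retry=0
    # With noclobber (set -C), ">" refuses to overwrite an existing
    # file, so creation and the existence check happen in one step.
    until (set -C; : >"$name") 2>/dev/null
    do
        retry=$((retry + 1))
        # Assumed limit so the loop cannot spin forever.
        test "$retry" -le 5 || return 1
        name="$dir/vfs-$stamp-$retry.temp"
    done
    printf '%s\n' "$name"
}
```

Two calls with the same timestamp then produce `vfs-<timestamp>.temp` and `vfs-<timestamp>-1.temp` rather than colliding on one name.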
gvfs-helper: better support for concurrent packfile fetches
When we create temp files for downloading packs, we use a name based on the current timestamp. There is no randomness in the name, so we can have collisions in the same second. Retry the temp pack names using a new "-<retry>" suffix to the name before the ".temp". This is a follow-up to #229.
When using the GVFS protocol, we should _never_ call "git fetch-pack" to attempt downloading a pack-file via the regular Git protocol. It appears that the mechanism that prevented this in the VFS for Git world is due to the read-object hook populating the commits at the new ref tips in a different way than the gvfs-helper does. By acting as if the fetch-pack succeeds here in remote-curl, we prevent a failed fetch. Signed-off-by: Derrick Stolee <[email protected]>
This reverts commit cff4e91. This is temporary until we fix this behavior upstream. For now, we need to allow the sparse-checkout command to run when the status is not clean. Signed-off-by: Derrick Stolee <[email protected]>
If a user runs git update-git-for-windows, then they will upgrade to a version that does not support microsoft/vfsforgit or microsoft/scalar. Therefore, let's prevent this. This addresses #241
…g gvfs helper The `gvfs-helper` is supposed to avoid calling `git fetch-pack` by downloading objects through the GVFS protocol instead. For some reason, some `git fetch` calls still end up calling `git fetch-pack` which gets a complaint from the remote because it does not support that kind of fetch. Put a hard stop in the `fetch_git()` method to prevent this process run.
Disable `git update-git-for-windows`
b316462 to e3794d5
Signed-off-by: Derrick Stolee <[email protected]>
e3794d5 to 9934fcf
When computing changed-path Bloom filters or performing a name-only diff, we do not need the blob contents before completing the diff values. Thus, we do not need to download a pack containing the blobs we do not have on-disk before completing our diff calculation. This prevents downloading every blob in a partial clone during "git log --raw" and "git diff --name-only" commands. Signed-off-by: Derrick Stolee <[email protected]>
/azp run microsoft.git
Azure Pipelines successfully started running 1 pipeline(s).
derrickstolee added a commit to microsoft/scalar that referenced this pull request Feb 21, 2020
derrickstolee added a commit to microsoft/VFSForGit that referenced this pull request Feb 21, 2020
Here is a rebase of `vfs-2.25.0` onto `v2.25.1.windows.1`. There were only a few issues along the way. I dropped our reverts of some `dir.c` commits that we took in the `v2.25.0` update because they have been fixed (supposedly) in `v2.25.1`. VFS for Git and Scalar tests should demonstrate if these are truly fixed.

Range diff: