forked from git/git
-
Notifications
You must be signed in to change notification settings - Fork 4
[RFC] Multi-pack Index (MIDX) #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add a technical documentation file describing the design for the multi-pack index (MIDX). Includes current limitations and future work.
A multi-pack-index (MIDX) file indexes the objects in multiple packfiles in a single pack directory. After a simple fixed-size header, the version 1 file format uses chunks to specify different regions of the data that correspond to different types of data, including: - List of OIDs in lex-order - A fanout table into the OID list - List of packfile names (relative to pack directory) - List of object metadata - Large offsets (if needed) By adding extra optional chunks, we can easily extend this format without invalidating written v1 files. One value in the header corresponds to a number of "base MIDX files" and will always be zero until the value is used in a later patch. We considered using a hashtable format instead of an ordered list of objects to reduce the O(log N) lookups to O(1) lookups, but two main issues arose that lead us to abandon the idea: - Extra space required to ensure collision counts were low. - We need to identify the two lexicographically closest OIDs for fast abbreviations. Binary search allows this. The current solution presents multiple packfiles as if they were packed into a single packfile with one pack-index.
As the multi-pack-index feature is being developed, we use a config setting 'core.midx' to enable all use of MIDX files. Since MIDX files are designed as a way to augment the existing data stores in Git, turning this setting off will revert to previous behavior without needing to downgrade. This can also be a repo- specific setting if the MIDX is misbehaving in only one repo.
The write_midx_file() method takes a list of packfiles and indexed objects with offset information and writes according to the format in Documentation/technical/pack-format.txt. The chunks are separated into methods.
Create, document, and implement the first ability of the midx builtin. The --write subcommand creates a multi-pack-index for all indexed packfiles within a given pack directory. If none is provided, the objects/pack directory is implied. The arguments allow specifying the pack directory so we can add MIDX files to alternates. The packfiles are expected to be paired with pack-indexes and are otherwise ignored. This simplifies the implementation and also keeps compatibility with older versions of Git (or changing core.midx to false).
Test interactions between the midx builtin and other Git operations. Use both a full repo and a bare repo to ensure the pack directory redirection works correctly.
There may be multiple MIDX files in a single pack directory. The primary file is pointed to by a pointer file "midx-head" that contains an OID. The MIDX file to load is then given by "midx-<OID>.midx". This head file will be especially important when the MIDX files are extended to be incremental and we expect multiple MIDX files at any point.
Add a "--read" subcommand to the midx builtin to report summary information on the head MIDX file or a MIDX file specified by the supplied "--midx-id" parameter. This subcommand is used by t5318-midx.sh to verify the indexed objects are as expected.
The MIDX file stores pack offset information for a list of objects. The nth_midxed_object_* methods provide ways to extract this information.
When writing a new MIDX file, it is faster to use an existing MIDX file to load the object list and pack offsets and to only inspect pack-indexes for packs not already covered by the MIDX file.
As a way to troubleshoot unforeseen problems with MIDX files, provide a way to delete "midx-head" and the MIDX it references.
As we write new MIDX files, the existing files are probably not needed. Supply the "--delete-expired" flag to remove these files during the "--write" sub- command.
Perform some basic read-only operations that load objects and find abbreviations. As this functionality begins to reference MIDX files, ensure the output matches when using MIDX files and when not using them.
Replace prepare_packed_git() with prepare_packed_git_internal(use_midx) to allow some consumers of prepare_packed_git() with a way to load MIDX files. Consumers should only use the new method if they are prepared to use the midxed_git struct alongside the packed_git struct. If a packfile is found that is not referenced by the current MIDX, then add it to the packed_git struct. This is important to keep the MIDX useful after adding packs due to "fetch" commands and when third-party tools (such as libgit2) add packs directly to the repo. If prepare_packed_git_internal is called with use_midx = 0, then unload the MIDX file and reload the packfiles in to the packed_git struct.
The MIDX files contain a complete object count, so we can report the number of objects in the MIDX. The count remains approximate as there may be overlap between the packfiles not referenced by the MIDX.
Using a binary search, we can navigate to the position n within a MIDX file where an object appears in the ordered list of objects.
Create unique_in_midx() to mimic behavior of unique_in_pack(). Create find_abbrev_len_for_midx() to mimic behavior of find_abbrev_len_for_pack(). Consume these methods when interacting with abbreviations.
When looking for a packed object, first check the MIDX for that object. This reduces thrashing in the MRU list of packfiles.
derrickstolee
pushed a commit
that referenced
this pull request
Apr 2, 2018
The function ce_write_entry() uses a 'self-initialised' variable construct, for the symbol 'saved_namelen', to suppress a gcc '-Wmaybe-uninitialized' warning, given that the warning is a false positive. For the purposes of this discussion, the ce_write_entry() function has three code blocks of interest, that look like so: /* block #1 */ if (ce->ce_flags & CE_STRIP_NAME) { saved_namelen = ce_namelen(ce); ce->ce_namelen = 0; } /* block #2 */ /* * several code blocks that contain, among others, calls * to copy_cache_entry_to_ondisk(ondisk, ce); */ /* block #3 */ if (ce->ce_flags & CE_STRIP_NAME) { ce->ce_namelen = saved_namelen; ce->ce_flags &= ~CE_STRIP_NAME; } The warning implies that gcc thinks it is possible that the first block is not entered, the calls to copy_cache_entry_to_ondisk() could toggle the CE_STRIP_NAME flag on, thereby entering block #3 with saved_namelen unset. However, the copy_cache_entry_to_ondisk() function does not write to ce->ce_flags (it only reads). gcc could easily determine this, since that function is local to this file, but it obviously doesn't. In order to suppress this warning, we make it clear to the reader (human and compiler), that block #3 will only be entered when the first block has been entered, by introducing a new 'stripped_name' boolean variable. We also take the opportunity to change the type of 'saved_namelen' to 'unsigned int' to match ce->ce_namelen. Signed-off-by: Ramsay Jones <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jul 9, 2018
Change "fetch" to treat "+" in refspecs (aka --force) to mean we should clobber a local tag of the same name. This changes the long-standing behavior of "fetch" added in 853a369 ("[PATCH] Multi-head fetch.", 2005-08-20), before this change all tag fetches effectively had --force enabled. The original rationale in that change was: > Tags need not be pointing at commits so there is no way to > guarantee "fast-forward" anyway. That comment and the rest of the history of "fetch" shows that the "+" (--force) part of refpecs was only conceived for branch updates, while tags have accepted any changes from upstream unconditionally and clobbered the local tag object. Changing this behavior has been discussed as early as 2011[1]. I the current behavior doesn't make sense, it easily results in local tags accidentally being clobbered. Ideally we'd namespace our tags per-remote, but as with my 97716d2 ("fetch: add a --prune-tags option and fetch.pruneTags config", 2018-02-09) it's easier to work around the current implementation than to fix the root cause, so this implements suggestion #1 from [1], "fetch" now only clobbers the tag if either "+" is provided as part of the refspec, or if "--force" is provided on the command-line. This also makes it nicely symmetrical with how "tag" itself works. We'll now refuse to clobber any existing tags unless "--force" is supplied, whether that clobbering would happen by clobbering a local tag with "tag", or by fetching it from the remote with "fetch". It's still not at all nicely symmetrical with how "git push" works, as discussed in the updated pull-fetch-param.txt documentation, but this change brings them more into line with one another. I don't think there's any reason "fetch" couldn't fully converge with the behavior used by "push", but that's a topic for another change. One of the tests added in 31b808a ("clone --single: limit the fetch refspec to fetched branch", 2012-09-20) is being changed to use --force where a clone would clobber a tag. This changes nothing about the existing behavior of the test. 1. https://public-inbox.org/git/[email protected]/ Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Aug 21, 2018
When "git rebase -i" is told to squash two or more commits into one, it labeled the log message for each commit with its number. It correctly called the first one "1st commit", but the next one was "commit #1", which was off-by-one. This has been corrected. * pw/rebase-i-squash-number-fix: rebase -i: fix numbering in squash message
derrickstolee
pushed a commit
that referenced
this pull request
Aug 30, 2018
The verbose output of the test 'reword without issues functions as intended' in 't3423-rebase-reword.sh', added in a9279c6 (sequencer: do not squash 'reword' commits when we hit conflicts, 2018-06-19), contains the following error output: sed: -e expression #1, char 2: extra characters after command This error comes from within the 'fake-editor.sh' script created by 'lib-rebase.sh's set_fake_editor() function, and the root cause is the FAKE_LINES="pick 1 reword 2" variable in the test in question, in particular the "pick" word. 'fake-editor.sh' assumes 'pick' to be the default rebase command and doesn't support an explicit 'pick' command in FAKE_LINES. As a result, 'pick' will be used instead of a line number when assembling the following 'sed' script: sed -n picks/^pick/pick/p which triggers the aforementioned error. Luckily, this didn't affect the test's correctness: the erroring 'sed' command doesn't write anything to the todo script, and processing the rest of FAKE_LINES generates the desired todo script, as if that 'pick' command were not there at all. The minimal fix would be to remove the 'pick' word from FAKE_LINES, but that would leave us susceptible to similar issues in the future. Instead, teach the fake-editor script to recognize an explicit 'pick' command, which is still a fairly trivial change. In the future we might want to consider reinforcing this fake editor script with an &&-chain and stricter parsing of the FAKE_LINES variable (e.g. to error out when encountering unknown rebase commands or commands and line numbers in the wrong order). Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Sep 11, 2018
When "git rebase -i" is told to squash two or more commits into one, it labeled the log message for each commit with its number. It correctly called the first one "1st commit", but the next one was "commit #1", which was off-by-one. This has been corrected. * pw/rebase-i-squash-number-fix: rebase -i: fix numbering in squash message
derrickstolee
pushed a commit
that referenced
this pull request
Sep 17, 2018
serialize-status: serialize global and repo-local exclude file metadata
derrickstolee
pushed a commit
that referenced
this pull request
Sep 18, 2018
Change "fetch" to treat "+" in refspecs (aka --force) to mean we should clobber a local tag of the same name. This changes the long-standing behavior of "fetch" added in 853a369 ("[PATCH] Multi-head fetch.", 2005-08-20). Before this change, all tag fetches effectively had --force enabled. See the git-fetch-script code in fast_forward_local() with the comment: > Tags need not be pointing at commits so there is no way to > guarantee "fast-forward" anyway. That commit and the rest of the history of "fetch" shows that the "+" (--force) part of refpecs was only conceived for branch updates, while tags have accepted any changes from upstream unconditionally and clobbered the local tag object. Changing this behavior has been discussed as early as 2011[1]. The current behavior doesn't make sense to me, it easily results in local tags accidentally being clobbered. We could namespace our tags per-remote and not locally populate refs/tags/*, but as with my 97716d2 ("fetch: add a --prune-tags option and fetch.pruneTags config", 2018-02-09) it's easier to work around the current implementation than to fix the root cause. So this change implements suggestion #1 from Jeff's 2011 E-Mail[1], "fetch" now only clobbers the tag if either "+" is provided as part of the refspec, or if "--force" is provided on the command-line. This also makes it nicely symmetrical with how "tag" itself works when creating tags. I.e. we refuse to clobber any existing tags unless "--force" is supplied. Now we can refuse all such clobbering, whether it would happen by clobbering a local tag with "tag", or by fetching it from the remote with "fetch". Ref updates outside refs/{tags,heads/* are still still not symmetrical with how "git push" works, as discussed in the recently changed pull-fetch-param.txt documentation. This change brings the two divergent behaviors more into line with one another. I don't think there's any reason "fetch" couldn't fully converge with the behavior used by "push", but that's a topic for another change. One of the tests added in 31b808a ("clone --single: limit the fetch refspec to fetched branch", 2012-09-20) is being changed to use --force where a clone would clobber a tag. This changes nothing about the existing behavior of the test. 1. https://public-inbox.org/git/[email protected]/ Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Oct 8, 2018
check_one_conflict() compares `i` to `active_nr` in two places to avoid buffer overruns, but left out an important third location. The code did used to have a check here comparing i to active_nr, back before commit fb70a06 ("rerere: fix an off-by-one non-bug", 2015-06-28), however the code at the time used an 'if' rather than a 'while' meaning back then that this loop could not have read past the end of the array, making the check unnecessary and it was removed. Unfortunately, in commit 5eda906 ("rerere: handle conflicts with multiple stage #1 entries", 2015-07-24), the 'if' was changed to a 'while' and the check comparing i and active_nr was not re-instated, leading to this problem. Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Oct 30, 2018
Reimplement oidset using khash.h in order to reduce its memory footprint and make it faster. Performance of a command that mainly checks for duplicate objects using an oidset, with master and Clang 6.0.1: $ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'" $ /usr/bin/time $cmd >/dev/null 0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k 0inputs+0outputs (0major+11204minor)pagefaults 0swaps $ hyperfine "$cmd" Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' Time (mean ± σ): 250.0 ms ± 6.0 ms [User: 225.9 ms, System: 23.6 ms] Range (min … max): 242.0 ms … 261.1 ms And with this patch: $ /usr/bin/time $cmd >/dev/null 0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k 0inputs+0outputs (0major+8318minor)pagefaults 0swaps $ hyperfine "$cmd" Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' Time (mean ± σ): 151.9 ms ± 4.9 ms [User: 130.5 ms, System: 21.2 ms] Range (min … max): 148.2 ms … 170.4 ms Initial-patch-by: Jeff King <[email protected]> Signed-off-by: Rene Scharfe <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Oct 30, 2018
Ever since the split index feature was introduced [1], refreshing a split index is prone to a variant of the classic racy git problem. There are a couple of unrelated tests in the test suite that occasionally fail when run with 'GIT_TEST_SPLIT_INDEX=yes', but 't1700-split-index.sh', the only test script focusing solely on split index, has never noticed this issue, because it only cares about how the index is split under various circumstances and all the different ways to turn the split index feature on and off. Add a dedicated test script 't1701-racy-split-index.sh' to exercise the split index feature in racy situations as well; kind of a "t0010-racy-git.sh for split index" but with modern style (the tests do everything in &&-chained list of commands in 'test_expect_...' blocks, and use 'test_cmp' for more informative output on failure). The tests cover the following sequences of index splitting, updating, and racy file modifications, with the last two cases demonstrating the racy split index problem: 1. Split the index while adding a racily clean file: echo "cached content" >file git update-index --split-index --add file echo "dirty worktree" >file # size stays the same This case already works properly. Even though the cache entry's stat data matches with the modifid file in the worktree, subsequent git commands will notice that the (split) index and the file have the same mtime, and then will go on to check the file's content and notice its dirtiness. 2. Add a racily clean file to an already split index: git update-index --split-index echo "cached content" >file git update-index --add file echo "dirty worktree" >file This case already works properly. After the second 'git update-index' writes the newly added file's cache entry to the new split index, it basically works in the same way as case #1. 3. Split the index when it (i.e. the not yet splitted index) contains a racily clean cache entry, i.e. an entry whose cached stat data matches with the corresponding file in the worktree and the cached mtime matches that of the index: echo "cached content" >file git update-index --add file echo "dirty worktree" >file # ... wait ... git update-index --split-index --add other-file This case already works properly. The shared index is written by do_write_index(), i.e. the same function that is responsible for writing "regular" and split indexes as well. This function cleverly notices the racily clean cache entry, and writes the entry to the new shared index with smudged stat data, i.e. file size set to 0. When subsequent git commands read the index, they will notice that the smudged stat data doesn't match with the file in the worktree, and then go on to check the file's content and notice its dirtiness. 4. Update the split index when it contains a racily clean cache entry: git update-index --split-index echo "cached content" >file git update-index --add file echo "dirty worktree" >file # ... wait ... git update-index --add other-file This case already works properly. After the second 'git update-index' the newly added file's cache entry is only stored in the split index. If a cache entry is present in the split index (even if it is a replacement of an outdated entry in the shared index), then it will always be included in the new split index on subsequent split index updates (until the file is removed or a new shared index is written), independently from whether the entry is racily clean or not. When do_write_index() writes the new split index, it notices the racily clean cache entry, and smudges its stat date. Subsequent git commands reading the index will notice the smudged stat data and then go on to check the file's content and notice its dirtiness. 5. Update the split index when a racily clean cache entry is stored only in the shared index: echo "cached content" >file git update-index --split-index --add file echo "dirty worktree" >file # ... wait ... git update-index --add other-file This case fails due to the racy split index problem. In the second 'git update-index' prepare_to_write_split_index() decides, among other things, which cache entries stored only in the shared index should be replaced in the new split index. Alas, this function never looks out for racily clean cache entries, and since the file's stat data in the worktree hasn't changed since the shared index was written, the entry won't be replaced in the new split index. Consequently, do_write_index() doesn't even get this racily clean cache entry, and can't smudge its stat data. Subsequent git commands will then see that the index has more recent mtime than the file and that the (not smudged) cached stat data still matches with the file in the worktree, and, ultimately, will erroneously consider the file clean. 6. Update the split index after unpack_trees() copied a racily clean cache entry from the shared index: echo "cached content" >file git update-index --split-index --add file echo "dirty worktree" >file # ... wait ... git read-tree -m HEAD This case fails due to the racy split index problem. This basically fails for the same reason as case #5 above, but there is one important difference, which warrants the dedicated test. While that second 'git update-index' in case #5 updates index_state in place, in this case 'git read-tree -m' calls unpack_trees(), which throws out the entire index, and constructs a new one from the (potentially updated) copies of the original's cache entries. Consequently, when prepare_to_write_split_index() gets to work on this reconstructed index, it takes a different code path than in case #5 when deciding which cache entries in the shared index should be replaced. The result is the same, though: the racily clean cache entry goes unnoticed, it isn't added to the split index with smudged stat data, and subsequent git commands will then erroneously consider the file clean. Note that in the last two 'test_expect_failure' cases I omitted the '#' (as in nr. of trial) from the tests' description on purpose for now, as it breakes the TAP output [2]; it will be added at the end of the series, when those two tests will be flipped to 'test_expect_success'. [1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge branch 'nd/split-index', 2014-07-16). [2] In the TAP output a '#' should separate the test's description from the TODO directive emitted by 'test_expect_failure'. The additional '#' in "#$trial" interferes with this, the test harness won't recognize the TODO directive, and will report that those tests failed unexpectedly. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Nov 1, 2018
serialize-status: serialize global and repo-local exclude file metadata
derrickstolee
pushed a commit
that referenced
this pull request
Apr 4, 2024
When "update-index --unresolve $path" cannot find the resolve-undo record for the path the user requested to unresolve, it stuffs the blobs from HEAD and MERGE_HEAD to stage #2 and stage #3 as a fallback. For this reason, the operation does not even start unless both "HEAD" and "MERGE_HEAD" exist. This is suboptimal in a few ways: * It does not recreate stage #1. Even though it is a correct design decision not to do so (because it is impossible to recreate in general cases, without knowing how we got there, including what merge strategy was used), it is much less useful not to have that information in the index. * It limits the "unresolve" operation only during a conflicted "git merge" and nothing else. Other operations like "rebase", "cherry-pick", and "switch -m" may result in conflicts, and the user may want to unresolve the conflict that they incorrectly resolved in order to redo the resolution, but the fallback would not kick in. * Most importantly, the entire "unresolve" operation is disabled after a conflicted merge is committed and MERGE_HEAD is removed, even though the index has perfectly usable resolve-undo records. By lazily reading the HEAD and MERGE_HEAD only when we need to go to the fallback codepath, we will allow cases where resolve-undo records are available (which is 100% of the time, unless the user is reading from an index file created by Git more than 10 years ago) to proceed even after a conflicted merge was committed, during other mergy operations that do not use MERGE_HEAD, or after the result of such mergy operations has been committed. Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 4, 2024
It is tempting to think of "files and directories" of the current directory as valid inputs to the add and set subcommands of git sparse-checkout. However, in non-cone mode, they often aren't and using them as potential completions leads to *many* forms of confusion: Issue #1. It provides the *wrong* files and directories. For git sparse-checkout add we always want to add files and directories not currently in our sparse checkout, which means we want file and directories not currently present in the current working tree. Providing the files and directories currently present is thus always wrong. For git sparse-checkout set we have a similar problem except in the subset of cases where we are trying to narrow our checkout to a strict subset of what we already have. That is not a very common scenario, especially since it often does not even happen to be true for the first use of the command; for years we required users to create a sparse-checkout via git sparse-checkout init git sparse-checkout set <args...> (or use a clone option that did the init step for you at clone time). The init command creates a minimal sparse-checkout with just the top-level directory present, meaning the set command has to be used to expand the checkout. Thus, only in a special and perhaps unusual cases would any of the suggestions from normal file and directory completion be appropriate. Issue #2: Suggesting patterns that lead to warnings is unfriendly. If the user specifies any regular file and omits the leading '/', then the sparse-checkout command will warn the user that their command is problematic and suggest they use a leading slash instead. Issue #3: Completion gets confused by leading '/', and provides wrong paths. Users often want to anchor their patterns to the toplevel of the repository, especially when listing individual files. There are a number of reasons for this, but notably even sparse-checkout encourages them to do so (as noted above). However, if users do so (via adding a leading '/' to their pattern), then bash completion will interpret the leading slash not as a request for a path at the toplevel of the repository, but as a request for a path at the root of the filesytem. That means at best that completion cannot help with such paths, and if it does find any completions, they are almost guaranteed to be wrong. Issue #4: Suggesting invalid patterns from subdirectories is unfriendly. There is no per-directory equivalent to .gitignore with sparse-checkouts. There is only a single worktree-global $GIT_DIR/info/sparse-checkout file. As such, paths to files must be specified relative to the toplevel of a repository. Providing suggestions of paths that are relative to the current working directory, as bash completion defaults to, is wrong when the current working directory is not the worktree toplevel directory. Issue #5: Paths with special characters will be interpreted incorrectly The entries in the sparse-checkout file are patterns, not paths. While most paths also qualify as patterns (though even in such cases it would be better for users to not use them directly but prefix them with a leading '/'), there are a variety of special characters that would need special escaping beyond the normal shell escaping: '*', '?', '\', '[', ']', and any leading '#' or '!'. If completion suggests any such paths, users will likely expect them to be treated as an exact path rather than as a pattern that might match some number of files other than 1. However, despite the first four issues, we can note that _if_ users are using tab completion, then they are probably trying to specify a path in the index. As such, we transform their argument into a top-level-rooted pattern that matches such a file. For example, if they type: git sparse-checkout add Make<TAB> we could "complete" to git sparse-checkout add /Makefile or, if they ran from the Documentation/technical/ subdirectory: git sparse-checkout add m<TAB> we could "complete" it to: git sparse-checkout add /Documentation/technical/multi-pack-index.txt Note in both cases I use "complete" in quotes, because we actually add characters both before and after the argument in question, so we are kind of abusing "bash completions" to be "bash completions AND beginnings". The fifth issue is a bit stickier, especially when you consider that we not only need to deal with escaping issues because of special meanings of patterns in sparse-checkout & gitignore files, but also that we need to consider escaping issues due to ls-files needing to sometimes quote or escape characters, and because the shell needs to escape some characters. The multiple interacting forms of escaping could get ugly; this patch makes no attempt to do so and simply documents that we decided to not deal with those corner cases for now but at least get the common cases right. Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 4, 2024
When reusing objects from a pack, we keep track of a set of one or more `reused_chunk`s, corresponding to sections of one or more object(s) from a source pack that we are reusing. Each chunk contains two pieces of information: - the offset of the first object in the source pack (relative to the beginning of the source pack) - the difference between that offset, and the corresponding offset in the pack we're generating The purpose of keeping track of these is so that we can patch an OFS_DELTAs that cross over a section of the reuse pack that we didn't take. For instance, consider a hypothetical pack as shown below: (chunk #2) __________... / / +--------+---------+-------------------+---------+ ... | <base> | <other> | (unused) | <delta> | ... +--------+---------+-------------------+---------+ \ / \______________/ (chunk #1) Suppose that we are sending objects "base", "other", and "delta", and that the "delta" object is stored as an OFS_DELTA, and that its base is "base". If we don't send any objects in the "(unused)" range, we can't copy the delta'd object directly, since its delta offset includes a range of the pack that we didn't copy, so we have to account for that difference when patching and reassembling the delta. In order to compute this value correctly, we need to know not only where we are in the packfile we're assembling (with `hashfile_total(f)`) but also the position of the first byte of the packfile that we are currently reusing. Currently, this works just fine, since when reusing only a single pack those two values are always identical (because verbatim reuse is the first thing pack-objects does when enabled after writing the pack header). But when reusing multiple packs which have one or more gaps, we'll need to account for these two values diverging. Together, these two allow us to compute the reused chunk's offset difference relative to the start of the reused pack, as desired. Helped-by: Jeff King <[email protected]> Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 4, 2024
The t5309 script triggers a racy false positive with SANITIZE=leak on a multi-core system. Running with "--stress --run=6" usually fails within 10 seconds or so for me, complaining with something like: + git index-pack --fix-thin --stdin fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?) ================================================================= ==3904583==ERROR: LeakSanitizer: detected memory leaks Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98 #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180 #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150 #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598 #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51 #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440 #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444 #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s). Aborted What happens is this: 0. We construct a bogus pack with a duplicate object in it and trigger index-pack. 1. We spawn a bunch of worker threads to resolve deltas (on my system it is 16 threads). 2. One of the threads sees the duplicate object and bails by calling exit(), taking down all of the threads. This is expected and is the point of the test. 3. At the time exit() is called, we may still be spawning threads from the main process via pthread_create(). LSan hooks thread creation to update its book-keeping; it has to know where each thread's stack is (so it can find entry points for reachable memory). So it calls pthread_getattr_np() to get information about the new thread. That may allocate memory that must be freed with a matching call to pthread_attr_destroy(). Probably LSan does that immediately, but if you're unlucky enough, the exit() will happen while it's between those two calls, and the allocated pthread_attr_t appears as a leak. This isn't a real leak. It's not even in our code, but rather in the LSan instrumentation code. So we could just ignore it. But the false positive can cause people to waste time tracking it down. It's possibly something that LSan could protect against (e.g., cover the getattr/destroy pair with a mutex, and then in the final post-exit() check for leaks try to take the same mutex). But I don't know enough about LSan to say if that's a reasonable approach or not (or if my analysis is even completely correct). In the meantime, it's pretty easy to avoid the race by making creation of the worker threads "atomic". That is, we'll spawn all of them before letting any of them start to work. That's easy to do because we already have a work_lock() mutex for handing out that work. If the main process takes it, then all of the threads will immediately block until we've finished spawning and released it. This shouldn't make any practical difference for non-LSan runs. The thread spawning is quick, and could happen before any worker thread gets scheduled anyway. Probably other spots that use threads are subject to the same issues. But since we have to manually insert locking (and since this really is kind of a hack), let's not bother with them unless somebody experiences a similar racy false-positive in practice. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 30, 2024
derrickstolee
pushed a commit
that referenced
this pull request
May 31, 2024
derrickstolee
pushed a commit
that referenced
this pull request
Jun 19, 2024
derrickstolee
pushed a commit
that referenced
this pull request
Jul 3, 2024
Commit d6a8c58 (midx-write.c: support reading an existing MIDX with `packs_to_include`, 2024-05-29) changed the MIDX generation machinery to support reading from an existing MIDX when writing a new one. Unfortunately, the rest of the MIDX generation machinery is not prepared to deal with such a change. For instance, the function responsible for adding to the object ID fanout table from a MIDX source (midx_fanout_add_midx_fanout()) will gladly add objects from an existing MIDX for some fanout level regardless of whether or not those objects came from packs that are to be included in the subsequent MIDX write. This results in broken pseudo-pack object order (leading to incorrect object traversal results) and segmentation faults, like so (generated by running the added test prior to the changes in midx-write.c): #0 0x000055ee31393f47 in midx_pack_order (ctx=0x7ffdde205c70) at midx-write.c:590 #1 0x000055ee31395a69 in write_midx_internal (object_dir=0x55ee32570440 ".git/objects", packs_to_include=0x7ffdde205e20, packs_to_drop=0x0, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15) at midx-write.c:1171 #2 0x000055ee31395f38 in write_midx_file_only (object_dir=0x55ee32570440 ".git/objects", packs_to_include=0x7ffdde205e20, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15) at midx-write.c:1274 [...] In stack frame #0, the code on midx-write.c:590 is using the new pack ID corresponding to some object which was added from the existing MIDX. Importantly, the pack from which that object was selected in the existing MIDX does not appear in the new MIDX as it was excluded via `--stdin-packs`. In this instance, the pack in question had pack ID "1" in the existing MIDX, but since it was excluded from the new MIDX, we never filled in that entry in the pack_perm table, resulting in: (gdb) p *ctx->pack_perm@2 $1 = {0, 1515870810} Which is what causes the segfault above when we try and read: struct pack_info *pack = &ctx->info[ctx->pack_perm[i]]; if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) pack->bitmap_pos = 0; Fundamentally, we should be able to read information from an existing MIDX when generating a new one. But in practice the midx-write.c code assumes that we won't run into issues like the above with incongruent pack IDs, and often makes those assumptions in extremely subtle and fragile ways. Instead, let's avoid reading from an existing MIDX altogether, and stick with the pre-d6a8c58675 implementation. Harden against any regressions in this area by adding a test which demonstrates these issues. Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jul 3, 2024
When performing multi-pack reuse, reuse_partial_packfile_from_bitmap() is responsible for generating an array of bitmapped_pack structs from which to perform reuse. In the multi-pack case, we loop over the MIDXs packs and copy the result of calling `nth_bitmapped_pack()` to construct the list of reusable paths. But we may also want to do pack-reuse over a single pack, either because we only had one pack to perform reuse over (in the case of single-pack bitmaps), or because we explicitly asked to do single pack reuse even with a MIDX[^1]. When this is the case, the array we generate of reusable packs contains only a single element, which is either (a) the pack attached to the single-pack bitmap, or (b) the MIDX's preferred pack. In 795006f (pack-bitmap: gracefully handle missing BTMP chunks, 2024-04-15), we refactored the reuse_partial_packfile_from_bitmap() function and stopped assigning the pack_int_id field when reusing only the MIDX's preferred pack. This results in an uninitialized read down in try_partial_reuse() like so: ==7474==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55c5cd191dde in try_partial_reuse pack-bitmap.c:1887:8 #1 0x55c5cd191dde in reuse_partial_packfile_from_bitmap_1 pack-bitmap.c:2001:8 #2 0x55c5cd191dde in reuse_partial_packfile_from_bitmap pack-bitmap.c:2105:3 #3 0x55c5cce0bd0e in get_object_list_from_bitmap builtin/pack-objects.c:4043:3 #4 0x55c5cce0bd0e in get_object_list builtin/pack-objects.c:4156:27 #5 0x55c5cce0bd0e in cmd_pack_objects builtin/pack-objects.c:4596:3 #6 0x55c5ccc8fac8 in run_builtin git.c:474:11 which happens when try_partial_reuse() tries to call midx_pair_to_pack_pos() when it tries to reject cross-pack deltas. Avoid the uninitialized read by ensuring that the pack_int_id field is set in the single-pack reuse case by setting it to either the MIDX preferred pack's pack_int_id, or '-1', in the case of single-pack bitmaps. In the latter case, we never read the pack_int_id field, so the choice of '-1' is intentional as a "garbage in, garbage out" measure. Guard against further regressions in this area by adding a test which ensures that we do not throw out deltas from the preferred pack as "cross-pack" due to an uninitialized pack_int_id. [^1]: This can happen for a couple of reasons, either because the repository is configured with 'pack.allowPackReuse=(true|single)', or because the MIDX was generated prior to the introduction of the BTMP chunk, which contains information necessary to perform multi-pack reuse. Reported-by: Kyle Lippincott <[email protected]> Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jul 3, 2024
Memory sanitizer (msan) is detecting a use of an uninitialized variable (`size`) in `read_attr_from_index`: ==2268==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5651f3416504 in read_attr_from_index git/attr.c:868:11 #1 0x5651f3415530 in read_attr git/attr.c #2 0x5651f3413d74 in bootstrap_attr_stack git/attr.c:968:6 #3 0x5651f3413d74 in prepare_attr_stack git/attr.c:1004:2 #4 0x5651f3413d74 in collect_some_attrs git/attr.c:1199:2 #5 0x5651f3413144 in git_check_attr git/attr.c:1345:2 #6 0x5651f34728da in convert_attrs git/convert.c:1320:2 #7 0x5651f3473425 in would_convert_to_git_filter_fd git/convert.c:1373:2 #8 0x5651f357a35e in index_fd git/object-file.c:2630:34 #9 0x5651f357aa15 in index_path git/object-file.c:2657:7 #10 0x5651f35db9d9 in add_to_index git/read-cache.c:766:7 #11 0x5651f35dc170 in add_file_to_index git/read-cache.c:799:9 #12 0x5651f321f9b2 in add_files git/builtin/add.c:346:7 #13 0x5651f321f9b2 in cmd_add git/builtin/add.c:565:18 #14 0x5651f321d327 in run_builtin git/git.c:474:11 #15 0x5651f321bc9e in handle_builtin git/git.c:729:3 #16 0x5651f321a792 in run_argv git/git.c:793:4 #17 0x5651f321a792 in cmd_main git/git.c:928:19 #18 0x5651f33dde1f in main git/common-main.c:62:11 The issue exists because `size` is an output parameter from `read_blob_data_from_index`, but it's only modified if `read_blob_data_from_index` returns non-NULL. The read of `size` when calling `read_attr_from_buf` unconditionally may read from an uninitialized value. `read_attr_from_buf` checks that `buf` is non-NULL before reading from `size`, but by then it's already too late: the uninitialized read will have happened already. Furthermore, there's no guarantee that the compiler won't reorder things so that it checks `size` before checking `!buf`. Make the call to `read_attr_from_buf` conditional on `buf` being non-NULL, ensuring that `size` is not read if it's never set. Signed-off-by: Kyle Lippincott <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jul 19, 2024
derrickstolee
pushed a commit
that referenced
this pull request
Aug 12, 2024
Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Aug 20, 2024
When using `--fixed-value` with a key whose value is left empty (implied as being "true"), 'git config' may crash when invoked like either of: $ git config set --file=config --value=value --fixed-value \ section.key pattern $ git config --file=config --fixed-value section.key value pattern The original bugreport[1] bisects to 00bbdde (builtin/config: introduce "set" subcommand, 2024-05-06), which is a red-herring, since the original bugreport uses the new 'git config set' invocation. The behavior likely bisects back to c90702a (config: plumb --fixed-value into config API, 2020-11-25), which introduces the new --fixed-value option in the first place. Looking at the relevant frame from a failed process's coredump, the crash appears in config.c::matches() like so: (gdb) up #1 0x000055b3e8b06022 in matches (key=0x55b3ea894360 "section.key", value=0x0, store=0x7ffe99076eb0) at config.c:2884 2884 return !strcmp(store->fixed_value, value); where we are trying to compare the `--fixed-value` argument to `value`, which is NULL. Avoid attempting to match `--fixed-value` for configuration keys with no explicit value. A future patch could consider the empty value to mean "true", "yes", "on", etc. when invoked with `--type=bool`, but let's punt on that for now in the name of avoiding the segfault. [1]: https://lore.kernel.org/git/CANrWfmTek1xErBLrnoyhHN+gWU+rw14y6SQ+abZyzGoaBjmiKA@mail.gmail.com/ Reported-by: Han Jiang <[email protected]> Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Aug 23, 2024
derrickstolee
pushed a commit
that referenced
this pull request
Sep 5, 2024
It was recently reported that concurrent reads and writes may cause the reftable backend to segfault. The root cause of this is that we do not properly keep track of reftable readers across reloads. Suppose that you have a reftable iterator and then decide to reload the stack while iterating through the iterator. When the stack has been rewritten since we have created the iterator, then we would end up discarding a subset of readers that may still be in use by the iterator. The consequence is that we now try to reference deallocated memory, which of course segfaults. One way to trigger this is in t5616, where some background maintenance jobs have been leaking from one test into another. This leads to stack traces like the following one: + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin AddressSanitizer:DEADLYSIGNAL ================================================================= ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0) ==657994==The signal is caused by a READ memory access. #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29 #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170 #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194 #3 0x55f23e54e72e in block_iter_next reftable/block.c:398 #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240 #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355 #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339 #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69 #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123 #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172 #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175 #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464 #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13 #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452 #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623 #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659 #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133 #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432 #18 0x55f23dba7764 in run_builtin git.c:484 #19 0x55f23dba7764 in handle_builtin git.c:741 #20 0x55f23dbab61e in run_argv git.c:805 #21 0x55f23dbab61e in cmd_main git.c:1000 #22 0x55f23dba4781 in main common-main.c:64 #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360 #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27) While it is somewhat awkward that the maintenance processes survive tests in the first place, it is totally expected that reftables should work alright with concurrent writers. Seemingly they don't. The only underlying resource that we need to care about in this context is the reftable reader, which is responsible for reading a single table from disk. These readers get discarded immediately (unless reused) when calling `reftable_stack_reload()`, which is wrong. We can only close them once we know that there are no iterators using them anymore. Prepare for a fix by converting the reftable readers to be refcounted. Reported-by: Jeff King <[email protected]> Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Sep 30, 2024
derrickstolee
pushed a commit
that referenced
this pull request
Oct 9, 2024
derrickstolee
pushed a commit
that referenced
this pull request
Oct 9, 2024
An internal customer reported a segfault when running `git sparse-checkout set` with the `index.sparse` config enabled. I was unable to reproduce it locally, but with their help we debugged into the failing process and discovered the following stacktrace: ``` #0 0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125 #1 0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247 #2 0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122 #3 0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638 #4 0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255 #5 0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570) at sparse-index.c:307 #6 0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46 #7 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #8 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #9 0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422 #10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456 #11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0, search_mode=EXPAND_SPARSE) at read-cache.c:556 #12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566 #13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756 #14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860 #15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063 #16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548 #17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808 #18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877 #19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017 #20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 ``` The very bottom of the stack being the `rehash()` method from `hashmap.c` as called within the `name-hash` API made me look at where these hashmaps were being used in the sparse index logic. These were being copied across indexes, which seems dangerous. Indeed, clearing these hashmaps and setting them as not initialized fixes the segfault. The second commit is a response to a test failure that happens in `t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to fail because the underlying `git checkout-index` process fails due to colliding files. Passing the `-f` flag appears to work, but it's unclear why this name-hash change causes that change in behavior.
dscho
added a commit
that referenced
this pull request
Dec 10, 2024
dscho
added a commit
that referenced
this pull request
Dec 10, 2024
An internal customer reported a segfault when running `git sparse-checkout set` with the `index.sparse` config enabled. I was unable to reproduce it locally, but with their help we debugged into the failing process and discovered the following stacktrace: ``` #0 0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125 #1 0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247 #2 0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122 #3 0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638 #4 0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255 #5 0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570) at sparse-index.c:307 #6 0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46 #7 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #8 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #9 0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422 #10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456 #11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0, search_mode=EXPAND_SPARSE) at read-cache.c:556 #12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566 #13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756 #14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860 #15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063 #16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548 #17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808 #18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877 #19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017 #20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 ``` The very bottom of the stack being the `rehash()` method from `hashmap.c` as called within the `name-hash` API made me look at where these hashmaps were being used in the sparse index logic. These were being copied across indexes, which seems dangerous. Indeed, clearing these hashmaps and setting them as not initialized fixes the segfault. The second commit is a response to a test failure that happens in `t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to fail because the underlying `git checkout-index` process fails due to colliding files. Passing the `-f` flag appears to work, but it's unclear why this name-hash change causes that change in behavior.
derrickstolee
pushed a commit
that referenced
this pull request
Jan 6, 2025
This one is a little bit more curious. In t6112, we have a test that exercises the `git rev-list --filter` option with invalid filters. We execute git-rev-list(1) via `test_must_fail`, which means that we check for leaks even though Git exits with an error code. This causes the following leak: Direct leak of 27 byte(s) in 1 object(s) allocated from: #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8 #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2 #3 0x5555558b7550 in strbuf_add strbuf.c:311:2 #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2 #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3 #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3 #7 0x555555884e20 in setup_revisions revision.c:3014:11 #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9 #9 0x5555555ec5e3 in run_builtin git.c:483:11 #10 0x5555555eb1e4 in handle_builtin git.c:749:13 #11 0x5555555ec001 in run_argv git.c:819:4 #12 0x5555555eaf94 in cmd_main git.c:954:19 #13 0x5555556fd569 in main common-main.c:64:11 #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d) #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208) #16 0x5555555ad064 in _start (git+0x59064) This leak is valid, as we call `die()` and do not clean up the memory at all. But what's curious is that this is the only leak reported, because we don't clean up any other allocated memory, either, and I have no idea why the leak sanitizer treats this buffer specially. In any case, we can work around the leak by shuffling things around a bit. Instead of calling `gently_parse_list_objects_filter()` and dying after we have modified the filter spec, we simply do so beforehand. Like this we don't allocate the buffer in the error case, which makes the reported leak go away. It's not pretty, but it manages to make t6112 leak free. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jan 6, 2025
When running t5601 with the leak checker enabled we can see a hang in our CI systems. This hang seems to be system-specific, as I cannot reproduce it on my own machine. As it turns out, the issue is in those testcases that exercise cloning of `~repo`-style paths. All of the testcases that hang eventually end up interpreting "repo" as the username and will call getpwnam(3p) with that username. That should of course be fine, and getpwnam(3p) should just return an error. But instead, the leak sanitizer seems to be recursing while handling a call to `free()` in the NSS modules: #0 0x00007ffff7fd98d5 in _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:720 #1 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916 #2 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55 #3 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27 #4 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127 #5 __lsan::lsan_free (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220 ... #261505 0x00007ffff7fd99f2 in free (ptr=<optimized out>) at ../include/rtld-malloc.h:50 #261506 _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:822 #261507 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916 #261508 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55 #261509 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27 #261510 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127 #261511 __lsan::lsan_free (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220 #261512 0x00007ffff793da25 in module_load (module=0x515000000280) at ./nss/nss_module.c:188 #261513 0x00007ffff793dee5 in __nss_module_load (module=0x515000000280) at ./nss/nss_module.c:302 #261514 __nss_module_get_function (module=0x515000000280, name=name@entry=0x7ffff79b9128 "getpwnam_r") at ./nss/nss_module.c:328 #261515 0x00007ffff793e741 in __GI___nss_lookup_function (fct_name=<optimized out>, ni=<optimized out>) at ./nss/nsswitch.c:137 #261516 __GI___nss_next2 (ni=ni@entry=0x7fffffffa458, fct_name=fct_name@entry=0x7ffff79b9128 "getpwnam_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7fffffffa460, status=status@entry=0, all_values=all_values@entry=0) at ./nss/nsswitch.c:120 #261517 0x00007ffff794c6a7 in __getpwnam_r (name=name@entry=0x501000000060 "repo", resbuf=resbuf@entry=0x7ffff79fb320 <resbuf>, buffer=<optimized out>, buflen=buflen@entry=1024, result=result@entry=0x7fffffffa4b0) at ../nss/getXXbyYY_r.c:343 #261518 0x00007ffff794c4d8 in getpwnam (name=0x501000000060 "repo") at ../nss/getXXbyYY.c:140 #261519 0x00005555557e37ff in getpw_str (username=0x5020000001a1 "repo", len=4) at path.c:613 #261520 0x00005555557e3937 in interpolate_path (path=0x5020000001a0 "~repo", real_home=0) at path.c:654 #261521 0x00005555557e3aea in enter_repo (path=0x501000000040 "~repo", strict=0) at path.c:718 #261522 0x000055555568f0ba in cmd_upload_pack (argc=1, argv=0x502000000100, prefix=0x0, repo=0x0) at builtin/upload-pack.c:57 #261523 0x0000555555575ba8 in run_builtin (p=0x555555a20c98 <commands+3192>, argc=2, argv=0x502000000100, repo=0x555555a53b20 <the_repo>) at git.c:481 #261524 0x0000555555576067 in handle_builtin (args=0x7fffffffaab0) at git.c:742 #261525 0x000055555557678d in cmd_main (argc=2, argv=0x7fffffffac58) at git.c:912 #261526 0x00005555556963cd in main (argc=2, argv=0x7fffffffac58) at common-main.c:64 Note that this stack is more than 260000 function calls deep. Run under the debugger this will eventually segfault, but in our CI systems it seems like this just hangs forever. I assume that this is a bug either in the leak sanitizer or in glibc, as I cannot reproduce it on my machine. In any case, let's work around the bug for now by marking those tests with the "!SANITIZE_LEAK" prereq. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jan 6, 2025
There's a race with LSan when spawning threads and one of the threads calls die(). We worked around one such problem with index-pack in the previous commit, but it exists in git-grep, too. You can see it with: make SANITIZE=leak THREAD_BARRIER_PTHREAD=YesOnLinux cd t ./t0003-attributes.sh --stress which fails pretty quickly with: ==git==4096424==ERROR: LeakSanitizer: detected memory leaks Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7f906de14556 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98 #1 0x7f906dc9d2c1 in __pthread_getattr_np nptl/pthread_getattr_np.c:180 #2 0x7f906de2500d in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150 #3 0x7f906de25187 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:614 #4 0x7f906de17d18 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:53 #5 0x7f906de143a9 in ThreadStartFunc<false> ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:431 #6 0x7f906dc9bf51 in start_thread nptl/pthread_create.c:447 #7 0x7f906dd1a677 in __clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 As with the previous commit, we can fix this by inserting a barrier that makes sure all threads have finished their setup before continuing. But there's one twist in this case: the thread which calls die() is not one of the worker threads, but the main thread itself! So we need the main thread to wait in the barrier, too, until all threads have gotten to it. And thus we initialize the barrier for num_threads+1, to account for all of the worker threads plus the main one. If we then test as above, t0003 should run indefinitely. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jan 6, 2025
In 1b9e9be (csum-file.c: use unsafe SHA-1 implementation when available, 2024-09-26) we have converted our `struct hashfile` to use the unsafe SHA1 backend, which results in a significant speedup. One needs to be careful with how to use that structure now though because callers need to consistently use either the safe or unsafe variants of SHA1, as otherwise one can easily trigger corruption. As it turns out, we have one inconsistent usage in our tree because we directly initialize `struct hashfile_checkpoint::ctx` with the safe variant of SHA1, but end up writing to that context with the unsafe ones. This went unnoticed so far because our CI systems do not exercise different hash functions for these two backends, and consequently safe and unsafe variants are equivalent. But when using SHA1DC as safe and OpenSSL as unsafe backend this leads to a crash an t1050: ++ git -c core.compression=0 add large1 AddressSanitizer:DEADLYSIGNAL ================================================================= ==1367==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x507000000db0 sp 0x7fffffff5690 T0) ==1367==The signal is caused by a READ memory access. ==1367==Hint: address points to the zero page. #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2 #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2 #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2 #4 0x555555b9905d in deflate_blob_to_pack ../bulk-checkin.c:286:4 #5 0x555555b98ae9 in index_blob_bulk_checkin ../bulk-checkin.c:362:15 #6 0x555555ddab62 in index_blob_stream ../object-file.c:2756:9 #7 0x555555dda420 in index_fd ../object-file.c:2778:9 #8 0x555555ddad76 in index_path ../object-file.c:2796:7 #9 0x555555e947f3 in add_to_index ../read-cache.c:771:7 #10 0x555555e954a4 in add_file_to_index ../read-cache.c:804:9 #11 0x5555558b5c39 in add_files ../builtin/add.c:355:7 #12 0x5555558b412e in cmd_add ../builtin/add.c:578:18 #13 0x555555b1f493 in run_builtin ../git.c:480:11 #14 0x555555b1bfef in handle_builtin ../git.c:740:9 #15 0x555555b1e6f4 in run_argv ../git.c:807:4 #16 0x555555b1b87a in cmd_main ../git.c:947:19 #17 0x5555561649e6 in main ../common-main.c:64:11 #18 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4) #19 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4) #20 0x555555772c84 in _start (git+0x21ec84) ==1367==Register values: rax = 0x0000511000001080 rbx = 0x0000000000000000 rcx = 0x000000000000000c rdx = 0x0000000000000000 rdi = 0x0000000000000000 rsi = 0x0000507000000db0 rbp = 0x0000507000000db0 rsp = 0x00007fffffff5690 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x00007ffff7a01a30 r12 = 0x0000000000000000 r13 = 0x00007fffffff6b38 r14 = 0x00007ffff7ffd000 r15 = 0x00005555563b9910 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex ==1367==ABORTING ./test-lib.sh: line 1023: 1367 Aborted git $config add large1 error: last command exited with $?=134 not ok 4 - add with -c core.compression=0 Fix the issue by using the unsafe variant instead. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jan 6, 2025
Same as with the preceding commit, git-fast-import(1) is using the safe variant to initialize a hashfile checkpoint. This leads to a segfault when passing the checkpoint into the hashfile subsystem because it would use the unsafe variants instead: ++ git --git-dir=R/.git fast-import --big-file-threshold=1 AddressSanitizer:DEADLYSIGNAL ================================================================= ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0) ==577126==The signal is caused by a READ memory access. ==577126==Hint: address points to the zero page. #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2 #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2 #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2 #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2 #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3 #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5 #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4 #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4 #9 0x555555b1f493 in run_builtin ../git.c:480:11 #10 0x555555b1bfef in handle_builtin ../git.c:740:9 #11 0x555555b1e6f4 in run_argv ../git.c:807:4 #12 0x555555b1b87a in cmd_main ../git.c:947:19 #13 0x5555561649e6 in main ../common-main.c:64:11 #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4) #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4) #16 0x555555772c84 in _start (git+0x21ec84) ==577126==Register values: rax = 0x0000511000000cc0 rbx = 0x0000000000000000 rcx = 0x000000000000000c rdx = 0x0000000000000000 rdi = 0x0000000000000000 rsi = 0x00005070000009c0 rbp = 0x00005070000009c0 rsp = 0x00007fffffff5b30 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x00007ffff7a01a30 r12 = 0x0000000000000000 r13 = 0x00007fffffff6b60 r14 = 0x00007ffff7ffd000 r15 = 0x00005555563b9910 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex ==577126==ABORTING ./test-lib.sh: line 1039: 577126 Aborted git --git-dir=R/.git fast-import --big-file-threshold=1 < input error: last command exited with $?=134 not ok 167 - R: blob bigger than threshold The segfault is only exposed in case the unsafe and safe backends are different from one another. Fix the issue by initializing the context with the unsafe SHA1 variant. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Jan 6, 2025
Our CI jobs sometimes see false positive leaks like this: ================================================================= ==3904583==ERROR: LeakSanitizer: detected memory leaks Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98 #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180 #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150 #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598 #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51 #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440 #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444 #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 This is not a leak in our code, but appears to be a race between one thread calling exit() while another one is in LSan's stack setup code. You can reproduce it easily by running t0003 or t5309 with --stress (these trigger it because of the threading in git-grep and index-pack respectively). This may be a bug in LSan, but regardless of whether it is eventually fixed, it is useful to work around it so that we stop seeing these false positives. We can recognize it by the mention of the sanitizer functions in the DEDUP_TOKEN line. With this patch, the scripts mentioned above should run with --stress indefinitely. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Feb 22, 2025
When trying to create a Unix socket in a path that exceeds the maximum socket name length we try to first change the directory into the parent folder before creating the socket to reduce the length of the name. When this fails we error out of `unix_sockaddr_init()` with an error code, which indicates to the caller that the context has not been initialized. Consequently, they don't release that context. This leads to a memory leak: when we have already populated the context with the original directory that we need to chdir(3p) back into, but then the chdir(3p) into the socket's parent directory fails, then we won't release the original directory's path. The leak is exposed by t0301, but only when running tests in a directory hierarchy whose path is long enough to make the socket name length exceed the maximum socket name length: Direct leak of 129 byte(s) in 1 object(s) allocated from: #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8 #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2 #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3 #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7 #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6 #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11 #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6 #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3 #9 0x555555700547 in run_builtin ../git.c:480:11 #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9 #11 0x5555556ffee8 in run_argv ../git.c:807:4 #12 0x5555556fee6b in cmd_main ../git.c:947:19 #13 0x55555593f689 in main ../common-main.c:64:11 #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #16 0x5555555ad1d4 in _start (git+0x591d4) DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s). Fix this leak. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Feb 22, 2025
We don't free the result of `remote_default_branch()`, leading to a memory leak. This leak is exposed by t9211, but only when run with Meson with the `-Db_sanitize=leak` option: Direct leak of 5 byte(s) in 1 object(s) allocated from: #0 0x5555555cfb93 in malloc (scalar+0x7bb93) #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8 #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8 #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9 #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16 #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9 #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14 #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28 #8 0x5555555d196b in cmd_main ../scalar.c:992:14 #9 0x5555557c4059 in main ../common-main.c:64:11 #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #12 0x555555592054 in _start (scalar+0x3e054) DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s). As the `branch` variable may contain a string constant obtained from parsing command line arguments we cannot free the leaking variable directly. Instead, introduce a new `branch_to_free` variable that only ever gets assigned the allocated string and free that one to plug the leak. It is unclear why the leak isn't flagged when running the test via our Makefile. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Mar 9, 2025
Submodule merges are, in general, similar to other merges based on oid three-way-merge. When a conflict happens, however, Git has two special cases (introduced in 68d03e4) on handling the conflict before yielding it to the user. From the merge-ort and merge-recursive sources: - "Case #1: a is contained in b or vice versa": both strategies try to perform a fast-forward in the submodules if the commit referred by the conflicted submodule is descendant of another; - "Case #2: There are one or more merges that contain a and b in the submodule. If there is only one, then present it as a suggestion to the user, but leave it marked unmerged so the user needs to confirm the resolution." Add a small paragraph on merge-strategies.adoc describing this behavior. Helped-by: Junio C Hamano <[email protected]> Helped-by: Elijah Newren <[email protected]> Signed-off-by: Lucas Seiki Oshiro <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
derrickstolee
pushed a commit
that referenced
this pull request
Apr 2, 2025
Signed-off-by: Junio C Hamano <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This RFC includes a new way to index the objects in multiple packs
using one file, called the multi-pack index (MIDX).
The commits are split into parts as follows:
[01] - A full design document.
[02] - The full file format for MIDX files.
[03] - Creation of core.midx config setting.
[04-12] - Creation of "midx" builtin that writes, reads, and deletes
MIDX files.
[13-18] - Consume MIDX files for abbreviations and object loads.
The main goals of this RFC are:
Determine interest in this feature.
Find other use cases for the MIDX feature.
Design a proper command-line interface for constructing and checking
MIDX files. The current "midx" builtin is probably inadequate.
Determine what additional changes are needed before the feature can
be merged. Specifically, I'm interested in the interactions with
repack and fsck. The current patch also does not update the MIDX on
a fetch (which adds a packfile) but would be valuable. Whenever
possible, I tried to leave out features that could be added in a
later patch.
Consider splitting this patch into multiple patches, such as:
i. The MIDX design document.
ii. The command-line interface for building and reading MIDX files.
iii. Integrations with abbreviations and object lookups.
Please do not send any style nits to this patch, as I expect the code to
change dramatically before we consider merging.
I created three copies of the Linux repo with 1, 24, and 127 packfiles
each using 'git repack -adfF --max-pack-size=[64m|16m]'. These copies
gave significant performance improvements on the following comand:
Num Packs | Before MIDX | After MIDX | Rel % | 1 pack %
----------+-------------+------------+--------+----------
1 | 35.64 s | 35.28 s | -1.0% | -1.0%
24 | 90.81 s | 40.06 s | -55.9% | +12.4%
127 | 257.97 s | 42.25 s | -83.6% | +18.6%
The last column is the relative difference between the MIDX-enabled repo
and the single-pack repo. The goal of the MIDX feature is to present the
ODB as if it was fully repacked, so there is still room for improvement.
Changing the command to
has no observable difference (sub 1% change in all cases). This is likely
due to the repack I used putting commits and trees in a small number of
packfiles so the MRU cache workes very well. On more naturally-created
lists of packfiles, there can be up to 20% improvement on this command.
We are using a version of this patch with an upcoming release of GVFS.
This feature is particularly important in that space since GVFS performs
a "prefetch" step that downloads a pack of commits and trees on a daily
basis. These packfiles are placed in an alternate that is shared by all
enlistments. Some users have 150+ packfiles and the MRU misses and
abbreviation computations are significant. Now, GVFS manages the MIDX file
after adding new prefetch packfiles using the following command: