Skip to content

Clean up before 2.20.0 rc0 #1938

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 520 commits into from
Nov 20, 2018
Merged

Conversation

dscho
Copy link
Member

@dscho dscho commented Nov 19, 2018

This was a bit painful: the branch thicket was not exactly in a pristine state, and with this PR, a merging-rebase was performed onto the base of the previous merging-rebase, i.e. not changing much overall, but cleaning up obsolete branches, reordering a couple other branches (e.g. to avoid merge conflicts), etc.

This is intended to be the state from which the merging-rebase onto v2.20.0-rc0 is performed.

dscho and others added 17 commits November 19, 2018 18:25
This speeds up the tests by a bit on Windows, where running Unix shell
scripts (and spawning processes) is not exactly a cheap operation.

Signed-off-by: Johannes Schindelin <[email protected]>
The name is hard-coded to reflect that we use Travis CI for continuous
testing.

In the next commits, we will extend this to be able use Azure DevOps,
too.

So let's adjust the name to make it more generic.

Signed-off-by: Johannes Schindelin <[email protected]>
The upcoming patches will allow building git.git via Azure Pipelines
(i.e. Azure DevOps' Continuous Integration), where variable names and
URLs look a bit different than in Travis CI.

Signed-off-by: Johannes Schindelin <[email protected]>
In the next commit, we want to teach Git's test suite to optionally
output test results in JUnit-style .xml files. These files contain
information about the time spent. So we need a way to measure time.

While we could use `date +%s` for that, this will give us only seconds,
i.e. very coarse-grained timings.

GNU `date` supports `date +%s.%N` (i.e. nanosecond-precision output),
but there is no equivalent in BSD `date` (read: on macOS, we would not
be able to obtain precise timings).

So let's introduce `test-tool date getnanos`, with an optional start
time, that outputs preciser values.

Granted, it is a bit pointless to try measuring times accurately in
shell scripts, certainly to nanosecond precision. But it is better than
second-granularity.

Signed-off-by: Johannes Schindelin <[email protected]>
This will come in handy when publishing the results of Git's test suite
during an automated Azure DevOps run.

Signed-off-by: Johannes Schindelin <[email protected]>
This patch introduces a conditional arm that defines some environment
variables and a function that displays the URL given the job id (to
identify previous runs for known-good trees).

Signed-off-by: Johannes Schindelin <[email protected]>
This commit adds an azure-pipelines.yml file which is Azure DevOps'
equivalent to Travis CI's .travis.yml.

To make things a bit easier to understand, we refrain from using the
`matrix` feature here because (while it is powerful) it can be a bit
confusing to users who are not familiar with CI setups. Therefore, we
use a separate phase even for similar configurations (such as GCC vs
Clang on Linux, GCC vs Clang on macOS).

Also, we make use of the shiny new feature we just introduced where the
test suite can output JUnit-style .xml files. This information is made
available in a nice UI that allows the viewer to filter by phase and/or
test number, and to see trends such as: number of (failing) tests, time
spent running the test suite, etc.

Signed-off-by: Johannes Schindelin <[email protected]>
When running the p4 daemon or `git daemon`, we want to kill it at the
end of the test script.

So far, we do this "manually".

However, in the next few commits we want to teach the test suite to
optionally re-run scripts with different options, therefore we will have
to have a consistent way to stop daemons.

Let's introduce `test_atexit`, which is loosely modeled after
`test_when_finished` (but has a broader scope: rather than running the
commands after the current test case, run them when the test script
finishes, and also run them when the `--immediate` option is in effect).

Signed-off-by: Johannes Schindelin <[email protected]>
This makes use of the just-introduced consistent way to specify that a
long-running process needs to be terminated at the end of a test script
run.

Signed-off-by: Johannes Schindelin <[email protected]>
This should be more reliable than the current method, and prepares the
test suite for a consistent way to clean up before re-running the tests
with different options.

Signed-off-by: Johannes Schindelin <[email protected]>
The JUnit XML format lends itself to be presented in a powerful UI,
where you can drill down to the information you are interested in very
quickly.

For test failures, this usually means that you want to see the detailed
trace of the failing tests.

With Travis CI, we passed the `--verbose-log` option to get those
traces. However, that seems excessive, as we do not need/use the logs in
almost all of those cases: only when a test fails do we have a way to
include the trace.

So let's do something different when using Azure DevOps: let's run all
the tests with `--quiet` first, and only if a failure is encountered,
try to trace the commands as they are executed.

Of course, we cannot turn on `--verbose-log` after the fact. So let's
just re-run the test with all the same options, adding `--verbose-log`.
And then munging the output file into the JUnit XML on the fly.

Note: there is an off chance that re-running the test in verbose mode
"fixes" the failures (and this does happen from time to time!). That is
a possibility we should be able to live with. Ideally, we would label
this as "Passed upon rerun", and Azure Pipelines even know about that
outcome, but it is not available when using the JUnit XML format for
now:
https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs

Signed-off-by: Johannes Schindelin <[email protected]>
Sometimes, failures in a test case are actually caused by issues in
earlier test cases.

To make it easier to see those issues, let's attach the output from
before the failing test case (i.e. stdout/stderr since the previous
failing test case, or the start of the test script). This will be
visible in the "Attachments" of the details of the failed test.

Signed-off-by: Johannes Schindelin <[email protected]>
Just like so many other OSS projects, we now also have a build badge.

Signed-off-by: Johannes Schindelin <[email protected]>
When building a PR, TRAVIS_BRANCH refers to the *target branch*.
Therefore, if a PR targets `master`, and `master` happened to be tagged,
we skipped the build by mistake.

Fix this by using TRAVIS_PULL_REQUEST_BRANCH (i.e. the *source branch*)
when available, falling back to TRAVIS_BRANCH (i.e. for CI builds, also
known as "push builds").

Signed-off-by: Johannes Schindelin <[email protected]>
It is better to state clearly expectations and intentions than to assume
quietly that everybody agrees.

This Code of Conduct is the Open Code of Conduct as per
http://todogroup.org/opencodeofconduct/ (the only modifications are the
adjustments to reflect that there is no "response team" in addition to the
Git for Windows maintainer, and the addition of the link to the Open Code
of Conduct itself).

[Completely revamped, based on the Covenant 1.4 by Brendan Forster]

Signed-off-by: Johannes Schindelin <[email protected]>
Getting started contributing to Git can be difficult on a Windows
machine. CONTRIBUTING.md contains a guide to getting started, including
detailed steps for setting up build tools, running tests, and
submitting patches to upstream.

[includes an example by Pratik Karki how to submit v2, v3, v4, etc.]

Signed-off-by: Derrick Stolee <[email protected]>
Includes touch-ups by Philip Oakley.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho requested a review from derrickstolee November 19, 2018 21:04
@dscho
Copy link
Member Author

dscho commented Nov 19, 2018

Apart from the changes of #1936, this PR should not have any functional changes (the #ifdef _MSC_VER code was not actually necessary anymore, as we did away with the distinction between the environment handling of the MinGW and the MSC builds).

peff and others added 11 commits November 19, 2018 22:19
When writing a bundle to a file, the bundle code actually creates
"your.bundle.lock" using our lockfile interface. We feed that output
descriptor to a child git-pack-objects via run-command, which has the
quirk that it closes the output descriptor in the parent.

To avoid confusing the lockfile code (which still thinks the descriptor
is valid), we dup() it, and operate on the duplicate.

However, this has a confusing side effect: after the dup() but before we
call pack-objects, we have _two_ descriptors open to the lockfile. If we
call die() during that time, the lockfile code will try to clean up the
partially-written file. It knows to close() the file before unlinking,
since on some platforms (i.e., Windows) the open file would block the
deletion. But it doesn't know about the duplicate descriptor. On
Windows, triggering an error at the right part of the code will result
in the cleanup failing and the lockfile being left in the filesystem.

We can solve this by moving the dup() much closer to start_command(),
shrinking the window in which we have the second descriptor open. It's
easy to place this in such a way that no die() is possible. We could
still die due to a signal in the exact wrong moment, but we already
tolerate races there (e.g., a signal could come before we manage to put
the file on the cleanup list in the first place).

As a bonus, this shields create_bundle() itself from the duplicate-fd
trick, and we can simplify its error handling (note that the lock
rollback now happens unconditionally, but that's OK; it's a noop if we
didn't open the lock in the first place).

The included test uses an empty bundle to cause a failure at the right
spot in the code, because that's easy to trigger (the other likely
errors are write() problems like ENOSPC).  Note that it would already
pass on non-Windows systems (because they are happy to unlink an
already-open file).

Based-on-a-patch-by: Gaël Lhez <[email protected]>
Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
This was pull request git-for-windows#797 from glhez/master

`git bundle create <bundle>` leaks handle the revlist is empty.

Signed-off-by: Johannes Schindelin <[email protected]>
This was pull request git-for-windows#1003 from shoelzer/master

poll: Use GetTickCount64 to avoid wraparound issues
Signed-off-by: Johannes Schindelin <[email protected]>
The fix we introduced in Git for Windows will be made obsolete by a more
general fix that has been already accepted into upstream Git's `next`
branch.

But we still can introduce a regression test that verifies that this bug
will be caught very quickly, if reintroduced.

Signed-off-by: Johannes Schindelin <[email protected]>
This fixes a bug where a .git directory at the root of a network share
(e.g. \\MYSERVER\sharedfolder\.git) was not handled correctly.

Signed-off-by: Johannes Schindelin <[email protected]>
…er/register_rename_src

diffcore-rename: speed up register_rename_src
This topic branch allows us to specify absolute paths without the drive
prefix e.g. when cloning.

Example:

	C:\Users\me> git clone https://github.com/git/git \upstream-git

This will clone into a new directory C:\upstream-git, in line with how
Windows interprets absolute paths.

Signed-off-by: Johannes Schindelin <[email protected]>
Due to a quirk in Git's method to spawn git-upload-pack, there is a
problem when passing paths with backslashes in them: Git will force the
command-line through the shell, which has different quoting semantics in
Git for Windows (being an MSYS2 program) than regular Win32 executables
such as git.exe itself.

The symptom is that the first of the two backslashes in UNC paths of the
form \\myserver\folder\repository.git is *stripped off*.

Document this bug by introducing a test case.

Signed-off-by: Johannes Schindelin <[email protected]>
The MSYS2 runtime does its best to emulate the command-line wildcard
expansion and de-quoting which would be performed by the calling Unix
shell on Unix systems.

Those Unix shell quoting rules differ from the quoting rules applying to
Windows' cmd and Powershell, making it a little awkward to quote
command-line parameters properly when spawning other processes.

In particular, git.exe passes arguments to subprocesses that are *not*
intended to be interpreted as wildcards, and if they contain
backslashes, those are not to be interpreted as escape characters, e.g.
when passing Windows paths.

Note: this is only a problem when calling MSYS2 executables, not when
calling MINGW executables such as git.exe. However, we do call MSYS2
executables frequently, most notably when setting the use_shell flag in
the child_process structure.

There is no elegant way to determine whether the .exe file to be
executed is an MSYS2 program or a MINGW one. But since the use case of
passing a command line through the shell is so prevalent, we need to
work around this issue at least when executing sh.exe.

Let's introduce an ugly, hard-coded test whether argv[0] is "sh", and
whether it refers to the MSYS2 Bash, to determine whether we need to
quote the arguments differently than usual.

That still does not fix the issue completely, but at least it is
something.

Incidentally, this also fixes the problem where `git clone \\server\repo`
failed due to incorrect handling of the backslashes when handing the path
to the git-upload-pack process.

We need to take care to quote not only whitespace, but also curly
brackets. As aliases frequently go through the MSYS2 Bash, and
as aliases frequently get parameters such as HEAD@{yesterday}, let's
make sure that this does not regress by adding a test case for that.

Helped-by: Kim Gybels <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
dscho and others added 24 commits November 19, 2018 23:30
This topic branch brings slightly experimental changes supporting Git
for Windows to use BusyBox-w32 to execute its shell scripts as well as
its test suite.

The test suite can be run by installing the test artifacts into a MinGit
that has busybox.exe (and using Git for Windows' SDK's Perl for now, as
the test suite requires Perl even when NO_PERL is set, go figure) by
using the `install-mingit-test-artifacts` Makefile target with the
DESTDIR variable pointing to the top-level directory of the MinGit
installation.

To facilitate running the test suite (without having `make` available,
as `make.exe` is not part of MinGit), this branch brings an experimental
patch to the `test-run-command` helper to run Git's test suite. It is
still very experimental, though: in this developer's tests it seemed
that the `poll()` emulation required for `run_parallel_processes()` to
work sometimes hiccups on Windows, causing infinite "hangs". It is also
possible that BusyBox itself has problems writing to the pipes opened by
`test-run-command` (and merging this branch will help investigate
further). Caveat emptor.

Signed-off-by: Johannes Schindelin <[email protected]>
This fixes the issue identified in

	git-for-windows#1498

where Git would not fall back to reading credentials from a Win32
Console when the credentials could not be read from the terminal via the
Bash hack (that is necessary to support running in a MinTTY).

Tested in a Powershell window.

Signed-off-by: Johannes Schindelin <[email protected]>
Add a README.md for GitHub goodness.

Signed-off-by: Johannes Schindelin <[email protected]>
To avoid having to play tricks as in earlier rounds, we bit the sour
apple and rebased the `builtin-stash-rebase-v3` branch thicket onto the
commit starting Git for Windows' merging-rebase.

(The merging-rebase pulls in the previous branch thicket via a "fake
merge", i.e. a merge commit that does not actually apply any changes
from the merged commit history. This has the unfortunate side effect of
confusing `merge` into thinking that any branch that was merged into an
earlier round does not need to be merged again.)

Signed-off-by: Johannes Schindelin <[email protected]>
Use FindFirstFileExW with FindExInfoBasic to avoid forcing NTFS to look up
the short name.  Also switch to a larger (64K vs 4K) buffer using
FIND_FIRST_EX_LARGE_FETCH to minimize round trips to the kernel.

In a repo with ~200K files, this drops warm cache status times from 3.19
seconds to 2.67 seconds for a 16% savings.

Signed-off-by: Ben Peart <[email protected]>
Fix t0001 when the current working directory differs in case from the canonical form
At the end of the status command, disable and free the fscache so that we
don't leak the memory and so that we can dump the fscache statistics.

Signed-off-by: Ben Peart <[email protected]>
Add support to fscache to enable running the entire test suite with the
fscache enabled.

Signed-off-by: Ben Peart <[email protected]>
…e don't

leak the memory and so that we can dump the fscache statistics.

Signed-off-by: Ben Peart <[email protected]>
Add tracing around initializing and discarding mempools. In discard report
on the amount of memory unused in the current block to help tune setting
the initial_size.

Signed-off-by: Ben Peart <[email protected]>
Update enable_fscache() to take an optional initial size parameter which is
used to initialize the hashmap so that it can avoid having to rehash as
additional entries are added.

Add a separate disable_fscache() macro to make the code clearer and easier
to read.

Signed-off-by: Ben Peart <[email protected]>
The threading model for fscache has been to have a single, global cache.
This puts requirements on it to be thread safe so that callers like
preload-index can call it from multiple threads.  This was implemented
with a single mutex and completion events which introduces contention
between the calling threads.

Simplify the threading model by making fscache thread specific.  This allows
us to remove the global mutex and synchronization events entirely and instead
associate a fscache with every thread that requests one. This works well with
the current multi-threading which divides the cache entries into blocks with
a separate thread processing each block.

At the end of each worker thread, if there is a fscache on the primary
thread, merge the cached results from the worker into the primary thread
cache. This enables us to reuse the cache later especially when scanning for
untracked files.

In testing, this reduced the time spent in preload_index() by about 25% and
also reduced the CPU utilization significantly.  On a repo with ~200K files,
it reduced overall status times by ~12%.

Signed-off-by: Ben Peart <[email protected]>
…-gfw

fscache: use FindFirstFileExW to avoid retrieving the short name
Track fscache hits and misses for lstat and opendir requests.  Reporting of
statistics is done when the cache is disabled for the last time and freed
and is only reported if GIT_TRACE_FSCACHE is set.

Sample output is:

11:33:11.836428 compat/win32/fscache.c:433 fscache: lstat 3775, opendir 263, total requests/misses 4052/269

Signed-off-by: Ben Peart <[email protected]>
Now that the fscache is single threaded, take advantage of the mem_pool as
the allocator to significantly reduce the cost of allocations and frees.

With the reduced cost of free, in future patches, we can start freeing the
fscache at the end of commands instead of just leaking it.

Signed-off-by: Ben Peart <[email protected]>
…ter-status-gfw

status: disable and free fscache at the end of the status command
…e-gfw

fscache: add GIT_TEST_FSCACHE support
…ter-add-gfw

At the end of the add command, disable and free the fscache
…ics-gfw

fscache: add fscache hit statistics
When debugging Git, the criss-cross spawning of processes can make
things quite a bit difficult, especially when a Unix shell script is
thrown in the mix that calls a `git.exe` that then segfaults.

To help debugging such things, we introduce the `open_in_gdb()` function
which can be called at a code location where the segfault happens (or as
close as one can get); This will open a new MinTTY window with a GDB
that already attached to the current process.

Inspired by Derrick Stolee.

Signed-off-by: Johannes Schindelin <[email protected]>
This brings substantial wins in performance because the FSCache is now
per-thread, being merged to the primary thread only at the end, so we do
not have to lock (except while merging).

Signed-off-by: Johannes Schindelin <[email protected]>
To support IPv6, Git provided fall back functions for Windows versions that
did not support IPv6. However, as Git dropped support for Windows XP and
prior, those functions are not needed anymore.

Removed those fallbacks by reverting commit[1] and using the functions
directly (without 'ipv6_' prefix).

[1] fe3b2b7.

Signed-off-by: tanushree27 <[email protected]>
Add a helper function to start GDB that was already attached to the current process
…fallback

[Outreachy] Removed ipv6 fallback
@dscho dscho force-pushed the clean-up-before-2.20.0-rc0 branch from 8ce9615 to 114a662 Compare November 19, 2018 22:30
@dscho dscho added this to the v2.19.1(2) milestone Nov 20, 2018
@dscho dscho merged commit 114a662 into git-for-windows:master Nov 20, 2018
@dscho dscho deleted the clean-up-before-2.20.0-rc0 branch November 20, 2018 10:24
@dscho
Copy link
Member Author

dscho commented Nov 20, 2018

This change looks reasonable.

@derrickstolee thanks!

I'm assuming that the commit graph you have manufactured here makes your life easier in the 2.20.0 update.

It totally did.

Best of luck!

Thank you again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.