Skip to content

Git hanging, git install hanging #1855

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
briankitt opened this issue Sep 28, 2018 · 20 comments
Closed

Git hanging, git install hanging #1855

briankitt opened this issue Sep 28, 2018 · 20 comments

Comments

@briankitt
Copy link

Windows 10 Pro 1709. Running on a corporate network that is heavily monitored and protected. Really no realistic possibility of a virus/malware/trojan.

I had been running git 2.15.1.2 (64 bit) for quite some time, and suddenly NPM started hanging. I traced it to 'git -version' that NPM was doing. If I run 'git -version' on the command line, it hangs, and never ends. No known software changes between the time it worked, and the time it did not work, other than possibly windows updates.

So I did the windows defacto action. I uninstalled Git, and installed v 2.19.0. I can't get an install done, it keeps hanging at 100%. I did a /log, and the log appears fine (attached). But it won't complete. It doesn't get the path updated. So I cancelled the install, and added the path manually. 'git -version' still hangs.

I uninstalled, removed every trace, rebooted, installed 2.19.0 again. Same exact behavior.

I uninstalled, removed every trace, rebooted, installed 1.15.1.2 again. Same exact behavior.

I'm convinced this is either a windows fix, or some other software that is causing git to not work. I can't even do a 'git -version', HOWEVER if I do a 'get --version', it does come back and give me an error message letting me know that --version is not a valid command.

Any ideas where to go on this? I've used both procmon and process explorer to try to get information, and I have nothing more to add. I really need git to work, I use it professionally for source control. I do not know why it just stopped working.

Note, even though the log says the install was successful, the path entries were not written, and git still hangs.

git.log

@dscho
Copy link
Member

dscho commented Nov 9, 2018

I am afraid that it is unclear to me, from the available information, what could be the culprit. And since I have no way to reproduce, I have no idea how to investigate either... If I had a way to reproduce, I would probably start instrumenting the code to spew out debug messages, recompile, copy over to the machine that exposes the problem, and take it from there. You could also see whether things change if you set the environment variable GIT_TRACE=1 before running git version.

@dscho dscho added the unclear label Nov 9, 2018
@briankitt
Copy link
Author

briankitt commented Nov 9, 2018 via email

@maxxd
Copy link

maxxd commented Dec 26, 2018

May have the same problem (identical symptom). Running on Windows 1809 17763.195

Today PowerShell took 45 seconds to open and I got an error message "WARNING: posh-git requires Git 1.7.2 or better. You have ."

I had just upgraded to Git for Windows 2.20.1. Tried complete removal and reinstall, same behavior.

Git works from "Git Bash" (specifically, /mingw64/bin/git works, but not /cmd/git.exe) but running git from a Windows Command Prompt, PowerShell, or Cygwin (using git for windows path directly, not the Cygwin version) exhibits the same behavior: git spawns thousands of recursive child processes each with the same command line parameters. If I run "git --version" it spawns a child process with the command line "git --version" which spawns a child, until the processes are exhausted and Windows kills them all and cleans them up (takes about a minute on my system)

Unless I am monitoring with Task Manager/Process Hacker, it appears to just hang.

Edit: I also have another system with Windows 1809 build 17763.195 and most of the same software. Both systems have Git for Windows 2.20.1.windows.1, but it works as expected on one system and not the other. The git.exe is identical on both the working and non-working system.

sha1 91792e1aa7f06f25aa560752cf540a1d1395477d C:\Program Files\Git\cmd\git.exe

@dscho
Copy link
Member

dscho commented Dec 26, 2018

@maxxd do you have any anti-malware running (that may indicate problems with Git)?

@maxxd
Copy link

maxxd commented Dec 26, 2018

@maxxd do you have any anti-malware running (that may indicate problems with Git)?

@dscho I have Kaspersky Internet Security 2019 on both systems, both identical versions (19.0.0.1088(D)), and both updated. I have tried both pausing and entirely quitting Kaspersky. I also checked Kaspersky's event log and Git is listed as fully trusted and fully allowed.

When I connected to the affected system via Remote Desktop to continue diagnosing the problem, git just worked. I could not get it to fail. No "fork bomb", no hanging, no errors. Works with Kaspersky on and off. I logged out and logged in to the physical machine and the behavior resumed.

So I logged in with Remote Desktop, opened PowerShell, and executed git --version and it immediately returned git version 2.20.1.windows.1

Then, without logging out, I went to the machine and unlocked and in the very same PowerShell window attempted to execute it again and got the "fork bomb" behavior. I reliably fail when local, reliably succeed when using Remote Desktop.

I pulled the list of loaded modules and there is one difference when I'm in Remote Desktop. In the RDP/working case, locale.nls is loaded at a different location. locale.nls varies per run, so I don't think that's the issue.

working.txt
broken.txt

@dscho
Copy link
Member

dscho commented Dec 26, 2018

I have no idea what this problem's root cause may be. I cannot reproduce, and the locale.nls difference does not look to me as if it could be the culprit.

@maxxd
Copy link

maxxd commented Dec 26, 2018

I created a new user profile on the affected system. Same behavior: when logged in locally, I get the runaway child processes. When logged in via Remote Desktop, everything works as expected.

Since Windows loads a remote desktop video driver during RDP, I thought the Nvidia shim seemed like it might have some behavior difference between local and RDP. I upgraded the Nvidia driver, but there was no effect. I will see if I can find any other deltas between RDP and local.

I'm not blocked as I have other systems I can use, so I'm going to continue trying to figure this out in case it's something that might bite someone else. I will update if I find out more. Thanks for taking a look.

@maxxd
Copy link

maxxd commented Dec 27, 2018

After further investigation I have discovered that git also works as expected when logged in via the Windows OpenSSH (installed as optional component) on localhost, so another way to determine if this is the 'git hanging' issue:

C:\Users\maxx> git --version
(child process bomb)
C:\Users\maxx> ssh 127.1
maxx@SAKURA C:\Users\maxx> git --version
git version 2.20.1.windows.1

I set up Process Monitor on the git process and found one major delta. In the case where I am logged in locally and git child bombs, the CreateFile call starts from the cmd subdirectory rather than the installation directory and lacks git.exe at the end — later, git uses the (installdir)\cmd\git.exe wrapper and starts the infinite loop of calling itself:

"git.exe","CloseFile","C:\Program Files\Git\cmd\git.exe","SUCCESS",""
"git.exe","CreateFile","C:\Program Files\Git\cmd\mingw64\bin","PATH NOT FOUND","Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a"
"git.exe","CreateFile","C:\Program Files\Git\cmd\mingw\bin","PATH NOT FOUND","Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a"
"git.exe","CreateFile","C:\Program Files\Git\mingw64\bin","SUCCESS","Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened"
"git.exe","QueryBasicInformationFile","C:\Program Files\Git\mingw64\bin","SUCCESS","CreationTime: 12/26/2018 12:05:55 PM, LastAccessTime: 12/26/2018 12:06:23 PM, LastWriteTime: 12/26/2018 12:06:23 PM, ChangeTime: 12/26/2018 12:06:23 PM, FileAttributes: D"
"git.exe","CloseFile","C:\Program Files\Git\mingw64\bin","SUCCESS",""
"git.exe","CreateFile","C:\Users\mdaymon","SUCCESS","Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened"
...
"git.exe","CreateFile","C:\Program Files\Git\cmd\git.exe","SUCCESS","Desired Access: Read Data/List Directory, Execute/Traverse, Read Attributes, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Delete, AllocationSize: n/a, OpenResult: Opened"

In the case of being logged in via RDP or SSH and git works as expected, the CreateFile call is under the mingw64\bin directory and specifies the git.exe file. The next call is to (installdir)\mingw64\bin\git.exe and successfully executes:

"git.exe","CloseFile","C:\Program Files\Git\cmd\git.exe","SUCCESS",""
"git.exe","CreateFile","C:\Program Files\Git\mingw64\bin\git.exe","SUCCESS","Desired Access: Read Attributes, Disposition: Open, Options: Open For Backup, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened"
"git.exe","QueryBasicInformationFile","C:\Program Files\Git\mingw64\bin\git.exe","SUCCESS","CreationTime: 12/26/2018 12:05:55 PM, LastAccessTime: 12/26/2018 12:05:55 PM, LastWriteTime: 12/15/2018 9:52:28 AM, ChangeTime: 12/26/2018 12:06:23 PM, FileAttributes: A"
"git.exe","CloseFile","C:\Program Files\Git\mingw64\bin\git.exe","SUCCESS",""
"git.exe","CreateFile","C:\Program Files\Git\mingw64\bin","SUCCESS","Desired Access: Read Attributes, Disposition: Open, Options: Open For Backup, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened"
...
"git.exe","CreateFile","C:\Program Files\Git\mingw64\bin\git.exe","SUCCESS","Desired Access: Read Data/List Directory, Execute/Traverse, Read Attributes, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Delete, AllocationSize: n/a, OpenResult: Opened"

In both the broken and working states, the path of the initial cmd\git.exe has had the same paths prepended:

C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;

I am investigating git-wrapper.c where this decision appears to be made

@maxxd
Copy link

maxxd commented Dec 28, 2018

I was able to get the git.exe wrapper in a debugger and discovered that L"git.exeb\000\100" is being passed to configure_via_resource on line 562 in the broken/locally logged in case, and L"git.exe" in the working/remote/ssh scenario. The extra data is causing the later checks of "git.exe" to fail and thus cascading into the failure mode of infinite child processes being created.

Looking at the wcsncpy documentation, it says that, "If wfrom does not contain a null wide character in its first size wide characters, then wcsncpy copies just the first size wide characters. In this case no null terminator is written into wto."

From the C++ specification, it states "Thus, if there is no null wide character in the first n wide characters of the array pointed to by s2, the result will not be null-terminated." Stepping through the code, this seems to be the case. The exepath being copied does not include a null within the size I am seeing in the debugger.

@dscho
Copy link
Member

dscho commented Dec 28, 2018

@maxxd I guess that that PathRemoveFileSpec() call does not guarantee that the basename is left untouched anymore.

Could I ask you to apply this diff and see whether it fixes things for you?

diff --git a/mingw-w64-git/git-wrapper.c b/mingw-w64-git/git-wrapper.c
index eaf8de2a..6a1d0f4c 100644
--- a/mingw-w64-git/git-wrapper.c
+++ b/mingw-w64-git/git-wrapper.c
@@ -543,7 +543,7 @@ int main(void)
 		allocate_console = 0, show_console = -1,
 		append_quote_to_cmdline = 0;
 	WCHAR exepath[MAX_PATH], exe[MAX_PATH], top_level_path[MAX_PATH];
-	LPWSTR cmd = NULL, exep = exe, prefix_args = NULL, basename;
+	LPWSTR cmd = NULL, exep = exe, prefix_args = NULL, basename = NULL, p;
 	LPWSTR working_directory = NULL;
 
 	/* Determine MSys2-based Git path. */
@@ -554,11 +554,13 @@ int main(void)
 	/* get the installation location */
 	/* GetModuleFileName(NULL, exepath, MAX_PATH); */
 	find_exe_realpath(exepath, MAX_PATH);
-	if (!PathRemoveFileSpec(exepath)) {
+	for (p = exepath; *p; p++)
+		if ((*p == L'\\' || *p == L'/') && p[1])
+			basename =  p + 1;
+	if (!basename || basename - exepath < 3) {
 		fwprintf(stderr, L"Invalid executable path: %s\n", exepath);
 		ExitProcess(1);
 	}
-	basename = exepath + wcslen(exepath) + 1;
 	if (configure_via_resource(basename, exepath, exep,
 			&prefix_args, &prefix_args_len,
 			&is_git_command, &working_directory,

@maxxd
Copy link

maxxd commented Dec 29, 2018

I tried the patch and, while it no longer spawns child processes, it entirely fails to find any executable and outputs a message that it cannot find git.

Next, using gdb and procmon to verify, it does have the uninitialized memory characters that were showing up in gdb at the end when it attempts to open the directory: For example, procmon shows this attempt with the extra "b" and Han character: \git-sdk-installer-1.0.7-64\usr\src\MINGW-packages\mingw-w64-git\src\git\cmd\git.exeb䀀\mingw64\bin

I was able to get everything working in each test scenario by ensuring a wide null termination character at the end of the copied realexe buffer was in the destination exepath buffer (immediately after line 68 in find_exe_realpath) with this patch:

diff --git a/mingw-w64-git/git-wrapper.c b/mingw-w64-git/git-wrapper.c
index eaf8de2a..83880926 100644
--- a/mingw-w64-git/git-wrapper.c
+++ b/mingw-w64-git/git-wrapper.c
@@ -66,6 +66,7 @@ static void find_exe_realpath(LPWSTR exepath, int count) {
 	} else {
 		size_t offset = wcsncmp(L"\\\\?\\", realexe, 4) ? 0 : 4;
 		wcsncpy(exepath, realexe + offset, dwlen - offset);
+		exepath[dwlen - offset] = L'\0';
 	}
 	CloseHandle(hFile);
 }

@mfriedrich74
Copy link

mfriedrich74 commented Dec 29, 2018 via email

@maxxd
Copy link

maxxd commented Dec 30, 2018

Thanks for the confirmation @mfriedrich74. @dscho I did further investigation into why this only happened when I was logged in locally, and not over RDP or SSH. It turns out an Nvidia dll (nvinitx.dll) being hooked into every process leaves the memory a bit more dirty by the time the exepath buffer is created. Since the buffer is not initialized/zeroed and wcsncpy does not write the null terminator, it was just a matter of chance whether the path landed adjacent to a pre-existing null or not.

This is the buffer prior to being copied into without the Nvidia hook code executing:

(gdb) x/140xh exepath
0x6a7db0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7dc0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7dd0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7de0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7df0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e00:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e10:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e20:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e30:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e40:       0x0000  0x0000  0x0000  0x0000  0x5cc9  0x49c9  0x7ffb  0x0000
0x6a7e50:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e60:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e70:       0x0000  0x0000  0x0000  0x0000  0x2ffd  0x47fe  0x7ffb  0x0000
0x6a7e80:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e90:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7ea0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7eb0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7ec0:       0x0000  0x0000  0x0000  0x0000

This is the exepath buffer when the Nvidia dll hook dll is executed (you can see the 'b' in git.exeb' is the 0x0062 and the Han character is 0x4000 right after on line 0x6a7e80, and then the null that happened to be there already).

(gdb) x/200xh exepath
0x6a7db0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7dc0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7dd0:       0x0005  0x0000  0x0000  0x0000  0xda21  0x49c1  0x7ffb  0x0000
0x6a7de0:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7df0:       0xe0c6  0x9fba  0x9497  0x9c9b  0xb7e0  0xbd91  0x948c  0x9bb9
0x6a7e00:       0xbd8c  0x9091  0x918e  0xaf9d  0x9b8b  0x878e  0x92b7  0x919a
0x6a7e10:       0x00d2  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e20:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e30:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e40:       0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000
0x6a7e50:       0x0000  0x0000  0x0000  0x0000  0x0810  0x0000  0x0000  0x0000
0x6a7e60:       0x203a  0x6146  0x6c69  0x6465  0x4920  0x436f  0x6c74  0x6547
0x6a7e70:       0x4374  0x706f  0x6f72  0x5163  0x6575  0x7972  0x6e49  0x6f66
0x6a7e80:       0x002e  0x000c  0x0000  0x0000 *0x0062**0x4000* 0x0000  0x0000

The memory is dirtier than the non-Nvidia init, so caused the issue with the standard C:\Program Files\Git\cmd\git.exe path. Adding 17 characters to the directory name C:\Program Files\Git\cmd12345678901234567\git.exe causes the path to end at 0x6a7e10 which is the next available null that happened to be in the buffer and it worked again without the patch.

However, this patch appears to be the more modern/recommended approach where C11/wcsncpy_s is available. The primary difference is that wcsncpy_s ensures null termination of the dest string, whereas termination has to be done manually with wcsncpy:

diff --git a/mingw-w64-git/git-wrapper.c b/mingw-w64-git/git-wrapper.c
index eaf8de2a..bbbbcef4 100644
--- a/mingw-w64-git/git-wrapper.c
+++ b/mingw-w64-git/git-wrapper.c
@@ -65,7 +65,7 @@ static void find_exe_realpath(LPWSTR exepath, int count) {
 		wcscpy(exepath, mdexe);
 	} else {
 		size_t offset = wcsncmp(L"\\\\?\\", realexe, 4) ? 0 : 4;
-		wcsncpy(exepath, realexe + offset, dwlen - offset);
+		wcsncpy_s(exepath, MAX_PATH, realexe + offset, dwlen - offset);
 	}
 	CloseHandle(hFile);
 }

@mfriedrich74
Copy link

mfriedrich74 commented Dec 30, 2018

In c/cc++ memory is not initialized/zeroed automatically. So it really does not matter what the buffer was before. As far as i can tell from cppreference.com the wcsncpy_s has the same flaw and may not terminate the string. It just protects from buffer overruns.

@mfriedrich74
Copy link

Also the wcsncpy_s return code is not checked which is of type errno_t.

@maxxd
Copy link

maxxd commented Dec 31, 2018

In c/cc++ memory is not initialized/zeroed automatically. So it really does not matter what the buffer was before. As far as i can tell from cppreference.com the wcsncpy_s has the same flaw and may not terminate the string. It just protects from buffer overruns.

Yes. That's exactly the problem, and that is exactly why it does matter. The defect here is that a string is being set without correctly null terminating it.

The buffer underlying the string either needs to be initialized, or the null needs to be appended to mark the end of the string. With the current code, the path is being concatenated to uninitialized memory and the next method is reading up to the next null that happened to be in memory. (Initializing the buffers to zero also resolves the defect)

While wcsncpy_s has additional protections against buffer overruns, it also guarantees null termination. It was proposed by Microsoft as part of their Secure CRT initiative in 2004 and was intended to remove multiple classes of defects, including this one.

The wcsncpy_s function copies not more than slen successive characters (characters that follow a null character are not copied) from the array pointed to by src to the array pointed to by dest. If no null character was copied from src, then dest[slen] is set to a null character.

API Reference safestringlib

Upon successful execution, the destination string is always null-terminated.

strcpy_s, wcscpy_s, _mbscpy_s

wcsncpy, on the other hand, requires manual intervention if the source characters do not include a null terminator:

If count is reached before the entire string src was copied, the resulting wide character array is not null-terminated.

wcsncpy

Which is exactly what's happening in this case. The path C:\Program Files\Git\cmd\git.exe is being copied to the buffer, it is adjacent to an extra two characters that are in the uninitialized buffer, then a null that was also in uninitialized memory.

That said all said, per Field Experience With Annex K — Bounds Checking Interfaces, the _s "secure" functions look like they will be deprecated and removed from the standard. The first patch of simply writing the null termination manually appears to be the current preferable option to ensure correct null termination of the path.

@mfriedrich74
Copy link

That is from the MSDN documentation and applies to the MS compiler runtime. Is this standard and the same for MSYS/... other compilers? I know that MS implements a lot nonstandard things.

@maxxd
Copy link

maxxd commented Dec 31, 2018

Yes, the safe string functions (ISO TR24731, C11 Annex K "Bounds-checking interfaces" ISO/IEC 9899:2011) are available in MinGW: string_s.h Yes, it is part of the C11 standard:

ISO/IEC TR 24731-1:2007 provides alternative functions for the C Library (as defined in ISO/IEC 9899:1999) that promote safer, more secure programming. The functions verify that output buffers are large enough for the intended result, and return a failure indicator if they are not. Optionally, failing functions call a "runtime-constraint handler" to report the error. Data is never written past the end of an array. All string results are null terminated. In addition, the functions in ISO/IEC TR 24731-1:2007 are re-entrant: they never return pointers to static objects owned by the function.

Not producing any unterminated strings is a key goal of this standard:

Rationale for TR 24731 1.1.3 Do not produce unterminated strings If a string lacks the terminating null character, the program may be tricked into accessing storage after the string as legitimate data. This may cause a program to process a string that it should not, which might be a security flaw in itself. It may also cause the program to abort, which might be a denial of service attack.

That said, per the field experience report above (from 2015), "Therefore, we propose that Annex K be either removed from the next revision of the C standard, or deprecated and then removed." it may not be the preferred direction. At the very least, it's a larger decision to be made whether this project wants to use these methods.

So, at least for this issue, appending the null manually is likely the better option.

@dscho
Copy link
Member

dscho commented Jan 2, 2019

Closed via git-for-windows/MINGW-packages#32

@MichaWiedenmann
Copy link

I just hit this issue and would like to express my deep gratitude for debugging/fixing this issue. Thanks for spending your time and effort - you make my world better. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants