repo, cmd: DROP UNEEDED Win path for chcwd & check for '~' homedir #529

Closed
wants to merge 1 commit into from
15 changes: 3 additions & 12 deletions git/cmd.py
@@ -42,12 +42,12 @@
)


execute_kwargs = set(('istream', 'with_keep_cwd', 'with_extended_output',
execute_kwargs = set(('istream', 'with_extended_output',
'with_exceptions', 'as_process', 'stdout_as_string',
'output_stream', 'with_stdout', 'kill_after_timeout',
'universal_newlines', 'shell'))

log = logging.getLogger('git.cmd')
log = logging.getLogger(__name__)
log.addHandler(logging.NullHandler())

__all__ = ('Git',)
@@ -413,7 +413,6 @@ def version_info(self):

def execute(self, command,
istream=None,
with_keep_cwd=False,
with_extended_output=False,
with_exceptions=True,
as_process=False,
@@ -436,11 +435,6 @@ def execute(self, command,
:param istream:
Standard input filehandle passed to subprocess.Popen.

:param with_keep_cwd:
Whether to use the current working directory from os.getcwd().
The cmd otherwise uses its own working_dir that it has been initialized
with if possible.

:param with_extended_output:
Whether to return a (status, stdout, stderr) tuple.

@@ -513,10 +507,7 @@ def execute(self, command,
log.info(' '.join(command))

# Allow the user to have the command executed in their working dir.
if with_keep_cwd or self._working_dir is None:
cwd = os.getcwd()
else:
cwd = self._working_dir
cwd = self._working_dir or os.getcwd()

# Start the process
env = os.environ.copy()
52 changes: 12 additions & 40 deletions git/repo/base.py
@@ -62,8 +62,11 @@
import os
import sys
import re
import logging
from collections import namedtuple

log = logging.getLogger(__name__)

DefaultDBType = GitCmdObjectDB
if sys.version_info[:2] < (2, 5): # python 2.4 compatiblity
DefaultDBType = GitCmdObjectDB
@@ -871,46 +874,15 @@ def _clone(cls, git, url, path, odb_default_type, progress, **kwargs):
if progress is not None:
progress = to_progress_instance(progress)

# special handling for windows for path at which the clone should be
# created.
# tilde '~' will be expanded to the HOME no matter where the ~ occours. Hence
# we at least give a proper error instead of letting git fail
prev_cwd = None
prev_path = None
odbt = kwargs.pop('odbt', odb_default_type)
if is_win:
if '~' in path:
raise OSError("Git cannot handle the ~ character in path %r correctly" % path)

# on windows, git will think paths like c: are relative and prepend the
# current working dir ( before it fails ). We temporarily adjust the working
# dir to make this actually work
match = re.match("(\w:[/\\\])(.*)", path)
if match:
prev_cwd = os.getcwd()
prev_path = path
drive, rest_of_path = match.groups()
os.chdir(drive)
path = rest_of_path
kwargs['with_keep_cwd'] = True
# END cwd preparation
# END windows handling

try:
proc = git.clone(url, path, with_extended_output=True, as_process=True,
v=True, **add_progress(kwargs, git, progress))
if progress:
handle_process_output(proc, None, progress.new_message_handler(), finalize_process)
else:
(stdout, stderr) = proc.communicate() # FIXME: Will block of outputs are big!
finalize_process(proc, stderr=stderr)
# end handle progress
finally:
if prev_cwd is not None:
os.chdir(prev_cwd)
path = prev_path
# END reset previous working dir
# END bad windows handling
proc = git.clone(url, path, with_extended_output=True, as_process=True,
v=True, **add_progress(kwargs, git, progress))
if progress:
handle_process_output(proc, None, progress.new_message_handler(), finalize_process)
else:
(stdout, stderr) = proc.communicate() # FIXME: Will block of outputs are big!
Member

It's quite amazing that communicate() actually cannot be trusted or is simply not implemented correctly :/.

Contributor Author

Well, you are quite right - communicate() CAN be trusted.

I ran a small experiment reading about a gigabyte through communicate(), in both PY27 and PY35, and it does not block:

>>> import subprocess as sb
>>> proc = sb.Popen(['dd', 'if=/dev/zero', 'bs=1024', 'count=1000000'], bufsize=0, stdin=sb.PIPE, stdout=sb.PIPE, stderr=sb.PIPE)
>>> a,b=proc.communicate()

I don't remember where I got this mistrust of communicate(); maybe when I tried PY33, or maybe when experimenting with interactive processes (like the git.cmd.Git.cat-file persistent command). There communicate() is not suitable because it consumes the streams until the process dies, while the process is waiting for more input from the Python side before ending, so it's a deadlock.
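
To make the interactive case concrete, here is a minimal sketch of a request/response loop against a persistent child. It is not GitPython's code: `ECHO_SRC` and `ask` are hypothetical names, and the child is a small Python stand-in for a long-lived helper such as `git cat-file --batch`. The point is that each exchange uses write, flush, and a single `readline()`, while `communicate()` is reserved for shutdown, where closing stdin lets the child exit.

```python
import subprocess
import sys

# Hypothetical stand-in for a persistent helper: it answers one line per
# request and exits only when its stdin is closed.
ECHO_SRC = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write('got ' + line)\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", ECHO_SRC],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)


def ask(request):
    """Send one request line and read exactly one reply line."""
    proc.stdin.write(request + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


replies = [ask("HEAD"), ask("master")]

# communicate() fits only at shutdown: it closes stdin, which lets the
# child exit, and then collects whatever output is left.
leftover, _ = proc.communicate()
print(replies, repr(leftover), proc.returncode)
```

Calling `communicate()` in place of `ask()` would close stdin and wait for the child to exit, so a second request could never be sent on the same process.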

One or two changes in my Windows fixes were made under this wrong assumption that communicate() should not be trusted. But mostly I replaced code that accessed proc.stderr/stdout directly from the main thread; that is definitely freeze-prone code.
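
As an illustration of the freeze-prone pattern and one way around it, the sketch below drains both pipes with one reader thread each. It is an assumption-laden example, not the code from the Windows fixes: `drain_both`, `pump`, and `CHILD` are hypothetical names. Reading `proc.stdout` to the end and only afterwards `proc.stderr` from a single thread can stall once the unread pipe's OS buffer fills up; separate readers keep both streams flowing.

```python
import subprocess
import sys
import threading


def drain_both(cmd):
    """Run cmd and collect stdout and stderr using one reader thread per pipe."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    chunks = {"out": [], "err": []}

    def pump(stream, key):
        # read(n) returns b"" only at end of file
        for chunk in iter(lambda: stream.read(65536), b""):
            chunks[key].append(chunk)
        stream.close()

    threads = [
        threading.Thread(target=pump, args=(proc.stdout, "out")),
        threading.Thread(target=pump, args=(proc.stderr, "err")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    status = proc.wait()
    return status, b"".join(chunks["out"]), b"".join(chunks["err"])


# Hypothetical child that writes 300 KiB to each stream, interleaved.
CHILD = (
    "import sys\n"
    "for _ in range(300):\n"
    "    sys.stdout.write('a' * 1024)\n"
    "    sys.stderr.write('b' * 1024)\n"
)

status, out, err = drain_both([sys.executable, "-c", CHILD])
print(status, len(out), len(err))
```

For a one-shot command like this, `proc.communicate()` gives the same result with less code; the threaded form matters mainly when output must be processed while the process is still running.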

Member

A nice experiment! It doesn't seem to produce output on both stdout and stderr though, which might be what causes the problem we see. The implementation of communicate() looks asynchronous, but who knows what's hidden in the details.

Contributor Author

Even when stdout and stderr are interleaved, communicate() does not block (at least in PY3):

>>> import subprocess as sb
>>> proc = sb.Popen(['python','-c',"import sys\nfor i in range(1000000):\n    print('a'*1024); print('b'*1024, file=sys.stderr)"], stdin=sb.PIPE, stdout=sb.PIPE, stderr=sb.PIPE)
>>> %time a,b=proc.communicate()
Wall time: 48.3 s
>>> len(a), len(b)
(1026000000, 1026000000)

For some "unicode" reason I cannot run the above experiment in PY2.
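
A scaled-down variant of the experiment that should run under both PY2 and PY3 is sketched below. It makes two assumptions: the child uses `sys.stdout.write` / `sys.stderr.write` instead of `print(..., file=...)`, which is not valid PY2 syntax without `from __future__ import print_function`, and the volume is cut to 2000 lines per stream to keep it quick. Whether that syntax difference is the actual cause of the PY2 failure is not established here. `CHILD_SRC` is a hypothetical name.

```python
import subprocess
import sys

# Child process: 2000 lines of 1 KiB on each stream, interleaved.
CHILD_SRC = (
    "import sys\n"
    "for i in range(2000):\n"
    "    sys.stdout.write('a' * 1024 + '\\n')\n"
    "    sys.stderr.write('b' * 1024 + '\\n')\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD_SRC],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = proc.communicate()

# Total byte counts depend on the platform's line ending, so count the
# payload characters instead of comparing len() against a fixed number.
print(proc.returncode, out.count(b"a"), err.count(b"b"))
```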

log.debug("Cmd(%s)'s unused stdout: %s", getattr(proc, 'args', ''), stdout)
finalize_process(proc, stderr=stderr)

# our git command could have a different working dir than our actual
# environment, hence we prepend its working dir if required
@@ -922,7 +894,7 @@ def _clone(cls, git, url, path, odb_default_type, progress, **kwargs):
# that contains the remote from which we were clones, git stops liking it
# as it will escape the backslashes. Hence we undo the escaping just to be
# sure
repo = cls(os.path.abspath(path), odbt=odbt)
repo = cls(path, odbt=odbt)
if repo.remotes:
with repo.remotes[0].config_writer as writer:
writer.set_value('url', repo.remotes[0].url.replace("\\\\", "\\").replace("\\", "/"))