Skip to content

refactor: use run-like syntax #672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

henryiii
Copy link
Contributor

@henryiii henryiii commented May 13, 2021

This standardizes the "call" and "shell" commands into a single "run" utility. This utility is just a wrapper around subprocess.run that prints the command before running it, and has a better default on Windows due to a Python bug. The goal here is really not to simplify, or to even help out #665 anymore, but to better match the stdlib run syntax to be more recognizable to readers. This is a natural followup to #592 which changed subprocess.check_call(...) to subprocess.run(..., check=True). I think if this was developed from scratch today, it would look more like this syntax vs. the old call.

Mapping:

  • call(...) -> run(..., check=True)
  • shell(...) -> run(..., check=True, shell=True) (on Windows, for now, this is the same as the above line, but highlights that it is intentionally needing/using the shell, while the one above does not except for the mentioned bug)
  • call_with_arch(...) -> run_with_arch(..., check=True)

I did not change the docker self.call, as that's already pretty far from subprocess.run, AFAICT, and therefore benefits less from mimicking it. Though it could be changed.

Not absolutely glued to the idea if others have objections. The other direction would be to try to simply the call command to diverge farther from subprocess.run, say by accepting *args, smarter cwd default, etc. This could then allow pip = partial(call, "python", "-m", "pip"), for example.

@henryiii henryiii force-pushed the henryiii/refactor/run2 branch from 81d30a5 to 9978482 Compare May 13, 2021 22:13
@henryiii henryiii force-pushed the henryiii/refactor/run2 branch from 9978482 to 203bf33 Compare May 13, 2021 22:19
@henryiii henryiii marked this pull request as ready for review May 14, 2021 03:36
@joerick
Copy link
Contributor

joerick commented May 21, 2021

I'm not sure I agree with the motivation behind this change. I think a key motivator for the existing call wrapper is to ensure that we don't accidentally miss check=True on a function call. I understand that matching the stdlib might be a bit more familiar, but check=False is the wrong default for our use-case, IMO.

@henryiii
Copy link
Contributor Author

The original call wrapper was a wrapper for check_call with an extra printout. Now run(check=True) is the recommended method for this instead of check_call. What about making check= a required keyword argument for the wrapper? That would force it to be clearly included in every usage.

@joerick
Copy link
Contributor

joerick commented May 21, 2021

That would help, yeah. But I still don't understand the need to unify the two platforms run functions - they're different. The Windows one is quite weird, actually - it calls run() with an array of args and shell=True. This works because subprocess knows how to escape sequences of params when passing them to a shell on Windows. That isn't the case on Mac, nor does it need to be - the argument list remains an argument list, but just prefixed with ['/bin/sh', '-c']. So on POSIX, shell=True is different - the first argument of args becomes special - it's a shell script, not a command. But all other args are passed to it. So on POSIX, shell=True completely changes how an argument list is interpreted-

>>> subprocess.run(['/bin/echo', 'hello!'])
hello!
CompletedProcess(args=['/bin/echo', 'hello!'], returncode=0)
>>> subprocess.run(['/bin/echo', 'hello!'], shell=True)

CompletedProcess(args=['/bin/echo', 'hello!'], returncode=0)

Note the missing output on the second example. It's missing because the 'hello!' argument went to /bin/sh not /bin/echo.

(sorry for going deep, but I had to make sure I wasn't making a mountain out of a molehill!)

@henryiii
Copy link
Contributor Author

On windows, this is simulating what would be done if Windows didn't have the (linked) bug. If that bug was fixed, then you could remove the custom handling and it would look like the others. But currently, we have to pretend that shell=False even when we have to add shell=True. When you pass shell=True, you need to pass a string, not a list, and that's what's being done here.

@joerick
Copy link
Contributor

joerick commented Jul 11, 2021

Apologies for the delay in getting back to this @henryiii ! Probably I've been dragging my feet because I'm conflict-averse and I didn't see it as a high priority, but I do see the value in unifying some kind of wrapper, as we saw in #521. So I return to this with slightly more understanding of the why.

That said,

  1. I don't think check=True needs to be placed at every call site, when 99% of usages want that. A default of check=True makes sense to me (it's kind of crazy that this isn't the default in subprocess... but I digress)
  2. I think that executing commands (lists) are different from running shell scripts (strings). This is true in types, but also in behaviour. So we should have an API that reflects that. In particular, the shell=True flag that allows a list to be passed is problematic, because this has a different behaviour on POSIX versus windows. (this is why the original call and shell functions are separate)

It would be lovely to match the subprocess.run API for familiarity, but I think that the above concerns are more important, and we wouldn't have a wrapper if we weren't changing behaviour (logging plus the windows workaround), so there is some logic in changing the name.

I've gone around the houses on this, but settled on something quite neat I think. So my counter proposal might look something like:

def call(
    args: Sequence[PathOrStr],
    env: Optional[Dict[str, str]] = None,
    cwd: Optional[PathOrStr] = None,
) -> None:
    """
    Executes a command, like subprocess.run(..., check=True), but:

    - Prints the command before executing it
    - Works around a couple of bugs on Windows
    """

    print("+ " + " ".join(shlex.quote(str(a)) for a in args))

    # execute in a shell on Windows to workaround a PATH resolution bug
    # https://bugs.python.org/issue8557
    shell = sys.platform.startswith("win")

    # converts Paths to strings, due to Windows behavior at least on older Pythons.
    subprocess.run([str(a) for a in args], shell=shell, env=env, cwd=cwd, check=True)
  • I've used a different name, to deliberately differ from subprocess.run. Not wedded to call necessarily, but it makes sense to me and is consistent with existing code and docker.call
  • This is not a shell interface - it's just for running commands. I realised that we don't use the shell=True at all on the linux build, we instead do call(['sh', '-c', script]), which is actually better - it's completely explicit, and familiar, and the logs show how the script is invoked. So I'd propose that we do the same on macOS too, replacing shell() with call(['sh', '-c', script]). I'm not sure if this would be required on windows too, but if needed, there we could also do call(['cmd', '/c', script]). This gets us around all the nasty ambiguity of the shell argument having different behaviours and default values on different platforms.

Curious what you think @henryiii. I realise that this doesn't match your original idea, but I think maybe this would unify the API in a way that we can both agree on?

@henryiii
Copy link
Contributor Author

I was always fine going the other way too; I have not been a fan of the subprocess syntax since 2015. :)

The other direction would be to try to simply the call command to diverge farther from subprocess.run, say by accepting *args, smarter cwd default, etc

Since we are coming up with a new syntax, what about:

def call(
    *args: PathOrStr,
    env: Optional[Dict[str, str]] = None,
    cwd: Optional[PathOrStr] = None,
) -> None:

Adding the brackets each time isn't very necessary, and it allows this misusage of a plus sign:

call(my_args + ["--other"])

to be written like this:

call(*my_args, "--other")

@henryiii henryiii marked this pull request as draft September 20, 2021 15:11
mayeut added a commit to mayeut/cibuildwheel that referenced this pull request Dec 28, 2021
mayeut added a commit to mayeut/cibuildwheel that referenced this pull request Dec 28, 2021
mayeut added a commit to mayeut/cibuildwheel that referenced this pull request Dec 28, 2021
mayeut added a commit to mayeut/cibuildwheel that referenced this pull request Dec 29, 2021
mayeut added a commit to mayeut/cibuildwheel that referenced this pull request Dec 30, 2021
@mayeut mayeut closed this in #978 Jan 5, 2022
@henryiii henryiii deleted the henryiii/refactor/run2 branch June 6, 2024 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants