Skip to content

Conversation

@oprypin
Copy link
Member

@oprypin oprypin commented Apr 11, 2020

Add a method to produce a command line string from a list of args.
Examples:

files = ["my file.txt", "another.txt"]
`grep -E 'fo+' -- #{Process.quote(files)}`

Process.quote_posix(files) # => "'my file.txt' another.txt"

Process.quote_windows({ %q(C:\"foo" project.txt) }) # => %q("C:\\\"foo\" project.txt")

Yes, all of these examples could've been written better without shell: true but the reality is that sometimes one needs to intermix shell handling and non-shell handling, the best example of which is link_flags in the compiler, which I am also touching here (search for it in the diff!).


The source code of Process.quote_windows is required for the upcoming Process.new port on Windows.
Sure, it doesn't have to be public for that case, but I found another use case for it in the compiler (search for it in the diff!), so why not publish it.

I'm intentionally not hiding quote_posix in something like Crystal::System::Process because it can have more uses (in fact, I used it immediately in the compiler here). E.g. one might want to tell a user to run a Bash command specifically and make sure it's correct and copy-pasteable, and there's no reason that the program should not be able to do that on Windows.

Both quote_posix and quote_windows are available on all systems, then quote just selects for the current OS.


Removing the usage of Process.new(, args, shell: true) from the compiler is also a requirement for Windows porting, as this combination won't be supported.

Additionally, I did a quick search throughout the codebase for shell misuses which would surely fail eventually (with likely security vulnerabilities!) if any atypical path was involved, and applied the new method there. This just tends to happen more often on Windows, which made me notice.

@oprypin
Copy link
Member Author

oprypin commented Apr 11, 2020

Mmm no idea what's up with CI today. Surely some of these failures are not real

@RX14
Copy link
Member

RX14 commented Apr 11, 2020

I don't like the amount of Process.shell_quote used everywhere. It feels sort of like manual SQL escaping: super easy to miss one, and overly verbose. That was replaced with prepared queries, which is basically what Process.new("command", args, shell: true) does: you can use $1, $2, $@ etc.

@RX14
Copy link
Member

RX14 commented Apr 11, 2020

github's CI is broken today, the others are fine.

@oprypin
Copy link
Member Author

oprypin commented Apr 11, 2020

don't like the amount of Process.shell_quote used everywhere

Feel free to rework all of these cases separately. I'm not adding a new problem with this.

@straight-shoota
Copy link
Member

Looks good 👍

@RX14 This might be a bit controversial, but we could consider calling shell_quote implicitly for string interpolation in ` literals.
That's probably the intended usage in most cases. Constructing shell commands with unquoted components would no longer work this way. For that you'd have to manually construct the command string first.
It would also only apply to ` literals, leaving similar methdods like Process.run, system etc. behave differently. This could cause unnoticed security issues when switching from "safe" ` literals to other methods because the same code would implicitly turn unsafe.

@oprypin
Copy link
Member Author

oprypin commented Apr 11, 2020

So the CI problem is real. Reproduced it locally with

make
CC="cc -fuse-ld=lld"  ./bin/crystal build --threads 1  --exclude-warnings spec/std --exclude-warnings spec/compiler -o .build/std_spec spec/std_spec.cr

Now the final linker command is passed all as one arg to the shell, and, being 162174 bytes long, it surpasses MAX_ARG_STRLEN (131072). This limitation is separate from ARG_MAX (2092454) for the sum of all lengths.

While this is cause to reconsider this approach, I'd like to point out that the limit on the commands is much much smaller on Windows so we probably want to limit this command's size anyway, for example by not linking so many separate files (almost 4000 in this case) at once.

@oprypin
Copy link
Member Author

oprypin commented Apr 11, 2020

I backed out of the change that caused the CI failure, and what I'm guessing is the main source of worry from @RX14 's comment

@oprypin oprypin changed the title Add Process.shell_quote and fix shell usages in the compiler Add Process.quote and fix shell usages in the compiler Apr 12, 2020
@oprypin
Copy link
Member Author

oprypin commented Apr 12, 2020

I realized that this Windows quoting method works only for CreateProcess, it's not good enough for the shell.
There actually doesn't seem to be any escaping that's always good enough for CMD, e.g. there's no way to insert a newline.

I renamed the methods accordingly and :nodoc:'d the Windows-specific one.

So I'll just keep this method reserved for internal use with shell: true, which on Windows I intend to not actually put through any shell. We can postpone this PR.

@oprypin
Copy link
Member Author

oprypin commented Apr 14, 2020

So this change is no longer blocking Windows support, but it'd still be really nice to have.

And apply it throughout the compiler for better safety.
Comment on lines +21 to +24
# Shell-quotes one item, same as `quote({arg})`.
def self.quote(arg : String) : String
quote({arg})
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Shell-quotes one item, same as `quote({arg})`.
def self.quote(arg : String) : String
quote({arg})
end
# Shell-quotes given items, same as `quote(*args)`.
def self.quote(*args : String) : String
quote(args)
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can be sure that I considered this option, it's not like I didn't know about it.

But I'm wary about the effect of proliferating these splats on the compilation times.

And I also want to stress the case of quoting one item as something that one may need specifically. If there are multiple items, it's nice to make the user think, is this a List situation or is it a Tuple situation.

I don't know. I don't feel that strongly about this though.

args.join(' ') do |arg|
if arg.empty?
"''"
elsif arg.matches? %r([^a-zA-Z0-9%+,\-./:=@_]) # not all characters are safe, needs quoting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC % and : can have special meaning in posix shell.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not as far as I'm aware.

There's the command ":" but that's not affecting anything.
And oh huh, guess if one wanted to run an executable called "%" (which is in PATH) in Bash, they wouldn't be able to. That's Bash though, not POSIX.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the following may need to be quoted under certain circumstances. That is, these characters may be special depending on conditions described elsewhere in this volume of IEEE Std 1003.1-2001:

* ? [ # ˜ = %

https://pubs.opengroup.org/onlinepubs/007904875/utilities/xcu_chap02.html

It doesn't give more detail about the circumstances. So maybe it's okay.

@oprypin
Copy link
Member Author

oprypin commented Apr 24, 2020

describe "quote_windows" do
it { Process.quote_windows("").should eq %("") }
it { Process.quote_windows(" ").should eq %(" ") }
it { Process.quote_windows(orig = "%hi%").should eq orig }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really correct? Wouldn't %hi% expand an environment variable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  # NOTE: This is **not** safe to pass to the CMD shell.

#9043 (comment)

Copy link
Member

@straight-shoota straight-shoota Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add this more clearly to the method documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, maybe I just didn't understand it correctly what's already there. So you can leave it as is unless someone else finds this should be more detailled.

quote_posix({arg})
end

# :nodoc:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this method nodoc? Even if it might have less use cases, I think it should still be considered documented.

@oprypin
Copy link
Member Author

oprypin commented May 5, 2020

Just found some independent "research" that matches my finding that it's impossible to safely quote for CMD shell.
https://github.com/actions/toolkit/blob/6a744be7ed9bab4b95ca7600529b7ce5ced1e54a/packages/exec/src/toolrunner.ts#L237

So it's good that I'm not attempting to solve that here. The quote_windows function here is still really useful to be available. Realized another example. Tools can sometimes accept the command line from a file instead. cl.exe is one. So if we were to refactor the code forming linker flags to keep them as an array until the end to pass them to the subprocess, we'd still have to explicitly use this function to convert them to a string here

File.write(args_filename, args_bytes)
cmd = "#{CL} @#{args_filename}"

and this is not specific just to Crystal's implementation

@bcardiff
Copy link
Member

bcardiff commented May 6, 2020

I like the idea of making the compiler escape interpolation between `#{expr}`, yet that requires first this PR and then the compiler logic in a separate release

If we eventually want to do that we would need in this PR a ShellString type to mark the safe strings. Making them safe to concat directly in `#{expr}`, otherwise they need to be escaped. This is similar to html template engines where the user is not forced to explicitly escape interpolated values. I haven't ported this kind of design to Crystal yet, not sure if there would be any surprised along the road.

I guess this will come handle here also #8900 (comment)

I find escape more appropiate than quote. Since it will quote if needed.

@straight-shoota
Copy link
Member

FYI: I've been using a SafeString type with crinja. Works pretty well. Obviously it can be a bit confusing when you need to take care of strings that are not String. Maybe a Stringlike module could be an improvement to that at some point. But for now it's not that bad and in the context of shell escaping it would probably be restricted to a relative tight scope.

@oprypin
Copy link
Member Author

oprypin commented May 6, 2020

Well, I don't think this PR is blocked on any such future consideration.

I didn't want to name it "escape" because I don't feel it's appropriate for the multi-argument usage. Important to insert quotes in between them, not escaping the spaces.

I don't feel that strongly about it but maybe at least one more vote that it's good to change from quote to escape

@oprypin
Copy link
Member Author

oprypin commented May 6, 2020

I do appreciate the perspective that the action here is escaping but it just happens to be implemented through quoting.

I may also be biased by Python using the name "quote".

Still, though, for Windows this must be done by quoting. And the name "escape" may give a false sense of security that it's safe to use in the shell.

@oprypin
Copy link
Member Author

oprypin commented May 19, 2020

Please consider this for 0.35.0, as I think some follow-up ideas voiced here can't be done before there's a released compiler with this.
This can also be used in Shards nicely (and it needs Windows work too).
And it's a nice security fix in general.

@straight-shoota
Copy link
Member

... another conflict =)

@oprypin

This comment has been minimized.

@bcardiff
Copy link
Member

I'm a bit lost with something. Why is Process.quote not used in Crystal::System::Process.prepare_args ? Wouldn't that be desired?

@oprypin
Copy link
Member Author

oprypin commented May 26, 2020

But it is used! in src/crystal/system/win32/process.cr

Not for posix though, the concerns there are indeed separate.

@oprypin
Copy link
Member Author

oprypin commented May 26, 2020

For a fuller explanation:

A command can be passed either as a list of args or as one string (this is what the Process API exposes).
On POSIX the native format is a list of args. On Windows the native format is one string.

Because the recommended way to pass args is a list of args, we already have the implementation to convert them to a single string on Windows; this PR only moves it to a public location.

For POSIX there's no conversion required, it's already a list.

However, since we do also have a way to specify a command as one string (for which on Windows the conversion is no-op), we need special support for that on POSIX. We defer that to the shell, and that implementation has been there since the beginning of Crystal.

The reason that the "quote" functions should be public, even on POSIX, is for the use case where one wants "the best of both worlds": conveniently specify an arbitrary string-based command, but also safely intersperse a list of arguments into it. That needs to be used by the caller beforehand, and is never needed by any internal implementation in Process on POSIX.

@bcardiff
Copy link
Member

Thanks for the clarification @oprypin

@bcardiff bcardiff added this to the 0.35.0 milestone May 26, 2020
.reverse!
.skip(10)
.each { |name| `rm -rf "#{name}"` rescue nil }
.each { |name| `rm -rf -- #{Process.quote(name)}` rescue nil }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the -- needed/introduced here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. Well it's not directly related to the overall theme of quoting, but does fix a security-related issue. If there was a file named just "-i", rm would see it as a flag, and the quoting wouldn't help either. I don't think it can actually happen here because cache names are controlled by Crystal, but still good practice to always add the double dash to indicate the end of flags.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just do Dir.delete instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, so is it adding the -- then, but I'm not saying you have to do it in this PR. Just asking if there is any reason this was made this way. You were doing the analysis of parts of the code that require Process.quote and it could be fixed in a different way instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could've split it into a separate commit but they're ignored under the current merging process.
I could've split it into a separate PR, but the process overhead (mainly, but not exclusively, due to GitHub) is too much.
So I just ignore it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So but we could/should have a PR that changes this to Dir.delete?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it'd be good. But I am not interested in sending that out. Maybe the whole codebase could be swept for such things.

@bcardiff bcardiff merged commit d60a1db into crystal-lang:master May 28, 2020
end

private def self.quote_windows(io : IO, args)
args.join(' ', io) do |arg|
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaaa I missed the argument swap

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we fail CI on warnings? Or maybe we already do on non-Windows

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because there are tests that check that the deprecated methods still work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we --exclude-warnings spec/std. Maybe it's not 100% reliable though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should be improved. Not sure if it's not reliable or just not configured properly...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the only reliable way is to fail on warnings on std_spec, and add deprecated_spec target were deprecations can happen. That way we can error on std_spec.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled in #9369

bcardiff added a commit to bcardiff/crystal that referenced this pull request May 28, 2020
bcardiff pushed a commit that referenced this pull request May 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants