Add Process.quote and fix shell usages in the compiler #9043
|
Mmm no idea what's up with CI today. Surely some of these failures are not real |
|
I don't like the amount of |
|
github's CI is broken today, the others are fine. |
Feel free to rework all of these cases separately. I'm not adding a new problem with this. |
|
Looks good 👍 @RX14 This might be a bit controversial, but we could consider calling |
|
So the CI problem is real. Reproduced it locally with make
CC="cc -fuse-ld=lld" ./bin/crystal build --threads 1 --exclude-warnings spec/std --exclude-warnings spec/compiler -o .build/std_spec spec/std_spec.cr
Now the final linker command is passed all as one arg to the shell and, being 162174 bytes long, it surpasses the length limit.
While this is cause to reconsider this approach, I'd like to point out that the limit on commands is much, much smaller on Windows, so we probably want to limit this command's size anyway, for example by not linking so many separate files (almost 4000 in this case) at once. |
|
I backed out of the change that caused the CI failure, and what I'm guessing is the main source of worry from @RX14 's comment |
|
I realized that this Windows quoting method works only for CreateProcess, it's not good enough for the shell. I renamed the methods accordingly and :nodoc:'d the Windows-specific one. So I'll just keep this method reserved for internal use with |
|
So this change is no longer blocking Windows support, but it'd still be really nice to have. |
And apply it throughout the compiler for better safety.
| # Shell-quotes one item, same as `quote({arg})`. | ||
| def self.quote(arg : String) : String | ||
| quote({arg}) | ||
| end |
| # Shell-quotes one item, same as `quote({arg})`. | |
| def self.quote(arg : String) : String | |
| quote({arg}) | |
| end | |
| # Shell-quotes given items, same as `quote(*args)`. | |
| def self.quote(*args : String) : String | |
| quote(args) | |
| end |
You can be sure that I considered this option, it's not like I didn't know about it.
But I'm wary about the effect of proliferating these splats on the compilation times.
And I also want to stress the case of quoting one item as something that one may need specifically. If there are multiple items, it's nice to make the user think, is this a List situation or is it a Tuple situation.
I don't know. I don't feel that strongly about this though.
| args.join(' ') do |arg| | ||
| if arg.empty? | ||
| "''" | ||
| elsif arg.matches? %r([^a-zA-Z0-9%+,\-./:=@_]) # not all characters are safe, needs quoting |
IIRC % and : can have special meaning in posix shell.
Not as far as I'm aware.
There's the command ":" but that's not affecting anything.
And oh huh, guess if one wanted to run an executable called "%" (which is in PATH) in Bash, they wouldn't be able to. That's Bash though, not POSIX.
the following may need to be quoted under certain circumstances. That is, these characters may be special depending on conditions described elsewhere in this volume of IEEE Std 1003.1-2001:
* ? [ # ~ = %
https://pubs.opengroup.org/onlinepubs/007904875/utilities/xcu_chap02.html
It doesn't give more detail about the circumstances. So maybe it's okay.
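For reference, the quoting rule under discussion can be sketched as a standalone function. This is a minimal sketch and not the PR's exact code: the name sketch_quote_posix is hypothetical, the safe-character regex is copied from the diff above, and the handling of embedded single quotes (close the quote, emit a double-quoted ', reopen) is an assumption about one common way to do it.

```crystal
# Hypothetical helper illustrating the rule from the diff; not the PR's actual method.
def sketch_quote_posix(args : Enumerable(String)) : String
  args.join(' ') do |arg|
    if arg.empty?
      "''" # an empty argument must still show up as a word
    elsif arg.matches? %r([^a-zA-Z0-9%+,\-./:=@_])
      # Assumption: wrap in single quotes and splice each embedded ' as '"'"'
      "'" + arg.gsub("'", %('"'"')) + "'"
    else
      arg # only characters from the safe set, pass through unchanged
    end
  end
end

puts sketch_quote_posix({"a:b", "50%", "hello world", "", "it's"})
# prints: a:b 50% 'hello world' '' 'it'"'"'s'
```

With this rule, % and : pass through unquoted, which is the behavior being questioned in this thread.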
|
Just found in the wild |
| describe "quote_windows" do | ||
| it { Process.quote_windows("").should eq %("") } | ||
| it { Process.quote_windows(" ").should eq %(" ") } | ||
| it { Process.quote_windows(orig = "%hi%").should eq orig } |
Is this really correct? Wouldn't %hi% expand an environment variable?
# NOTE: This is **not** safe to pass to the CMD shell.
Please add this more clearly to the method documentation.
Hm, maybe I just didn't correctly understand what's already there. So you can leave it as is unless someone else finds this should be more detailed.
| quote_posix({arg}) | ||
| end | ||
|
|
||
| # :nodoc: |
Why is this method nodoc? Even if it might have fewer use cases, I think it should still be considered documented.
|
Just found some independent "research" that matches my finding that it's impossible to safely quote for the CMD shell. So it's good that I'm not attempting to solve that here. The quote_windows function here is still really useful to have available.
Realized another example: tools can sometimes accept the command line from a file instead.
crystal/src/compiler/crystal/compiler.cr, lines 341 to 342 in 2cbe65b
and this is not specific just to Crystal's implementation |
|
I like the idea of making the compiler escape interpolation between
If we eventually want to do that we would need in this PR a
I guess this will come in handy here also: #8900 (comment)
I find escape more appropriate than quote, since it will quote if needed. |
|
FYI: I've been using a |
|
Well, I don't think this PR is blocked on any such future consideration. I didn't want to name it "escape" because I don't feel it's appropriate for the multi-argument usage. Important to insert quotes in between them, not escaping the spaces. I don't feel that strongly about it but maybe at least one more vote that it's good to change from |
|
I do appreciate the perspective that the action here is escaping but it just happens to be implemented through quoting. I may also be biased by Python using the name "quote". Still, though, for Windows this must be done by quoting. And the name "escape" may give a false sense of security that it's safe to use in the shell. |
|
Please consider this for 0.35.0, as I think some follow-up ideas voiced here can't be done before there's a released compiler with this. |
|
... another conflict =) |
|
I'm a bit lost with something. Why is |
|
But it is used! In src/crystal/system/win32/process.cr. Not for POSIX though; the concerns there are indeed separate. |
|
For a fuller explanation: A command can be passed either as a list of args or as one string (this is what the
Because the recommended way to pass args is a list of args, we already have the implementation to convert them to a single string on Windows; this PR only moves it to a public location.
For POSIX there's no conversion required, it's already a list. However, since we do also have a way to specify a command as one string (for which on Windows the conversion is a no-op), we need special support for that on POSIX. We defer that to the shell, and that implementation has been there since the beginning of Crystal.
The reason that the "quote" functions should be public, even on POSIX, is for the use case where one wants "the best of both worlds": conveniently specify an arbitrary string-based command, but also safely intersperse a list of arguments into it. That needs to be used by the caller beforehand, and is never needed by any internal implementation in |
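The "best of both worlds" case might look like the following. This is a minimal sketch, assuming a Crystal version that ships Process.quote_posix as proposed in this PR; the file names and the pipeline are made up for illustration.

```crystal
# Hypothetical inputs: a list of file names, one of which contains a space.
files = ["notes.txt", "my file.txt"]

# Quote the whole list, then splice it into a command that relies on a shell feature (a pipe).
command = "cat -- #{Process.quote_posix(files)} | wc -l"

puts command
# prints: cat -- notes.txt 'my file.txt' | wc -l
```

The resulting string is meant to be handed to the shell as a whole, for example with Process.run(command, shell: true), which is why the quoting has to be done by the caller beforehand.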
|
Thanks for the clarification @oprypin |
| .reverse! | ||
| .skip(10) | ||
| .each { |name| `rm -rf "#{name}"` rescue nil } | ||
| .each { |name| `rm -rf -- #{Process.quote(name)}` rescue nil } |
Why is the -- needed/introduced here?
Oh. Well it's not directly related to the overall theme of quoting, but does fix a security-related issue. If there was a file named just "-i", rm would see it as a flag, and the quoting wouldn't help either. I don't think it can actually happen here because cache names are controlled by Crystal, but still good practice to always add the double dash to indicate the end of flags.
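To make the point about the double dash concrete, here is a minimal sketch, again assuming Process.quote_posix is available as proposed in this PR. The name "-i" is a made-up value, and the commands are only printed, not executed.

```crystal
# Hypothetical cache entry whose name looks like an rm flag.
name = "-i"

# Quoting leaves "-i" unchanged because it contains only safe characters.
without_dashes = "rm -rf #{Process.quote_posix(name)}"
with_dashes = "rm -rf -- #{Process.quote_posix(name)}"

puts without_dashes # prints: rm -rf -i      (rm would parse -i as a flag)
puts with_dashes    # prints: rm -rf -- -i   (rm treats -i as a file name)
```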
Why not just do Dir.delete instead?
Unrelated change
I know, and so is adding the --, but I'm not saying you have to do it in this PR. Just asking if there is any reason this was made this way. You were doing the analysis of parts of the code that require Process.quote, and it could be fixed in a different way instead.
I could've split it into a separate commit but they're ignored under the current merging process.
I could've split it into a separate PR, but the process overhead (mainly, but not exclusively, due to GitHub) is too much.
So I just ignore it.
But then we could/should have a PR that changes this to Dir.delete?
Yes, it'd be good. But I am not interested in sending that out. Maybe the whole codebase could be swept for such things.
| end | ||
|
|
||
| private def self.quote_windows(io : IO, args) | ||
| args.join(' ', io) do |arg| |
Aaaa I missed the argument swap
Should we fail CI on warnings? Or maybe we already do on non-Windows
No because there are tests that check that the deprecated methods still work.
But we --exclude-warnings spec/std. Maybe it's not 100% reliable though.
Yes, this should be improved. Not sure if it's not reliable or just not configured properly...
I think that the only reliable way is to fail on warnings on std_spec, and add a deprecated_spec target where deprecations can happen. That way we can error on std_spec.
Handled in #9369
Add a method to produce a command line string from a list of args.
Examples:
Yes, all of these examples could've been written better without shell: true, but the reality is that sometimes one needs to intermix shell handling and non-shell handling, the best example of which is link_flags in the compiler, which I am also touching here (search for it in the diff!).
The source code of Process.quote_windows is required for the upcoming Process.new port on Windows. Sure, it doesn't have to be public for that case, but I found another use case for it in the compiler (search for it in the diff!), so why not publish it.
I'm intentionally not hiding quote_posix in something like Crystal::System::Process because it can have more uses (in fact, I used it immediately in the compiler here). E.g. one might want to tell a user to run a Bash command specifically and make sure it's correct and copy-pasteable, and there's no reason that the program should not be able to do that on Windows.
Both quote_posix and quote_windows are available on all systems, then quote just selects for the current OS.
Removing the usage of Process.new(, args, shell: true) from the compiler is also a requirement for Windows porting, as this combination won't be supported.
Additionally, I did a quick search throughout the codebase for shell misuses which would surely fail eventually (with likely security vulnerabilities!) if any atypical path was involved, and applied the new method there. This just tends to happen more often on Windows, which made me notice.
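The "selects for the current OS" part could be sketched roughly as below. This is only an illustration under assumptions: the wrapper name quote_for_current_os is hypothetical, it presumes Process.quote_posix and Process.quote_windows exist as described in this PR, and it uses a compile-time flag check as one plausible way to pick between them.

```crystal
# Hypothetical wrapper; the name quote_for_current_os is made up for illustration.
def quote_for_current_os(args : Enumerable(String)) : String
  # Assumption: the choice is made at compile time based on the target platform.
  {% if flag?(:win32) %}
    Process.quote_windows(args)
  {% else %}
    Process.quote_posix(args)
  {% end %}
end

# The result depends on the platform the program was compiled for.
puts quote_for_current_os({"hello world"})
```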