Call populate_args only if we actually need command-line arguments #112
I'm not a huge fan of depending on the compiler workaround in this way, especially since using a wrapper function like this is not the only way to implement the workaround in the compiler (I would have preferred to simply rewrite main's signature in place). I'm also not convinced that the presence of the args in main's signature really tells us whether the program uses them, but I guess this will catch at least some such users. But if this helps deal with some of the code size issues raised in WebAssembly/WASI#109, then... maybe?
This isn't related to the discussion in Rewriting main's signature in place isn't easy, because C programs can call their own
argv is the only way to get the arguments in C/C++, so if it's not present, arguments aren't being used. This will break three-arg main users; however, neither POSIX nor ISO requires any implementation to support that form, such users will get a signature-mismatch warning from the linker, and it's easy to fix in a portable way.
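For reference, one portable fix for a three-argument `main` (an assumption about what is meant above, not something spelled out in this thread) is to read the environment through POSIX `environ` or `getenv` instead of an `envp` parameter. A minimal sketch, where `lookup_in` and `lookup_env` are hypothetical helper names:

```c
#include <stddef.h>
#include <string.h>

/* POSIX-declared pointer to the process environment; using it means
 * `main` no longer needs a third `envp` parameter. */
extern char **environ;

/* Hypothetical helper: find NAME in a NULL-terminated "NAME=value" array
 * and return a pointer to the value, or NULL if NAME is absent. */
const char *lookup_in(char *const *envp, const char *name) {
    size_t len = strlen(name);
    if (envp == NULL)
        return NULL;
    for (; *envp != NULL; ++envp) {
        /* Require an exact name match followed by '='. */
        if (strncmp(*envp, name, len) == 0 && (*envp)[len] == '=')
            return *envp + len + 1;
    }
    return NULL;
}

/* Hypothetical helper: the same lookup against the real environment. */
const char *lookup_env(const char *name) {
    return lookup_in(environ, name);
}
```

A program written as `int main(int argc, char *argv[], char *envp[])` could then drop the third parameter and call `lookup_env("HOME")` (or simply `getenv("HOME")`) wherever it previously walked `envp`.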
What I meant was that there will be programs that declare a two-argument main but never reference the arguments. So this method is conservative.
We are actively looking at moving the |
True. A natural place to implement that last piece of the optimization would be in LLVM, around where WebAssemblyFixFunctionBitcasts.cpp is already detecting the "main" function. That leads me to:
That won't break |
Branch updated from 3c70d3a to 49985e0.
I split out the first patch from this PR in #118 for easier reviewing.
Branch updated from 49985e0 to f0b8708.
#118 has now landed, and this PR is now rebased on top of it, so it's a simpler PR now!
This avoids linking in the argv/argc initialization code, and the __wasi_args_sizes_get and __wasi_args_get imports, in programs that don't use command-line arguments. The way this works is, if the user writes `int main(int argc, char *argv[])`, the argument initialization code is loaded, and if they write `int main(void)`, it's not loaded. This promotes the `__original_main` mechanism into an effective contract between the compiler and libc, which wasn't its original purpose, however it seems to fit this purpose quite well.
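To illustrate the libc-side half of that contract, here is a minimal, host-portable sketch of a routine that queries the argument sizes, allocates storage, fills in `argv`, and then calls a two-argument `main`. It is not the code in this PR: `host_args_sizes_get` and `host_args_get` are hypothetical stand-ins that loosely mirror the `__wasi_args_sizes_get` and `__wasi_args_get` imports, `app_main` stands in for the application's `main`, and `sketch_original_main` and `SKETCH_INIT_FAILED` are made-up names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the WASI argument imports, backed by a fixed
 * list so the sketch runs on any host. */
static const char *const fake_args[] = { "prog", "--verbose" };
enum { FAKE_ARGC = 2, SKETCH_INIT_FAILED = 70 };

int host_args_sizes_get(size_t *argc, size_t *buf_size) {
    size_t total = 0;
    for (size_t i = 0; i < FAKE_ARGC; ++i)
        total += strlen(fake_args[i]) + 1; /* count each terminating NUL */
    *argc = FAKE_ARGC;
    *buf_size = total;
    return 0;
}

int host_args_get(char **argv, char *buf) {
    for (size_t i = 0; i < FAKE_ARGC; ++i) {
        size_t n = strlen(fake_args[i]) + 1;
        memcpy(buf, fake_args[i], n); /* pack strings back to back */
        argv[i] = buf;
        buf += n;
    }
    return 0;
}

/* Stand-in for the application's two-argument main. */
int app_main(int argc, char *argv[]) {
    return (argc == 2 && strcmp(argv[1], "--verbose") == 0) ? 0 : 1;
}

/* Sketch of the libc routine: only programs that need argc/argv would
 * pull this in, and with it the two host calls above. */
int sketch_original_main(void) {
    size_t argc = 0, buf_size = 0;
    if (host_args_sizes_get(&argc, &buf_size) != 0)
        return SKETCH_INIT_FAILED;

    char *buf = malloc(buf_size ? buf_size : 1);
    char **argv = calloc(argc + 1, sizeof *argv); /* argv[argc] stays NULL */
    if (buf == NULL || argv == NULL || host_args_get(argv, buf) != 0) {
        free(argv);
        free(buf);
        return SKETCH_INIT_FAILED;
    }

    int r = app_main((int)argc, argv);
    free(argv);
    free(buf);
    return r;
}
```

In the scheme described above, a program written as `int main(void)` would never reference a routine like this, so neither it nor the two argument imports would end up in the final binary.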
And the CI now passes :-).
@@ -0,0 +1,53 @@
#include <wasi/core.h>
This filename now seems misleading. How about `__wasilibc_original_main`, or at least something with `main` in it?
Good point. How about `__original_main.c`, since that's the main symbol it defines?
Edit: no pun intended, oof.
Branch updated from f0b8708 to f355f8e.
libc-bottom-half/crt/crt1.c
Outdated
int r = main(argc, argv);
// Call `__original_main` which will either be a compiler-synthesized
// function which calls `main` with no arguments, or a libc routine
// which populates `argv` and `argc` and calls `main` with them.
Is this comment accurate?
Is this more like: "... will either be the application's zero-argument `main` function (renamed by the compiler) or a libc routine which populates `argv` and `argc` and calls the application's two-argument `main`."
Yeah, I guess I was trying to keep it abstract here, but it isn't essential. I've now updated the comment to your suggested wording.
Both of those tests deliberately execute a use-after-free, intending to test asan functionality, but if I understand how the waterfall works, it's not testing asan here, so those aren't meaningful regressions.
Ah, you're right. We shouldn't run those sanitizer test cases, or care about their results, after all.
We don't run sanitizers on them, so all sanitizer tests (ubsan, asan, tsan, ...) may start to fail at any point, and we shouldn't care about the results. For now I've updated the expectations to match the current behavior, but if this happens frequently enough, we may need to fix the runner script so that we can skip those tests or something. For reference, the two asan tests started to fail after WebAssembly/wasi-libc#112.