[wasm-opt] Split functions that break the limit #2308
Comments
This might be nice to add, yeah. It's not trivial, though: breaking up a function is quite hard in general (handling loops, etc., and picking a split boundary that is not crossed often and that has few locals that need copying). Older emscripten had an "outlining" pass for this, which did something similar, and it was very hard to get right. Perhaps it's easier to convince browsers to raise the 7MB limit ;) Did you have a specific size that you hit this with? |
Thanks for the amazing project and sorry for bumping such an old issue. In Chicory we do hit the JVM limits for a single method size. Having a way to arbitrarily split huge Wasm functions (e.g. the CPython interpreter loop) is extremely appealing to us.
|
There isn't specific progress, though there is work on an outlining pass which might end up reducing code size. It's not intended to handle this situation, though (in particular it might not work on code with branches, locals, etc.). Is there no JVM flag to avoid the issue for you? If not, is this due to machine-generated code perhaps, that can be adjusted (that is usually the case in emscripten bug reports)? If not, options could be
|
Thanks a lot @kripken for getting back!
Unfortunately not, this is a hard limit very deeply encoded.
Do you mean that it's the compiler itself that should be able to split the function in the first place?
Given the data points that I have, both apply: the issue is often triggered by Go code compiled to WASM while, usually, with C or Rust code |
Often that is the case, yes, if it is autogenerated source code. Adjusting the autogenerator to emit smaller functions is often possible, at least in the bug reports emscripten gets about this. But it sounds like you see this on CPython compiled by clang? That would mean an issue in LLVM itself then, likely with no such simple fix. Does |
Thanks again for the feedback!
It does in most cases (e.g. with
Do you have a link to spare so that I can see how the process goes in emscripten? 🙏 |
Searching by the error is probably best if you want to see old issues. The error on such huge functions is typically due to the size in bytes, e.g. emscripten-core/emscripten#16690 or the number of locals, emscripten-core/emscripten#18159 |
Ok, found 10 minutes to provide one example reproducer with Prism:
I'll try to provide something based on Go soon. |
Found an easily inspectable Go package!
In this case
|
The prism function
Looking at it, those 804 blocks are deeply nested, with lots of code in the tails, a typical switch pattern:
(block
  (block
    (block
      ..
      code1
      br
    )
    code2
    br
  )
  code3
  br
)
Splitting code1,2,3 out requires pulling code out of nested blocks in the middle, and handling locals and branches in those places. @tlively @ashleynh can the current Outlining pass do that? Note that the goal here is just to split up, not to find common code patterns for code size reasons. |
We can't outline anything that uses locals or has control flow out of the outlined region at the moment. There also isn't a great interface for telling the outliner to outline a particular sequence of expressions yet. I think the simplest and most general thing to do here would be to turn each basic block in the huge function into a separate function and turn branches into tail calls. It would be possible to get fancier and try to optimize the partitioning of the CFG, but that's what I would start with. |
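To make the "one function per basic block, branches become tail calls" idea concrete, here is a minimal sketch in Python. It is only a toy model, not Binaryen code: the function names (`block_entry`, `run`, `sum_below`) and the `state` dict standing in for the original function's locals are all hypothetical. Python has no guaranteed tail calls, so a small trampoline loop emulates them; in Wasm the returned block would instead be the target of a tail call.

```python
# Toy model of splitting one function with a loop into per-basic-block functions.
# Every name here is hypothetical and exists only for illustration.

def block_entry(state):
    # The original function's locals live in `state`, threaded through each piece.
    state["i"] = 0
    state["acc"] = 0
    return block_loop_head  # branch to the next block, modeled as a "tail call"

def block_loop_head(state):
    # Conditional branch: pick the successor block.
    if state["i"] < state["n"]:
        return block_loop_body
    return block_exit

def block_loop_body(state):
    state["acc"] += state["i"]
    state["i"] += 1
    return block_loop_head  # back edge of the loop

def block_exit(state):
    state["result"] = state["acc"]
    return None  # no successor: the original function returns here

def run(entry, state):
    # Trampoline standing in for real tail calls.
    block = entry
    while block is not None:
        block = block(state)
    return state["result"]

def sum_below(n):
    # Equivalent to: acc = 0; for i in range(n): acc += i; return acc
    return run(block_entry, {"n": n})
```

Each piece stays small regardless of how large the original function was, at the cost of routing every branch through a call. A smarter partitioning of the CFG would merge blocks to reduce that overhead.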
All browsers (and Node) seem to implement the same size limit for functions (somewhere around the 7MB mark).
How about a pass for wasm-opt that breaks long functions up into smaller pieces? AFAICT, this should be doable by partitioning the big function and potentially passing the state of locals from one function to the next. WDYT?
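As a rough illustration of the partitioning idea for the simplest case (straight-line code, no branches), the sketch below splits one function body into two pieces and passes the locals from one to the next. It is written in Python only for readability; `Locals`, `part1`, `part2` and `split` are hypothetical names, and a real pass would work on Wasm locals and would also have to deal with control flow.

```python
from typing import NamedTuple

class Locals(NamedTuple):
    # Hypothetical bundle of the original function's locals.
    x: int
    y: int
    tmp: int

def original(x, y):
    # The "big" function before splitting.
    tmp = x * 2
    y = y + tmp
    tmp = y - x
    x = tmp * tmp
    return x + y

def part1(s: Locals) -> Locals:
    # First half of the body; returns the updated locals.
    tmp = s.x * 2
    y = s.y + tmp
    return Locals(s.x, y, tmp)

def part2(s: Locals) -> int:
    # Second half; consumes the locals produced by part1.
    tmp = s.y - s.x
    x = tmp * tmp
    return x + s.y

def split(x, y):
    # Same result as original(), but each piece is smaller.
    return part2(part1(Locals(x, y, 0)))
```

Only the locals that are live at the split point actually need to be passed (here `tmp` is overwritten before use in `part2`, so it could be dropped), which is why a boundary with few live locals is cheaper.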