Skip to content

[wasm-opt] Split functions that break the limit #2308

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
surma opened this issue Aug 27, 2019 · 12 comments
Open

[wasm-opt] Split functions that break the limit #2308

surma opened this issue Aug 27, 2019 · 12 comments

Comments

@surma
Copy link
Contributor

surma commented Aug 27, 2019

All browsers (and Node) seem to implement the same size limit for functions (somewhere around the 7MB mark).

How about a pass for wasm-opt that breaks long functions up into smaller pieces? AFAICT, this should be doable by partitioning the big function and potentially passing the state of locals from one function to the next.

WDYT?

@kripken
Copy link
Member

kripken commented Aug 27, 2019

This might be nice to add, yeah. It's not trivial, though - breaking up a function is quite hard in general (handling loops, etc., avoiding breaking on a boundary that is called across a lot, that has few locals that need copying, etc.). Older emscripten had an "outlining" pass for this, which did something similar, and it was very hard to get right.

Perhaps it's easier to convince browsers to raise the 7MB limit ;) did you have a specific size that you hit this with?

@MaxGraey
Copy link
Contributor

MaxGraey commented Aug 27, 2019

Fun fact. All browsers have same limit equivalent to 7654321 bytes)
v8
WebKit
Gecko

@andreaTP
Copy link

Thanks for the amazing project and sorry for bumping such an old issue.

In Chicory we do hit the JVM limits for a single method size.

Having a way to arbitrary split huge Wasm functions(e.g. the CPython interpreter loop) is extremely appealing to us.
Now the questions:

  • is there any progress in this repo on the subject?
  • are there reference implementations? (e.g. the mentioned emscripten impl) and/or documentation/material around it?

@kripken
Copy link
Member

kripken commented Feb 19, 2025

There isn't specific progress, though there is work on an outlining pass which might end up reducing code size. It's not intended to handle this situation, though (in particular it might not work on code with branches, locals, etc.).

Is there no JVM flag to avoid the issue for you? If not, is this due to machine-generated code perhaps, that can be adjusted (that is usually the case in emscripten bug reports)?

If not, options could be

  • If this is due to very long blocks, a simple splitting pass could be written. It would just move parts of functions out, and copy/restore locals along those calls. That's all the old emscripten code did, but it's enough.
  • If this is due to nesting, perhaps the outlining pass could be generalized for this.

@andreaTP
Copy link

Thanks a lot @kripken for getting back!

Is there no JVM flag to avoid the issue for you?

Unfortunately not, this is a hard limit very deeply encoded.

If not, is this due to machine-generated code perhaps, that can be adjusted

Do you mean that it's the compiler itself that should be able to split the function in first place?
Considering the example of CPython are you suggesting opening an issue in the wasi-sdk repo?

  • If this is due to very long blocks ...
  • If this is due to nesting ...

Given the data points that I have, both applies, the issue is often triggered by Go code compiled to WASM while, usually, with C or Rust code wasm-opt is already reducing the size of the methods enough to avoid the limit.

@kripken
Copy link
Member

kripken commented Feb 19, 2025

Do you mean that it's the compiler itself that should be able to split the function in first place?

Often that is the case, yes, if it is autogenerated source code. Adjusting the autogenerator to emit smaller functions is often possible, at least in the bug reports emscripten gets about this.

But it sounds like you see this on CPython compiled by clang? That would mean an issue in LLVM itself then, likely with no such simple fix.

Does wasm-opt -O3 or -Oz not help here? If not, please attach the wasm file, as the shape will determine what I can recommend.

@andreaTP
Copy link

Thanks again for the feedback!

Does wasm-opt -O3 not help here?

It does in most cases(e.g. with CPython), I'll update this issue as soon as I have a new example (probably ruby/prism is a good starting point).

at least in the bug reports emscripten gets about this

Do you have a link to spare so that I can see how the process goes in emscripten? 🙏

@kripken
Copy link
Member

kripken commented Feb 19, 2025

Searching by the error is probably best if you want to see old issues. The error on such huge functions is typically due to the size in bytes, e.g.

emscripten-core/emscripten#16690

or the number of locals,

emscripten-core/emscripten#18159
emscripten-core/emscripten#19346
emscripten-core/emscripten#21847

@andreaTP
Copy link

Ok, found 10 minutes to provide one example reproducer with Prism:

I'll try to provide something based on Go soon.

@andreaTP
Copy link

Found an easily inspectable Go package!

In this case wasm-opt takes a super long time to execute and doesn't reduce much the size:

11341875 Feb 20 18:10 esbuild-opt.wasm
12118710 Oct 26  1985 esbuild.wasm

@kripken
Copy link
Member

kripken commented Feb 20, 2025

The prism function pm_dump_json is very large indeed, here is --func-metrics:

func: pm_dump_json
 [binary-bytes] : 55984   
 [total]        : 24795   
 [vars]         : 5       
 Binary         : 3314    
 Block          : 804     
 Break          : 929     
 Call           : 2532    
 Const          : 4473    
 GlobalGet      : 696     
 GlobalSet      : 2       
 Load           : 1770    
 LocalGet       : 7575    
 LocalSet       : 1581    
 Loop           : 36      
 Store          : 765     
 Switch         : 1       
 Unary          : 317     

Looking at it, those 804 blocks are deeply nested, with lots of code in the tails, a typical switch pattern,

(block
  (block
    (block
      ..
      code1
      br
    )
    code2
    br
  )
  code3
  br
)

Splitting code1,2,3 out requires pulling code out of nested blocks in the middle, and handling locals and branches in those places.

@tlively @ashleynh can the current Outlining pass do that? Note that the goal here is just to split up, not to find common code patterns for code size reasons.

@tlively
Copy link
Member

tlively commented Feb 20, 2025

We can't outline anything that uses locals or has control flow out of the outlined region at the moment. There also isn't a great interface for telling the outliner to outline a particular sequence of expressions yet.

I think the simplest and most general thing to do here would be to turn each basic block in the huge function into a separate function and turn branches into tail calls. It would be possible to get fancier and try to optimize the partitioning of the CFG, but that's what I would start with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants