Rewrite inlining pass #1935
Conversation
I've pushed a fixup to the testsuite.
We need a changelog entry.
I'm not certain I read the benchmark correctly.
Maybe we can wait for #1962 to get better measurements.
We don't have the latest benchmark. With this PR, we seem to double the time spent in inline. We can probably live with that.
Right, the aggressive inlining of functors does not really seem to result in any runtime improvement with js_of_ocaml. So it is now enabled only with wasm_of_ocaml.
I've pushed commits to only inline (small) functors at O3 with jsoo. Let's wait for the benchmarks.
fannkuch_redux and fft seem to take longer now. Can you take a look? Compilation time increases everywhere, but I guess we could live with that given the recent improvements everywhere else.
Inlining small functions makes a significant difference for
Are you ok to merge in the current state?
@TyOverby, any update on this?
We've been trying to import these changes (well, really the base revision, so that we have a good point to compare benchmarks with) and have hit a very large number of conflicts with our internal patches due to the recent PRs that have been merged. I think we're close to being ready to test this PR; my guess is next week.
I would prefer to wait for some feedback from Ty.
We were able to pull this in internally and performance looks very good! Substantially faster and more consistent on PRT and our other internal benchmarks. For Bonsai benchmarks we are seeing a 50%-80% reduction in benchmark times. Binary size looks like a <1% increase in separate compilation and 0-2% in whole program. There are a couple of outlier programs that increase in the 10-16% range. We've reached out about a miscompilation issue on Slack; initially we believed it was unrelated to this PR directly, but it looks like applying this patch actually causes a very similar miscompilation in a program that didn't have it before. This one new case is the only test we have failing, and I suspect that if we resolve the minimal repro for the original issue, we might also see how to resolve this new instance in this PR.
For the Bonsai benchmarks, I suspect that the large improvements are due to the inlining-related memory leak being resolved by this PR.
Anything to say on compilation times?
- We are a lot more aggressive at inlining functor-like functions in wasm_of_ocaml, since this may enable further optimizations
- We are more cautious at inlining nested functions, since this can result in memory leaks
- We inline a larger class of small functions
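To illustrate the first point, here is a minimal OCaml sketch of the kind of code the "functor-like function" heuristic targets; the functor and module names are invented for illustration and are not taken from the PR. A functor application compiles to a call to a function that mostly allocates a block of closures, and inlining that call lets later passes turn accesses through the resulting module into direct calls — the kind of follow-on optimization the more aggressive wasm_of_ocaml setting is after.

```ocaml
(* Illustrative only: [MakeMax] and [Int_max] are invented names.
   The functor compiles to a function that allocates a block of
   closures; inlining its application at [Int_max] exposes [max]
   (and the call to [Int.compare]) to further optimization. *)
module type Ord = sig
  type t
  val compare : t -> t -> int
end

module MakeMax (O : Ord) = struct
  let max a b = if O.compare a b >= 0 then a else b
end

module Int_max = MakeMax (Int)

let () = Printf.printf "%d\n" (Int_max.max 3 7)
```

Whether a given application is actually inlined still depends on the pass's size heuristics; the sketch only shows the shape of code involved.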
Let's merge and move on from there.