Skip to content

Utilize lowMemoryUnused #1103

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 13, 2020
Merged

Utilize lowMemoryUnused #1103

merged 3 commits into from
Mar 13, 2020

Conversation

dcodeIO
Copy link
Member

@dcodeIO dcodeIO commented Feb 11, 2020

This PR investigates utilizing the lowMemoryUnused option provided by Binaryen, doing (techically unsafe) optimizations like

(i32.load
 (i32.add
  (local.get $0)
  (i32.const 16)
 )
)

to

(i32.load offset=16
 (local.get $0)
)

In order to do this, the first 1KB of memory must be guaranteed to be invalid, since the 32-bit addition in the first snippet can overflow while the offset= attribute cannot. As a result, pointer additions below the 1024 bytes threshold can be optimized.

On a first glimpse this doesn't help us a lot since the compiler already emits offset= for field accesses, with just a few, yet important cases where it helps, especially within std ArrayBuffer and Array (which is expected), making it more a micro optimization for speed than anything else.

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 11, 2020

From the fixtures it actually seems that the only place where this does something useful is in the memset helper, which we could as well optimize per hand, hmm.

@MaxGraey
Copy link
Member

I'm wondering could we decrease 1024 bytes to 256 bytes which potentially enough for most of usual cases but more economical in terms of RAM consumption which is pretty important for IoT for example? I saw binaryen hardcoded this values in enum

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 11, 2020

Yeah, the value is currently hardcoded, but I guess it wouldn't be too hard to make it configurable. Might well be that we can get away with a much smaller value, as this really just appears to affect manual unsafe code. Even if we'd have huge classes with lots of fields, these would already use offsets anyway.

Currently thinking that we'd not do this optimization at all if lowMemoryLimit (from the other PR) is present, because every single byte may count there.

@MaxGraey
Copy link
Member

Yeah, we have a lot of handcrafted optimizations and low-level stuffs but for external user this optimization pass could be useful. Also it close this most oldest issue at last! =)

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 11, 2020

Not so sure about the old issue. Seems to be quite limited to memset currently, while an array access goes through an overload that doesn't appear to benefit from the same effects. Reminds me that I wanted to make bindings for the inlining constants, since tweaking these might help.

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 11, 2020

PR for inlining limits: WebAssembly/binaryen#2655

@MaxGraey
Copy link
Member

That's inlining PR will be really useful for our needs!

@dcodeIO dcodeIO requested a review from MaxGraey March 13, 2020 18:19
@dcodeIO dcodeIO merged commit b7df27c into master Mar 13, 2020
@MaxGraey
Copy link
Member

@dcodeIO it seems this optimization pass significant slowdown compilation but with pretty small benefits at least for our tests so I suggest apply it only for optimizeLevelHint >= 3. wdyt?

@dcodeIO
Copy link
Member Author

dcodeIO commented Mar 14, 2020

Yeah, turned out that our tests are relatively immune to this because stdlib loads and stores are optimized already. More about user code where immOffset is not fully utilized. Fine with upping this to >=3. Do you have any numbers on how much slower it is, or a theory why?

@MaxGraey
Copy link
Member

I made pr for this: #1169

@MaxGraey
Copy link
Member

MaxGraey commented Mar 14, 2020

in that tuned PR tests run in 82,124 ms
currently it takes: 111,206 ms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants