Track indirect call types in RemoveUnusedModuleElements #7728

kripken · 2025-07-15T22:51:09Z

An indirect call to a type in a table now only forces functions of
that type to be marked as used: functions of other types are
left alone, potentially leaving them unreached.

This is more precise than assuming any indirect call can
reach anywhere, which is more or less what we did before.
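
To make the difference concrete, here is a minimal sketch on a toy model. Everything in it (`ToyFunc`, `ToyIndirectCall`, `ToyModule`, `usedByIndirectCalls`) is a hypothetical name for illustration, not Binaryen's real IR or API, and it compares types by exact equality, which ignores subtyping that a real pass would likely need to handle.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model, not Binaryen's real IR or API.
struct ToyFunc {
  std::string name;
  std::string type; // stand-in for a heap type, e.g. "(param i32)"
};

struct ToyIndirectCall {
  std::string table;
  std::string type;
};

struct ToyModule {
  // Table name -> functions that element segments place in that table.
  std::map<std::string, std::vector<ToyFunc>> tables;
  // The (table, type) pairs seen in call_indirect instructions.
  std::vector<ToyIndirectCall> indirectCalls;
};

// Returns the functions marked as used because of indirect calls. With
// `precise` false, any call on a table uses everything in that table (roughly
// the old behavior). With `precise` true, only functions whose type equals
// the called type are used.
std::set<std::string> usedByIndirectCalls(const ToyModule& toy, bool precise) {
  std::set<std::string> used;
  for (const auto& call : toy.indirectCalls) {
    auto iter = toy.tables.find(call.table);
    if (iter == toy.tables.end()) {
      continue;
    }
    for (const auto& func : iter->second) {
      if (!precise || func.type == call.type) {
        used.insert(func.name);
      }
    }
  }
  return used;
}

// Usage example: one table, two functions, one call_indirect of (param i32).
void demoIndirectCalls() {
  ToyModule toy;
  toy.tables["table"].push_back(ToyFunc{"a", "(param i32)"});
  toy.tables["table"].push_back(ToyFunc{"b", "(param f64)"});
  toy.indirectCalls.push_back(ToyIndirectCall{"table", "(param i32)"});
  assert(usedByIndirectCalls(toy, false).size() == 2); // both a and b
  assert(usedByIndirectCalls(toy, true).size() == 1);  // only a
}
```

In the precise mode, `b` stays in the table but is never marked as used, so it can be left unreached.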

There is a downside: the pass is around 10% slower. It is one of our
faster passes, though, so this may be acceptable.

This has some benefits. Here is the Emscripten diff:

Example


diff --git a/test/code_size/embind_val_wasm.json b/test/code_size/embind_val_wasm.json
index 939ad737b3..1c8b8a1546 100644
--- a/test/code_size/embind_val_wasm.json
+++ b/test/code_size/embind_val_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 373,
   "a.js": 5356,
   "a.js.gz": 2526,
-  "a.wasm": 7468,
-  "a.wasm.gz": 3461,
-  "total": 13376,
-  "total_gz": 6360
+  "a.wasm": 5831,
+  "a.wasm.gz": 2713,
+  "total": 11739,
+  "total_gz": 5612
 }
diff --git a/test/code_size/random_printf_wasm.json b/test/code_size/random_printf_wasm.json
index 89da22d7c8..9685b59d93 100644
--- a/test/code_size/random_printf_wasm.json
+++ b/test/code_size/random_printf_wasm.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 12511,
-  "a.html.gz": 6848,
-  "total": 12511,
-  "total_gz": 6848
+  "a.html": 12507,
+  "a.html.gz": 6822,
+  "total": 12507,
+  "total_gz": 6822
 }
diff --git a/test/code_size/random_printf_wasm2js.json b/test/code_size/random_printf_wasm2js.json
index 5b21705c95..7d168dbd6a 100644
--- a/test/code_size/random_printf_wasm2js.json
+++ b/test/code_size/random_printf_wasm2js.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 17224,
-  "a.html.gz": 7551,
-  "total": 17224,
-  "total_gz": 7551
+  "a.html": 17229,
+  "a.html.gz": 7542,
+  "total": 17229,
+  "total_gz": 7542
 }
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 82b16397a9..20191f896a 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50314
+50233
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_hello_Os.size b/test/other/codesize/test_codesize_hello_Os.size
index 1c38c9071a..9b5f360cc2 100644
--- a/test/other/codesize/test_codesize_hello_Os.size
+++ b/test/other/codesize/test_codesize_hello_Os.size
@@ -1 +1 @@
-1725
+1723
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..6bbc2a3cd4 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1257
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 4cd877762a..8755c7be20 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6615
+6611
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
index 86fd2dc144..7f12daaeba 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
@@ -1,2 +1 @@
-$__wasm_call_ctors
 $_start
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
index 7296f257eb..94361d49fd 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
@@ -1 +1 @@
-136
+132
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
index ab5b9efed7..848ef7c501 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
@@ -1 +1 @@
-5553
+5549
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.size b/test/other/codesize/test_codesize_mem_O3_standalone.size
index 7bcda5ba23..7e9732ae43 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.size
@@ -1 +1 @@
-5478
+5474
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
index 05112f24d5..b54c900141 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
@@ -1 +1 @@
-5271
+5267
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
index 603c2df295..bbdd8cef02 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
@@ -1 +1 @@
-4084
+4080

One lucky embind test shrinks its wasm by about 22% (7468 to 5831
bytes), but all other changes are just a few bytes, far less than 1%.
I looked at real-world codebases, and see no real benefit there. My
hunch is that this is expected because of signature overlap: when you
generate a random graph with n nodes in which each edge exists with
probability p, then even if p decreases toward 0 as n grows (as long
as it stays above roughly ln(n)/n), the graph will almost surely end
up connected [1]. And, in wasm, p does not even decrease to 0:

Consider some common signature like {i32} -> {} (i32 param, no result).
In real-world code, there is some chance q>0 for that signature to be called,
and some chance r>0 for that signature to exist in the code.
p >= O(rq) > 0 because all it takes for a connection to exist is that that
signature exists on one side and is called on the other.

That is, in large codebases there is an overlap in signatures, and
statistically this means that all the code will end up reachable, at
least in the limit. In small programs you may get lucky, but not in
the long run. And even in the mid run, you will quickly see weird
stuff like a game engine's physics code seeming to be able to call
networking or audio (impossible in general, but they can overlap
on signatures).
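
A rough way to experiment with this hunch is a small simulation. The sketch below is an assumption-laden toy, not a measurement of real code: it builds a random directed graph (a directed variant of the model in [1]) where each ordered pair of nodes has an edge with probability p, then measures which fraction of nodes is reachable from node 0. The name `reachableFraction` and its parameters are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Fraction of nodes reachable from node 0 in a random directed graph with n
// nodes, where each ordered pair (i, j), i != j, gets an edge with
// probability p. The seed makes a run repeatable on a given implementation.
double reachableFraction(std::size_t n, double p, unsigned seed) {
  assert(n > 0);
  std::mt19937 rng(seed);
  std::bernoulli_distribution hasEdge(p);

  // Build the adjacency lists.
  std::vector<std::vector<std::size_t>> successors(n);
  for (std::size_t i = 0; i < n; i++) {
    for (std::size_t j = 0; j < n; j++) {
      if (i != j && hasEdge(rng)) {
        successors[i].push_back(j);
      }
    }
  }

  // Flood from node 0 using a simple work list.
  std::vector<bool> seen(n, false);
  std::vector<std::size_t> work;
  work.push_back(0);
  seen[0] = true;
  std::size_t reached = 1;
  while (!work.empty()) {
    std::size_t curr = work.back();
    work.pop_back();
    for (std::size_t next : successors[curr]) {
      if (!seen[next]) {
        seen[next] = true;
        reached++;
        work.push_back(next);
      }
    }
  }
  return double(reached) / double(n);
}
```

Calling it with growing n and p a little above ln(n)/n would be one way to check whether the reachable fraction approaches 1, as the argument above suggests it should.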

To really fix that we need more than structural typing of indirect
calls, something like knowing the possible targets at each callsite.
Devirtualization can provide this, based on source language info.
Still, this PR may be of some benefit in some cases.

[1] https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model#Properties_of_G(n,_p)

aheejin

What's the difference between note / use / reference?

I think I understand what use and reference are... use means we use it so we have to preserve it, while reference means we may or may not use it but its name is referenced somewhere so we at least have to keep its shell (even if we empty out the contents). Not sure what note is... Can note possibly be merged with use or reference?

test/lit/passes/remove-unused-module-elements_ci-types.wast

aheejin · 2025-07-21T20:18:31Z

test/lit/passes/remove-unused-module-elements_ci-types.wast

+ ;; CHECK:      (type $B (sub (func (param f64))))
+ (type $B (sub (func (param f64))))


Any reason we don't have (type $C spelled out as well?

Hmm, it validates without it (it is defined implicitly), so I didn't think there was a need? Also I guess it adds coverage for implicitly-defined types.

aheejin · 2025-07-21T22:21:35Z

test/wasm2js/br_table_temp.2asm.js

@@ -12768,13 +12768,6 @@ function asmFunc(imports) {
  return $1_1 | 0;
 }

- function f($0_1, $1_1, $2_1) {


Is this change for something else?

No, this is an optimization unlocked by this PR. It is defined here:

binaryen/test/wasm2js/br_table_temp.wast

Line 971 in 76ab43f

(func $f (param i32 i32 i32) (result i32) (i32.const -1))

It has a few direct calls, but they get optimized out in this optimized test output. Previously, I guess it remained alive because of this table reference, which we can now see has no corresponding call_indirect:

binaryen/test/wasm2js/br_table_temp.wast

Line 995 in 76ab43f

(table funcref (elem $f))

src/passes/RemoveUnusedModuleElements.cpp

kripken · 2025-07-21T22:58:08Z

> What's the difference between note / use / reference?

"Use" means to use something, like call a function, so we must include it fully in the output.
"Reference" means to refer to something without using it, like we may end up referring to a function from a table even if we know there is no call_indirect to it. We need to define the function for validation purposes, but it will not execute.
"Note" is one of several places we need to note something specific. This happens in the very early phase where we scan the module initially. For example, noteCallRef notes a call_ref, which might lead to uses and references, but for now we just note that a call_ref exists in the initial scan. The actual processing happens later. Each note* method has special handling for some particular case. So there is no note() like there are use() and reference(), because each of the note*() methods is special.

Co-authored-by: Heejin Ahn <[email protected]>
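
The use / reference distinction described above can be sketched as follows. This is a hypothetical toy (`ToyAnalysis`, `Fate`, and `fateOf` are invented names, not the pass's real code): a used element is kept fully, a merely referenced one is kept as a shell so the module still validates, and anything else can be removed.

```cpp
#include <cassert>
#include <set>
#include <string>

// What happens to a module element at the end of the analysis.
enum class Fate { Removed, KeptAsStub, KeptFully };

// Hypothetical toy version of the bookkeeping; names are illustrative only.
struct ToyAnalysis {
  std::set<std::string> used;
  std::set<std::string> referenced;

  // The element may execute, so it must be kept with its full contents.
  void use(const std::string& name) { used.insert(name); }

  // The element is only mentioned (e.g. from a table), so its definition
  // must exist for validation, but its contents can be emptied out.
  void reference(const std::string& name) { referenced.insert(name); }

  Fate fateOf(const std::string& name) const {
    if (used.count(name)) {
      return Fate::KeptFully;
    }
    if (referenced.count(name)) {
      return Fate::KeptAsStub;
    }
    return Fate::Removed;
  }
};

// Usage example: a use always wins over a mere reference.
void demoUseVersusReference() {
  ToyAnalysis analysis;
  analysis.use("called");
  analysis.reference("called");
  analysis.reference("onlyInTable");
  assert(analysis.fateOf("called") == Fate::KeptFully);
  assert(analysis.fateOf("onlyInTable") == Fate::KeptAsStub);
  assert(analysis.fateOf("neverSeen") == Fate::Removed);
}
```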

aheejin · 2025-07-21T23:04:39Z

The only thing those note*** methods seem to do is to add them to some sets:

binaryen/src/passes/RemoveUnusedModuleElements.cpp

Lines 83 to 90 in bb7b7a1

    
    void noteCallRef(HeapType type) { callRefTypes.push_back(type); }
    void noteRefFunc(Name refFunc) { refFuncs.push_back(refFunc); }
    void noteStructField(StructField structField) {
      structFields.push_back(structField);
    }
    void noteIndirectCall(Name table, HeapType type) {
      indirectCalls.push_back({table, type});
    }

And here in processExpressions seems the only place those sets are directly used:

binaryen/src/passes/RemoveUnusedModuleElements.cpp

Lines 286 to 297 in bb7b7a1

    
    for (auto type : finder.callRefTypes) {
      useCallRefType(type);
    }
    for (auto func : finder.refFuncs) {
      useRefFunc(func);
    }
    for (auto structField : finder.structFields) {
      useStructField(structField);
    }
    for (auto call : finder.indirectCalls) {
      useIndirectCall(call);
    }

Then can't we just replace those note***s with use***?

kripken · 2025-07-21T23:08:57Z

Hmm, good point, but I'd actually prefer to rename the latter, so processExpressions calls processCallIndirect etc. The reason is that useCallIndirect is a little odd - it isn't a module element that we can use or refer to. What do you think?

aheejin · 2025-07-21T23:10:25Z

process sounds fine to me. I was mostly wondering whether we can merge the two.

kripken · 2025-07-21T23:23:52Z

Makes sense. I agree it's good to try to merge where possible. Here, I think it is clearer to separate module elements which can be used and referenced, from specific things that need special processing. I added a few comments for that now, and renamed to process*.

aheejin · 2025-07-21T23:39:29Z

Does the PR currently contain the merging? It doesn't seem to. By the way, I think this looks good and I don't want to hold up the landing for this. That can be done later as a follow-up (or not).

kripken · 2025-07-21T23:58:05Z

Sorry, I am saying that I don't think we can merge in this case. The terms are use(), reference() for module elements, things the pass can remove like Functions. And noteX(), processX() for specific things X (not module elements, like a particular heap type used in call_indirects) that need special processing. Those things are different:

The pass doesn't handle removing them, though they might get removed as internal parts of other things. We don't directly track references or uses of them.
And those things are a little abstracted - we consider functions that have references taken of them, not specific RefFuncs, for example; and not specific call_indirects but which heap types are called in that manner (and on which tables).

So I would prefer not to apply the terms use/reference to non-module elements.

aheejin · 2025-07-22T01:03:17Z

What I asked was, like, for example, we currently have this:

binaryen/src/passes/RemoveUnusedModuleElements.cpp

Lines 154 to 164 in 7489b27

    
    void visitCallIndirect(CallIndirect* curr) {
      // We refer to the table, but may not use all parts of it, that depends on
      // the heap type we call with.
      reference({ModuleElementKind::Table, curr->table});
      noteIndirectCall(curr->table, curr->heapType);
      // Note a possible call of a function reference as well, as something might
      // be written into the table during runtime. With precise tracking of what
      // is written into the table we could do better here; we could also see
      // which tables are immutable. TODO
      noteCallRef(curr->heapType);
    }

I was wondering if we can just do this. So "merging" was maybe not the correct term after all...
(Comments are omitted)

  void visitCallIndirect(CallIndirect* curr) {
    reference({ModuleElementKind::Table, curr->table});
    useIndirectCall({curr->table, curr->heapType});
    useCallRefType(curr->heapType);
  }

because this is what we do anyway in processExpressions. We already have useIndirectCall, so I'm not sure what you mean by indirect calls are not something we use.

But while note was confusing to me, that can be a better abstraction. Anyway, I don't have a strong opinion, so feel free to land it.

kripken · 2025-07-22T16:01:47Z

@aheejin Ok, fair enough, maybe I was making things more complicated by trying to use those terms in different ways. I merged things now, avoiding note*() and process*() for items, using use*() instead, which is more uniform, and I guess those are basically uses of those things, even though the pass doesn't DCE them.

aheejin · 2025-07-22T22:02:41Z

> @aheejin Ok, fair enough, maybe I was making things more complicated by trying to use those terms in different ways. I merged things now, avoiding note*() and process*() for items, using use*() instead, which is more uniform, and I guess those are basically uses of those things, even though the pass doesn't DCE them.

I mean, what I asked was whether we were able to remove those previous note functions and call the existing use functions directly, not renaming existing notes to uses. I'm not asking to change code again, but I feel I need to clarify.

kripken · 2025-07-22T23:00:16Z

Oh, sorry, I guess I misread you then.

Maybe we can merge this actually. I thought the idea was to eagerly scan in parallel, then combine on a single thread, but reading the code now, it looks like it works entirely on a single-thread, but lazily. That is, the separation between the ReferenceFinder and Analyzer classes may not be needed (that separation is why atm we can't just directly apply uses from ReferenceFinder - we just collect them there, then Analyzer reads and uses those).

I'm not sure if it would be faster the other way, but it might be worth checking.
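
For readers following along, the collect-then-apply split described above can be sketched like this. It is a toy under stated assumptions, not the actual `ReferenceFinder` / `Analyzer` code: `ToyBody`, `ToyFinder`, and `ToyAnalyzer` are hypothetical names, and only direct calls and call_ref types are modeled. The finder merely records what one function contains, and the analyzer reads those records, scanning a function lazily once it is known to be used.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy function body: just the facts a scan would find in it.
struct ToyBody {
  std::vector<std::string> directCalls;
  std::vector<std::string> callRefTypes;
};

// Phase 1: scan one function and only record ("note") what it contains.
struct ToyFinder {
  std::vector<std::string> calls;
  std::vector<std::string> callRefTypes;
  void scan(const ToyBody& body) {
    calls = body.directCalls;
    callRefTypes = body.callRefTypes;
  }
};

// Phase 2: read the notes and apply them. Scanning is lazy: a function is
// scanned only once it is known to be used.
struct ToyAnalyzer {
  const std::map<std::string, ToyBody>& funcs;
  std::set<std::string> usedFuncs;
  std::set<std::string> calledTypes;
  std::vector<std::string> queue;
  int scans = 0;

  explicit ToyAnalyzer(const std::map<std::string, ToyBody>& allFuncs)
    : funcs(allFuncs) {}

  void use(const std::string& name) {
    if (usedFuncs.insert(name).second) {
      queue.push_back(name);
    }
  }

  void run(const std::string& root) {
    use(root);
    while (!queue.empty()) {
      std::string name = queue.back();
      queue.pop_back();
      auto iter = funcs.find(name);
      if (iter == funcs.end()) {
        continue;
      }
      ToyFinder finder;
      finder.scan(iter->second);
      scans++;
      for (const auto& callee : finder.calls) {
        use(callee);
      }
      for (const auto& type : finder.callRefTypes) {
        calledTypes.insert(type);
      }
    }
  }
};

// Usage example: "dead" is never reached, so it is never scanned.
void demoLazyScan() {
  std::map<std::string, ToyBody> funcs;
  funcs["main"].directCalls.push_back("helper");
  funcs["helper"].callRefTypes.push_back("T");
  funcs["dead"].callRefTypes.push_back("U");
  ToyAnalyzer analyzer(funcs);
  analyzer.run("main");
  assert(analyzer.scans == 2);
  assert(analyzer.calledTypes.count("T") == 1);
  assert(analyzer.calledTypes.count("U") == 0);
}
```

Merging the two phases would amount to having the scan call `use` directly instead of filling the finder's vectors first.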

…7748) wasm-metadce does a graph analysis to find unreached things, and then cleans up using RemoveUnusedModuleElements. That pass became more powerful in #7728, which led to a situation where an import was removed from the wasm, but wasm-metadce did not report that it had removed it. This led to unneeded code in the JS (it kept sending that import, unnecessarily). This was a harmless minor waste of JS size, but it did cause a test error on Emscripten (#7747), as it parses that JS to check some things, and it found an import in JS without a use in wasm. To fix that, check if that pass removed imports, and report them.
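
The idea of that fix can be sketched as a before/after comparison. This is only an illustration of the approach described in the commit message, not the code in wasm-metadce: `removedImports` is a hypothetical helper, and imports are modeled as plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: given the import names present before and after the
// cleanup pass ran, return the ones the pass removed, in sorted order, so
// that they can be reported to the caller.
std::vector<std::string> removedImports(const std::set<std::string>& before,
                                        const std::set<std::string>& after) {
  std::vector<std::string> removed;
  // std::set iterates in sorted order, which std::set_difference requires.
  std::set_difference(before.begin(),
                      before.end(),
                      after.begin(),
                      after.end(),
                      std::back_inserter(removed));
  return removed;
}

// Usage example: the pass dropped one of two imports.
void demoRemovedImports() {
  std::set<std::string> before = {"env.keep", "env.drop"};
  std::set<std::string> after = {"env.keep"};
  std::vector<std::string> removed = removedImports(before, after);
  assert(removed.size() == 1);
  assert(removed[0] == "env.drop");
}
```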

#24783) Followup to https://github.com/emscripten-core/emscripten/pull/24777/files#diff-59f6b346090f49748193fef565b0c7e223f93ecef5bb116fa370d0b864af9493 after WebAssembly/binaryen#7748 landed. Code size savings are from WebAssembly/binaryen#7728

kripken added 30 commits July 15, 2025 11:07

start

18d95f4

undo

5f2cfb8

test

be2e8fe

test

4153a08

test

e0d75af

test

c5489b6

test

d9c01ef

start

1f8514f

work

c589ae4

why

8a60593

work

89714e3

work

7dbf367

clean

c904c90

fix

19ad733

work

390b6d3

work

2fbdb11

work

5aba444

work

d5503ad

work

47ed63f

work

8ec2d3e

fix

5bee346

fix

7d4cb5a

fix

4a2a95e

fix

b14ac74

fix

3afce5d

fix

baf760e

fix

6491fcf

fix

2588372

compiles

f35f4b9

fix

f49b2f6

tlively approved these changes Jul 21, 2025


aheejin reviewed Jul 21, 2025


kripken and others added 2 commits July 21, 2025 15:59

Apply suggestions from code review

bb7b7a1

Co-authored-by: Heejin Ahn <[email protected]>

Update src/passes/RemoveUnusedModuleElements.cpp

4939877

Co-authored-by: Heejin Ahn <[email protected]>

kripken added 4 commits July 21, 2025 16:18

rename useIndirectCall/etc. to processIndirectcall/etc.

4659932

format

2adeb9c

add some comments

4b99fae

Merge remote-tracking branch 'myself/rume.ci.type' into rume.ci.type

7489b27

aheejin approved these changes Jul 21, 2025


merge/simplify terms

88b6383

kripken merged commit dd473d4 into WebAssembly:main Jul 22, 2025
16 checks passed

kripken deleted the rume.ci.type branch July 22, 2025 16:44

This was referenced Jul 23, 2025

Parallelize RemoveUnusedModuleElements function scanning [DO NOT LAND] #7745

Closed

[NFC] Merge the two parts of RemoveUnusedModuleElements #7746

Closed

sbc100 mentioned this pull request Jul 23, 2025

Recent Track indirect call types in RemoveUnusedModuleElements change broken emscripten test #7747

Closed

kripken mentioned this pull request Jul 23, 2025

[Metadce] Report removed imports due to RemoveUnusedModuleElements #7748

Merged

kripken mentioned this pull request Jul 24, 2025

Restore test_other.other.test_codesize_cxx_lto after Binaryen fix. NFC emscripten-core/emscripten#24783

Merged
