Track indirect call types in RemoveUnusedModuleElements #7728

Merged
merged 66 commits into from
Jul 22, 2025

Conversation

Member

@kripken kripken commented Jul 15, 2025

An indirect call to a type in a table now only forces functions of
that type to be marked as used: functions of other types are
left alone, potentially leaving them unreached.

This is more precise than assuming any indirect call can
reach anywhere, which is more or less what we did before.

There is a downside to this: the pass is around 10% slower. This is one
of our faster passes, so this may be acceptable, however.
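To make the idea concrete, here is a minimal sketch of type-filtered table reachability. It is a toy model, not the pass's actual code: `Func`, `Table`, `IndirectCall`, and `findUsedTableFuncs` are hypothetical names, types are compared as plain strings, and subtyping is ignored.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only; these are not Binaryen's real data structures.
struct Func {
  std::string name;
  std::string type; // signature as text, e.g. "(i32)->()"
};

struct Table {
  std::string name;
  std::vector<Func> elems; // functions placed in the table by elem segments
};

// An indirect call is summarized as (table name, called type).
using IndirectCall = std::pair<std::string, std::string>;

// Returns the names of table functions that some indirect call might reach.
std::set<std::string> findUsedTableFuncs(const std::vector<Table>& tables,
                                         const std::set<IndirectCall>& calls) {
  std::set<std::string> used;
  for (const auto& table : tables) {
    for (const auto& func : table.elems) {
      // Mark the function only if its exact type is called on this table.
      if (calls.count(IndirectCall{table.name, func.type})) {
        used.insert(func.name);
      }
    }
  }
  return used;
}
```

With a table holding one `(i32)->()` function and one `(f64)->()` function, and a single indirect call of type `(i32)->()` on that table, only the first function is marked. Under the older, coarser assumption both would have been.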

This has some benefits; here is the Emscripten diff:

Example

diff --git a/test/code_size/embind_val_wasm.json b/test/code_size/embind_val_wasm.json
index 939ad737b3..1c8b8a1546 100644
--- a/test/code_size/embind_val_wasm.json
+++ b/test/code_size/embind_val_wasm.json
@@ -1,10 +1,10 @@
 {
   "a.html": 552,
   "a.html.gz": 373,
   "a.js": 5356,
   "a.js.gz": 2526,
-  "a.wasm": 7468,
-  "a.wasm.gz": 3461,
-  "total": 13376,
-  "total_gz": 6360
+  "a.wasm": 5831,
+  "a.wasm.gz": 2713,
+  "total": 11739,
+  "total_gz": 5612
 }
diff --git a/test/code_size/random_printf_wasm.json b/test/code_size/random_printf_wasm.json
index 89da22d7c8..9685b59d93 100644
--- a/test/code_size/random_printf_wasm.json
+++ b/test/code_size/random_printf_wasm.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 12511,
-  "a.html.gz": 6848,
-  "total": 12511,
-  "total_gz": 6848
+  "a.html": 12507,
+  "a.html.gz": 6822,
+  "total": 12507,
+  "total_gz": 6822
 }
diff --git a/test/code_size/random_printf_wasm2js.json b/test/code_size/random_printf_wasm2js.json
index 5b21705c95..7d168dbd6a 100644
--- a/test/code_size/random_printf_wasm2js.json
+++ b/test/code_size/random_printf_wasm2js.json
@@ -1,6 +1,6 @@
 {
-  "a.html": 17224,
-  "a.html.gz": 7551,
-  "total": 17224,
-  "total_gz": 7551
+  "a.html": 17229,
+  "a.html.gz": 7542,
+  "total": 17229,
+  "total_gz": 7542
 }
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 82b16397a9..20191f896a 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50314
+50233
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_hello_Os.size b/test/other/codesize/test_codesize_hello_Os.size
index 1c38c9071a..9b5f360cc2 100644
--- a/test/other/codesize/test_codesize_hello_Os.size
+++ b/test/other/codesize/test_codesize_hello_Os.size
@@ -1 +1 @@
-1725
+1723
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..6bbc2a3cd4 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1257
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 4cd877762a..8755c7be20 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6615
+6611
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b0539e90d9..b339887848 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1735
+1733
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
index 86fd2dc144..7f12daaeba 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.funcs
@@ -1,2 +1 @@
-$__wasm_call_ctors
 $_start
diff --git a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
index 7296f257eb..94361d49fd 100644
--- a/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
+++ b/test/other/codesize/test_codesize_libcxxabi_message_O3_standalone.size
@@ -1 +1 @@
-136
+132
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
index ab5b9efed7..848ef7c501 100644
--- a/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_grow_standalone.size
@@ -1 +1 @@
-5553
+5549
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.funcs b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone.size b/test/other/codesize/test_codesize_mem_O3_standalone.size
index 7bcda5ba23..7e9732ae43 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone.size
@@ -1 +1 @@
-5478
+5474
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
index 05112f24d5..b54c900141 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg.size
@@ -1 +1 @@
-5271
+5267
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
index 8a606d1279..19dd45693e 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.funcs
@@ -1,3 +1,2 @@
-$__wasm_call_ctors
 $_start
 $sbrk
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
index 603c2df295..bbdd8cef02 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
@@ -1 +1 @@
-4084
+4080

One lucky embind test shrinks by 20%, but all other changes are just
a few bytes, far less than 1%. I looked at real-world codebases, and
saw no real benefit there. My hunch is that this is expected because of
signature overlap: when you generate random graphs with n nodes and
a probability p for each edge to exist, then even if p decreases toward 0
as n grows (slowly enough), the graph will tend to end up connected [1].
And, in wasm, p does not even decrease to 0:

  • Consider some common signature like {i32} -> {} (i32 param, no result).
  • In real-world code, there is some chance q>0 for that signature to be called,
    and some chance r>0 for that signature to exist in the code.
  • p >= O(rq) > 0 because all it takes for a connection to exist is that that
    signature exists on one side and is called on the other.

That is, in large codebases there is an overlap in signatures, and
statistically this means that all the code will end up reachable, at
least in the limit. In small programs you may get lucky, but not in
the long run. And even in the mid run, you will quickly see weird
stuff like a game engine's physics code seeming to be able to call
networking or audio (impossible in general, but they can overlap
on signatures).
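As a rough sketch of why a constant lower bound on p matters, recall the classical thresholds for the Erdős–Rényi model. This is only a heuristic for the call-graph case: G(n, p) is undirected with independent edges, while call graphs are directed and their edges are correlated. The constant c below is an assumed stand-in for a quantity on the order of rq.

```latex
% Classical thresholds for the Erdős–Rényi model G(n, p):
\[
  np > 1 \;\Longrightarrow\; \text{a giant component exists w.h.p.},
  \qquad
  p > \frac{(1+\varepsilon)\ln n}{n} \;\Longrightarrow\; G(n,p) \text{ is connected w.h.p.}
\]
% If p is bounded below by a constant c > 0 (c on the order of rq):
\[
  p \ge c > 0
  \;\Longrightarrow\;
  np \ge cn \gg \ln n \quad \text{as } n \to \infty .
\]
```

So with p bounded away from 0, both thresholds are exceeded by a wide margin once n is large, which matches the observation that large codebases see little benefit.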

To really fix that we need more than structural typing of indirect
calls, something like knowing the possible targets at each callsite.
Devirtualization can provide this, based on source language info.
Still, this PR may be of some benefit in some cases.

[1] https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model#Properties_of_G(n,_p)

Member

@aheejin aheejin left a comment

What's the difference between note / use / reference?

I think I understand what use and reference are... use means we use it so we have to preserve it, while reference means we may or may not use it but its name is referenced somewhere so we at least have to keep its shell (even if we empty out the contents). Not sure what note is... Can note possibly be merged with use or reference?

Comment on lines +22 to +23
;; CHECK: (type $B (sub (func (param f64))))
(type $B (sub (func (param f64))))
Member

Any reason we don't have (type $C spelled out as well?

Member Author

Hmm, it validates without it (it is defined implicitly), so I didn't think there was a need? Also I guess it adds coverage for implicitly-defined types.

@@ -12768,13 +12768,6 @@ function asmFunc(imports) {
return $1_1 | 0;
}

function f($0_1, $1_1, $2_1) {
Member

Is this change for something else?

Member Author

No, this is an optimization unlocked by this PR. It is defined here:

(func $f (param i32 i32 i32) (result i32) (i32.const -1))

It has a few direct calls, but they get optimized out in this optimized test output. Previously, I guess it remained alive because of this table reference, which we can now see has no corresponding call_indirect:

(table funcref (elem $f))

Member Author

kripken commented Jul 21, 2025

What's the difference between note / use / reference?

  • "Use" means to use something, like call a function, so we must include it fully in the output.
  • "Reference" means to refer to something without using it, like we may end up referring to a function from a table even if we know there is no call_indirect to it. We need to define the function for validation purposes, but it will not execute.
  • "Note" is one of several places we need to note something specific. This happens in the very early phase where we scan the module initially. For example, noteCallRef notes a call_ref, which might lead to uses and references, but for now we just note that a call_ref exists in the initial scan. The actual processing happens later. Each note* method has special handling for some particular case. So there is no note() like there is use(), reference(), because each of the note*() methods is special.
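The use/reference distinction can be sketched as a three-way outcome per function. This is a simplified illustration, not the pass's code: the `Analyzer` struct here, the `Fate` enum, and `fateOf` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical three-way outcome for a function after analysis.
enum class Fate { Remove, KeepShell, KeepFully };

struct Analyzer {
  // true = used, false = only referenced, absent = neither.
  std::map<std::string, bool> seen;

  // A use means the element may execute, so it must be kept in full.
  void use(const std::string& name) { seen[name] = true; }

  // A reference only requires that the element keeps existing.
  // emplace does nothing if the key exists, so it never downgrades a use.
  void reference(const std::string& name) { seen.emplace(name, false); }

  Fate fateOf(const std::string& name) const {
    auto it = seen.find(name);
    if (it == seen.end()) {
      return Fate::Remove;
    }
    return it->second ? Fate::KeepFully : Fate::KeepShell;
  }
};
```

A function that sits in a table but has no matching call_indirect would end up as `KeepShell` in this model: it still has to be defined for validation, but its body can be emptied out since it will not execute.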

Member

aheejin commented Jul 21, 2025

The only thing those note*** methods seem to do is to add them to some sets:

void noteCallRef(HeapType type) { callRefTypes.push_back(type); }
void noteRefFunc(Name refFunc) { refFuncs.push_back(refFunc); }
void noteStructField(StructField structField) {
  structFields.push_back(structField);
}
void noteIndirectCall(Name table, HeapType type) {
  indirectCalls.push_back({table, type});
}

And here in processExpressions seems the only place those sets are directly used:

for (auto type : finder.callRefTypes) {
  useCallRefType(type);
}
for (auto func : finder.refFuncs) {
  useRefFunc(func);
}
for (auto structField : finder.structFields) {
  useStructField(structField);
}
for (auto call : finder.indirectCalls) {
  useIndirectCall(call);
}

Then can't we just replace those note***s with use***?

Member Author

kripken commented Jul 21, 2025

Hmm, good point, but I'd actually prefer to rename the latter, so processExpressions calls processCallIndirect etc. The reason is that useCallIndirect is a little odd - it isn't a module element that we can use or refer to. What do you think?

Member

aheejin commented Jul 21, 2025

process sounds fine to me. I was mostly wondering whether we can merge the two.

Member Author

kripken commented Jul 21, 2025

Makes sense. I agree it's good to try to merge where possible. Here, I think it is clearer to separate module elements which can be used and referenced, from specific things that need special processing. I added a few comments for that now, and renamed to process*.

Member

aheejin commented Jul 21, 2025

Does the PR currently contain the merging? It doesn't seem to, so... By the way, I think this looks good and don't want to hold up the landing for this. That can be done later as a follow-up (or not).

Member Author

kripken commented Jul 21, 2025

Sorry, I am saying that I don't think we can merge in this case. The terms are use(), reference() for module elements, things the pass can remove like Functions. And noteX(), processX() for specific things X (not module elements, like a particular heap type used in call_indirects) that need special processing. Those things are different:

  • The pass doesn't handle removing them, though they might get removed as internal parts of other things. We don't directly track references or uses of them.
  • And those things are a little abstracted - we consider functions that have references taken of them, not specific RefFuncs, for example; and not specific call_indirects but which heap types are called in that manner (and on which tables).

So I would prefer not to apply the terms use/reference to non-module elements.

Member

aheejin commented Jul 22, 2025

What I asked was, like, for example, we currently have this:

void visitCallIndirect(CallIndirect* curr) {
  // We refer to the table, but may not use all parts of it, that depends on
  // the heap type we call with.
  reference({ModuleElementKind::Table, curr->table});
  noteIndirectCall(curr->table, curr->heapType);
  // Note a possible call of a function reference as well, as something might
  // be written into the table during runtime. With precise tracking of what
  // is written into the table we could do better here; we could also see
  // which tables are immutable. TODO
  noteCallRef(curr->heapType);
}

I was wondering if we can just do this. So "merging" was maybe not the correct term after all...
(Comments are omitted)

  void visitCallIndirect(CallIndirect* curr) {
    reference({ModuleElementKind::Table, curr->table});
    useIndirectCall({curr->table, curr->heapType});
    useCallRefType(curr->heapType);
  }

because this is what we do anyway in processExpressions. We already have useIndirectCall, so I'm not sure what you mean by indirect calls are not something we use.

But while note was confusing to me, that can be a better abstraction. Anyway, I don't have a strong opinion, so feel free to land it.

Member Author

kripken commented Jul 22, 2025

@aheejin Ok, fair enough, maybe I was making things more complicated by trying to use those terms in different ways. I merged things now, avoiding note*() and process*() for items, using use*() instead, which is more uniform, and I guess those are basically uses of those things, even though the pass doesn't DCE them.

@kripken kripken merged commit dd473d4 into WebAssembly:main Jul 22, 2025
16 checks passed
@kripken kripken deleted the rume.ci.type branch July 22, 2025 16:44
Member

aheejin commented Jul 22, 2025

@aheejin Ok, fair enough, maybe I was making things more complicated by trying to use those terms in different ways. I merged things now, avoiding note*() and process*() for items, using use*() instead, which is more uniform, and I guess those are basically uses of those things, even though the pass doesn't DCE them.

I mean, what I asked was whether we were able to remove those previous note functions and call the existing use functions directly, not renaming existing notes to uses. I'm not asking to change code again, but I feel I need to clarify.

Member Author

kripken commented Jul 22, 2025

Oh, sorry, I guess I misread you then.

Maybe we can merge this actually. I thought the idea was to eagerly scan in parallel, then combine on a single thread, but reading the code now, it looks like it works entirely on a single thread, but lazily. That is, the separation between the ReferenceFinder and Analyzer classes may not be needed (that separation is why atm we can't just directly apply uses from ReferenceFinder - we just collect them there, then Analyzer reads and uses those).
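For readers unfamiliar with the pattern being discussed, here is a minimal sketch of a lazy, single-threaded collect-then-apply loop. It only borrows the spirit of the finder/analyzer split: `FuncMap`, `Finder`, and `LazyAnalyzer` are hypothetical, and the toy "module" is just a map from each function to the functions it calls directly.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy module: each function maps to the functions it calls directly.
using FuncMap = std::map<std::string, std::vector<std::string>>;

// Step 1, in the spirit of a finder: scan one function and only collect.
struct Finder {
  std::vector<std::string> calls;

  void scan(const FuncMap& funcs, const std::string& func) {
    auto it = funcs.find(func);
    if (it != funcs.end()) {
      calls = it->second;
    }
  }
};

// Step 2, in the spirit of an analyzer: apply what the finder collected.
struct LazyAnalyzer {
  const FuncMap& funcs;
  std::set<std::string> used;
  std::vector<std::string> worklist;

  explicit LazyAnalyzer(const FuncMap& funcs) : funcs(funcs) {}

  // Queue a function for scanning the first time it is used.
  void use(const std::string& func) {
    if (used.insert(func).second) {
      worklist.push_back(func);
    }
  }

  // Lazily scan only the functions that turn out to be used.
  void run() {
    while (!worklist.empty()) {
      std::string func = worklist.back();
      worklist.pop_back();
      Finder finder;
      finder.scan(funcs, func);
      for (const auto& callee : finder.calls) {
        use(callee);
      }
    }
  }
};
```

In a loop like this, nothing forces the collection step to be a separate class: `scan` could call `use` directly. The split would pay off mainly if scanning were done eagerly and in parallel, which is the open question raised above.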

I'm not sure if it would be faster the other way, but it might be worth checking.

kripken added a commit that referenced this pull request Jul 23, 2025
…7748)

wasm-metadce does a graph analysis to find unreached things, and then
cleans up using RemoveUnusedModuleElements. That pass became more
powerful in #7728, which led to a situation where an import was removed
from the wasm, but wasm-metadce did not report that it had removed it.
This led to unneeded code in the JS (it kept sending that import,
unnecessarily). This was a harmless minor waste of JS size, but it did cause
a test error on Emscripten (#7747), as it parses that JS to check some
things, and it found an import in JS without a use in wasm.

To fix that, check if that pass removed imports, and report them.
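One way to picture that check is as a set difference over import names taken before and after the cleanup pass. This is only a sketch of the idea under that assumption, not the code from the follow-up commit; `findRemovedImports` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Returns the imports present before the cleanup pass but absent after it.
// std::set iterates in sorted order, which std::set_difference requires.
std::vector<std::string> findRemovedImports(const std::set<std::string>& before,
                                            const std::set<std::string>& after) {
  std::vector<std::string> removed;
  std::set_difference(before.begin(), before.end(),
                      after.begin(), after.end(),
                      std::back_inserter(removed));
  return removed;
}
```

Each name returned would then be reported as removed, so that the JS side can stop providing that import.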