Track indirect call types in RemoveUnusedModuleElements #7728

Merged
merged 66 commits into from
Jul 22, 2025
18d95f4
start
kripken Jul 15, 2025
5f2cfb8
undo
kripken Jul 15, 2025
be2e8fe
test
kripken Jul 15, 2025
4153a08
test
kripken Jul 15, 2025
e0d75af
test
kripken Jul 15, 2025
c5489b6
test
kripken Jul 15, 2025
d9c01ef
test
kripken Jul 15, 2025
1f8514f
start
kripken Jul 15, 2025
c589ae4
work
kripken Jul 15, 2025
8a60593
why
kripken Jul 15, 2025
89714e3
work
kripken Jul 15, 2025
7dbf367
work
kripken Jul 15, 2025
c904c90
clean
kripken Jul 15, 2025
19ad733
fix
kripken Jul 15, 2025
390b6d3
work
kripken Jul 15, 2025
2fbdb11
work
kripken Jul 15, 2025
5aba444
work
kripken Jul 15, 2025
d5503ad
work
kripken Jul 15, 2025
47ed63f
work
kripken Jul 15, 2025
8ec2d3e
work
kripken Jul 15, 2025
5bee346
fix
kripken Jul 15, 2025
7d4cb5a
fix
kripken Jul 15, 2025
4a2a95e
fix
kripken Jul 15, 2025
b14ac74
fix
kripken Jul 15, 2025
3afce5d
fix
kripken Jul 15, 2025
baf760e
fix
kripken Jul 15, 2025
6491fcf
fix
kripken Jul 15, 2025
2588372
fix
kripken Jul 15, 2025
f35f4b9
compiles
kripken Jul 15, 2025
f49b2f6
fix
kripken Jul 15, 2025
e677093
form
kripken Jul 15, 2025
c87d50d
test
kripken Jul 15, 2025
f8758ef
test
kripken Jul 15, 2025
acf5751
test
kripken Jul 15, 2025
a5a21b4
test
kripken Jul 15, 2025
feba2ca
test
kripken Jul 15, 2025
a04a273
fix
kripken Jul 15, 2025
963dac9
work
kripken Jul 15, 2025
bcc9611
work
kripken Jul 15, 2025
3260e91
work
kripken Jul 15, 2025
8e93782
work
kripken Jul 15, 2025
5a4a83f
test
kripken Jul 15, 2025
2303c67
add table.init test
kripken Jul 16, 2025
962e62f
reverse
kripken Jul 16, 2025
7c0cbb9
testing
kripken Jul 16, 2025
509eadd
Merge remote-tracking branch 'origin/main' into rume.ci.type
kripken Jul 17, 2025
69b59cf
add a test with an imported table
kripken Jul 17, 2025
686afe0
clean up
kripken Jul 17, 2025
e5b28e4
update test
kripken Jul 17, 2025
197e7ce
update test
kripken Jul 17, 2025
00d82bf
update test
kripken Jul 17, 2025
54f7e1c
update test
kripken Jul 17, 2025
bf9d439
update test
kripken Jul 17, 2025
57aee14
update test
kripken Jul 17, 2025
ddc2394
clarify use/reference, and add table.get support
kripken Jul 17, 2025
f297cb0
format
kripken Jul 17, 2025
326fd4f
update test
kripken Jul 17, 2025
752d426
simplify and clarify
kripken Jul 17, 2025
76ab43f
fix lit test after we optimized 'too well'
kripken Jul 17, 2025
bb7b7a1
Apply suggestions from code review
kripken Jul 21, 2025
4939877
Update src/passes/RemoveUnusedModuleElements.cpp
kripken Jul 21, 2025
4659932
rename useIndirectCall/etc. to processIndirectcall/etc.
kripken Jul 21, 2025
2adeb9c
format
kripken Jul 21, 2025
4b99fae
add some comments
kripken Jul 21, 2025
7489b27
Merge remote-tracking branch 'myself/rume.ci.type' into rume.ci.type
kripken Jul 21, 2025
88b6383
merge/simplify terms
kripken Jul 22, 2025
170 changes: 116 additions & 54 deletions src/passes/RemoveUnusedModuleElements.cpp
@@ -24,7 +24,8 @@
// * No references at all. We can simply remove it.
// * References, but no uses. We can't remove it, but we can change it (see
// below).
// * Uses (which imply references). We must keep it as it is.
// * Uses (which imply references). We must keep it as it is, because it is
// fully used (e.g. for a function, it is called and may execute).
//
// An example of something with a reference but *not* a use is a RefFunc to a
// function that has no corresponding CallRef to that type. We cannot just
@@ -62,25 +63,38 @@ using ModuleElementKind = ModuleItemKind;
// name of the particular element.
using ModuleElement = std::pair<ModuleElementKind, Name>;

// Information from an indirect call: the name of the table, and the heap type.
using IndirectCall = std::pair<Name, HeapType>;

// Visit or walk an expression to find what things are referenced.
struct ReferenceFinder
: public PostWalker<ReferenceFinder,
UnifiedExpressionVisitor<ReferenceFinder>> {
// Our findings are placed in these data structures, which the user of this
// code can then process.
std::vector<ModuleElement> elements;
// code can then process. We mark both uses and references, and also note
// uses of specific things that require special handling, like refFuncs.
std::vector<ModuleElement> used, referenced;
std::vector<HeapType> callRefTypes;
std::vector<Name> refFuncs;
std::vector<StructField> structFields;
std::vector<IndirectCall> indirectCalls;

// Add an item to the output data structures.
void note(ModuleElement element) { elements.push_back(element); }
void noteCallRef(HeapType type) { callRefTypes.push_back(type); }
void noteRefFunc(Name refFunc) { refFuncs.push_back(refFunc); }
void note(StructField structField) { structFields.push_back(structField); }

// Generic visitor
void use(ModuleElement element) { used.push_back(element); }
void reference(ModuleElement element) { referenced.push_back(element); }
void useCallRef(HeapType type) { callRefTypes.push_back(type); }
void useRefFunc(Name refFunc) { refFuncs.push_back(refFunc); }
void useStructField(StructField structField) {
structFields.push_back(structField);
}
void useIndirectCall(Name table, HeapType type) {
indirectCalls.push_back({table, type});
}

// Generic visitor: Use all the things referenced. This handles e.g. using the
// table of a table.get. When we do not want such unconditional use, we
// override (e.g. for call_indirect, we don't want to mark the entire table as
// used, see below).
void visitExpression(Expression* curr) {
#define DELEGATE_ID curr->_id

@@ -101,7 +115,7 @@ struct ReferenceFinder

#define DELEGATE_FIELD_NAME_KIND(id, field, kind) \
if (cast->field.is()) { \
note({kind, cast->field}); \
use({kind, cast->field}); \
}

#include "wasm-delegations-fields.def"
@@ -110,7 +124,7 @@ struct ReferenceFinder
// Specific visitors

void visitCall(Call* curr) {
note({ModuleElementKind::Function, curr->target});
use({ModuleElementKind::Function, curr->target});

if (Intrinsics(*getModule()).isCallWithoutEffects(curr)) {
// A call-without-effects receives a function reference and calls it, the
@@ -137,12 +151,15 @@ struct ReferenceFinder
}

void visitCallIndirect(CallIndirect* curr) {
note({ModuleElementKind::Table, curr->table});
// We refer to the table, but may not use all parts of it, that depends on
// the heap type we call with.
reference({ModuleElementKind::Table, curr->table});
useIndirectCall(curr->table, curr->heapType);
// Note a possible call of a function reference as well, as something might
// be written into the table during runtime. With precise tracking of what
// is written into the table we could do better here; we could also see
// which tables are immutable. TODO
noteCallRef(curr->heapType);
useCallRef(curr->heapType);
}

void visitCallRef(CallRef* curr) {
@@ -151,17 +168,17 @@ struct ReferenceFinder
return;
}

noteCallRef(curr->target->type.getHeapType());
useCallRef(curr->target->type.getHeapType());
}

void visitRefFunc(RefFunc* curr) { noteRefFunc(curr->func); }
void visitRefFunc(RefFunc* curr) { useRefFunc(curr->func); }

void visitStructGet(StructGet* curr) {
if (curr->ref->type == Type::unreachable || curr->ref->type.isNull()) {
return;
}
auto type = curr->ref->type.getHeapType();
note(StructField{type, curr->index});
useStructField(StructField{type, curr->index});
}
};

@@ -262,9 +279,12 @@ struct Analyzer {
ReferenceFinder finder;
finder.setModule(module);
finder.visit(curr);
for (auto element : finder.elements) {
for (auto element : finder.used) {
use(element);
}
for (auto element : finder.referenced) {
reference(element);
}
for (auto type : finder.callRefTypes) {
useCallRefType(type);
}
@@ -274,6 +294,9 @@ struct Analyzer {
for (auto structField : finder.structFields) {
useStructField(structField);
}
for (auto call : finder.indirectCalls) {
useIndirectCall(call);
}

// Scan the children to continue our work.
scanChildren(curr);
@@ -316,6 +339,37 @@ struct Analyzer {
}
}

std::unordered_set<IndirectCall> usedIndirectCalls;

void useIndirectCall(IndirectCall call) {
auto [_, inserted] = usedIndirectCalls.insert(call);
if (!inserted) {
return;
}

// TODO: use structured bindings with c++20, needed for the capture below
auto table = call.first;
auto type = call.second;

// Any function in the table of that signature may be called.
ModuleUtils::iterTableSegments(
*module, table, [&](ElementSegment* segment) {
auto segmentReferenced = false;
for (auto* item : segment->data) {
if (auto* refFunc = item->dynCast<RefFunc>()) {
auto* func = module->getFunction(refFunc->func);
if (HeapType::isSubType(func->type, type)) {
use({ModuleElementKind::Function, refFunc->func});
segmentReferenced = true;
}
}
}
if (segmentReferenced) {
reference({ModuleElementKind::ElementSegment, segment->name});
}
});
}

void useRefFunc(Name func) {
if (!options.closedWorld) {
// The world is open, so assume the worst and something (inside or outside
@@ -341,7 +395,7 @@ struct Analyzer {
// We've never seen a CallRef for this, but might see one later.
uncalledRefFuncMap[type].insert(func);

referenced.insert(element);
reference(element);
}
}

@@ -554,34 +608,11 @@ struct Analyzer {
finder.setModule(module);
finder.walk(curr);

for (auto element : finder.elements) {
// Avoid repeated work. Note that globals with multiple references to
// previous globals can lead to exponential work, so this is important.
// (If C refers twice to B, and B refers twice to A, then when we process
// C we would, naively, scan B twice and A four times.)
auto [_, inserted] = referenced.insert(element);
if (!inserted) {
continue;
}

auto& [kind, value] = element;
if (kind == ModuleElementKind::Global) {
// Like functions, (non-imported) globals have contents. For functions,
// things are simple: if a function ends up with references but no uses
// then we can simply empty out the function (by setting its body to an
// unreachable). We don't have a simple way to do the same for globals,
// unfortunately. For now, scan the global's contents and add references
// as needed.
// TODO: We could try to empty the global out, for example, replace it
// with a null if it is nullable, or replace all gets of it with
// something else, but that is not trivial.
auto* global = module->getGlobal(value);
if (!global->imported()) {
// Note that infinite recursion is not a danger here since a global
// can only refer to previous globals.
addReferences(global->init);
}
}
for (auto element : finder.used) {
reference(element);
}
for (auto element : finder.referenced) {
reference(element);
}

for (auto func : finder.refFuncs) {
@@ -594,7 +625,7 @@ struct Analyzer {
// just adding a reference to the function, and not actually using the
// RefFunc. (Only useRefFunc() + a CallRef of the proper type are enough
// to make a function itself used.)
referenced.insert({ModuleElementKind::Function, func});
reference({ModuleElementKind::Function, func});
}

// Note: nothing to do with |callRefTypes| and |structFields|, which only
@@ -603,6 +634,44 @@ struct Analyzer {
// handled in an entirely different way in Binaryen IR, and we don't need to
// worry about it.)
}

void reference(ModuleElement element) {
// Avoid repeated work. Note that globals with multiple references to
// previous globals can lead to exponential work, so this is important.
// (If C refers twice to B, and B refers twice to A, then when we process
// C we would, naively, scan B twice and A four times.)
auto [_, inserted] = referenced.insert(element);
if (!inserted) {
return;
}

// Some references force references to their internals, just by being
// referenced and present in the output.
auto& [kind, value] = element;
if (kind == ModuleElementKind::Global) {
// Like functions, (non-imported) globals have contents. For functions,
// things are simple: if a function ends up with references but no uses
// then we can simply empty out the function (by setting its body to an
// unreachable). We don't have a simple way to do the same for globals,
// unfortunately. For now, scan the global's contents and add references
// as needed.
// TODO: We could try to empty the global out, for example, replace it
// with a null if it is nullable, or replace all gets of it with
// something else, but that is not trivial.
auto* global = module->getGlobal(value);
if (!global->imported()) {
// Note that infinite recursion is not a danger here since a global
// can only refer to previous globals.
addReferences(global->init);
}
} else if (kind == ModuleElementKind::ElementSegment) {
// TODO: We could empty out parts of the segment we don't need.
auto* segment = module->getElementSegment(value);
for (auto* item : segment->data) {
addReferences(item);
}
}
}
};

struct RemoveUnusedModuleElements : public Pass {
@@ -709,13 +778,6 @@ struct RemoveUnusedModuleElements : public Pass {
}
});

// For now, all functions that can be called indirectly are marked as roots.
// TODO: Compute this based on which ElementSegments are actually used,
// and which functions have a call_indirect of the proper type.
ElementUtils::iterAllElementFunctionNames(module, [&](Name name) {
roots.emplace_back(ModuleElementKind::Function, name);
});

// Just as out-of-bound segments may cause observable traps at instantiation
// time, so can struct.new instructions with null descriptors cause traps in
// global or element segment initializers.
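
The core idea of this change (a function placed in a table counts as used only once some call_indirect on that table has a matching type) can be sketched outside Binaryen as follows. This is a minimal illustration under stated assumptions, not the pass's actual code: `ToyModule`, `ToyAnalyzer`, and the string-based names are hypothetical, element segments are flattened into per-table function lists, and exact type equality stands in for the `HeapType::isSubType` check used in the diff.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for Binaryen's Name and HeapType.
using TableName = std::string;
using FuncName = std::string;
using TypeName = std::string;

// Mirrors the diff's IndirectCall: the table name and the called type.
using IndirectCall = std::pair<TableName, TypeName>;

struct ToyModule {
  // The signature type of each function.
  std::map<FuncName, TypeName> funcTypes;
  // The functions that element segments place into each table.
  std::map<TableName, std::vector<FuncName>> tableContents;
};

struct ToyAnalyzer {
  const ToyModule& mod;
  std::set<IndirectCall> usedIndirectCalls;
  std::set<FuncName> usedFuncs;

  explicit ToyAnalyzer(const ToyModule& m) : mod(m) {}

  void useIndirectCall(const IndirectCall& call) {
    // Process each (table, type) pair only once.
    if (!usedIndirectCalls.insert(call).second) {
      return;
    }
    auto iter = mod.tableContents.find(call.first);
    if (iter == mod.tableContents.end()) {
      return;
    }
    // Any function in the table with that type may be called. (Assumption:
    // equality replaces the subtype check of the real pass.)
    for (const auto& func : iter->second) {
      if (mod.funcTypes.at(func) == call.second) {
        usedFuncs.insert(func);
      }
    }
  }
};
```

In this sketch, a table holding functions of types `v` and `i` with a single indirect call of type `v` leaves the `i` function unused, which corresponds to the functions removed from the expected test outputs in this PR.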
14 changes: 2 additions & 12 deletions test/ctor-eval/bad-indirect-call3.wast.out
@@ -1,19 +1,9 @@
(module
(type $0 (func (param externref)))
(type $1 (func))
(type $0 (func))
(type $funcref_=>_none (func (param funcref)))
(memory $0 256 256)
(data $0 (i32.const 10) "waka waka waka waka waka")
(table $0 1 1 funcref)
(elem $implicit-elem (i32.const 0) $callee)
(export "sig_mismatch" (func $sig_mismatch))
(func $callee (type $0) (param $0 externref)
(i32.store8
(i32.const 40)
(i32.const 67)
)
)
(func $sig_mismatch (type $1)
(func $sig_mismatch (type $0)
(call_indirect $0 (type $funcref_=>_none)
(ref.null nofunc)
(i32.const 0)
3 changes: 2 additions & 1 deletion test/ctor-eval/basics-flatten.wast
@@ -1,6 +1,6 @@
(module
(type $v (func))
(memory 256 256)
(memory $m 256 256)
;; test flattening of multiple segments
(data (i32.const 10) "waka ")
(data (i32.const 15) "waka") ;; skip a byte here
@@ -10,6 +10,7 @@
(export "test1" (func $test1))
(export "test2" (func $test2))
(export "test3" (func $test3))
(export "memory" (memory $m)) ;; export memory so we can see the flattened data
(func $test1
(drop (i32.const 0)) ;; no work at all, really
(call $safe-to-call) ;; safe to call
10 changes: 2 additions & 8 deletions test/ctor-eval/basics-flatten.wast.out
@@ -1,11 +1,5 @@
(module
(type $v (func))
(memory $0 256 256)
(memory $m 256 256)
(data $0 (i32.const 10) "nas\00\00\00aka\00yzkx waka wakm\00\00\00\00\00\00C")
(func $call-indirect (type $v)
(i32.store8
(i32.const 40)
(i32.const 67)
)
)
(export "memory" (memory $m))
)
3 changes: 2 additions & 1 deletion test/ctor-eval/basics.wast
@@ -1,12 +1,13 @@
(module
(type $v (func))
(memory 256 256)
(memory $m 256 256)
(data (i32.const 10) "waka waka waka waka waka")
(table 1 1 funcref)
(elem (i32.const 0) $call-indirect)
(export "test1" (func $test1))
(export "test2" (func $test2))
(export "test3" (func $test3))
(export "memory" (memory $m)) ;; export memory so we can see the updated data
(func $test1
(drop (i32.const 0)) ;; no work at all, really
(call $safe-to-call) ;; safe to call
10 changes: 2 additions & 8 deletions test/ctor-eval/basics.wast.out
@@ -1,11 +1,5 @@
(module
(type $v (func))
(memory $0 256 256)
(memory $m 256 256)
(data $0 (i32.const 10) "nas\00\00\00aka yzkx waka wakm\00\00\00\00\00\00C")
(func $call-indirect (type $v)
(i32.store8
(i32.const 40)
(i32.const 67)
)
)
(export "memory" (memory $m))
)
3 changes: 2 additions & 1 deletion test/ctor-eval/indirect-call3.wast
@@ -1,11 +1,12 @@
(module
(import "env" "_abort" (func $_abort))
(type $v (func))
(memory 256 256)
(memory $m 256 256)
(data (i32.const 10) "waka waka waka waka waka")
(table 2 2 funcref)
(elem (i32.const 0) $_abort $call-indirect)
(export "test1" (func $test1))
(export "memory" (memory $m)) ;; export memory so we can see the updated data
(func $test1
(call_indirect (type $v) (i32.const 1)) ;; safe to call
(i32.store8 (i32.const 20) (i32.const 120))