-
Notifications
You must be signed in to change notification settings - Fork 786
Remove FunctionType #2510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove FunctionType #2510
Conversation
Function signatures were previously redundantly stored on Function objects as well as on FunctionType objects. These two signature representations had to always be kept in sync, which was error-prone and needlessly complex. This PR takes advantage of the new ability of Type to represent multiple value types by consolidating function signatures as a pair of Types (params and results) stored on the Function object. Since there are no longer module-global named function types, significant changes had to be made to the printing and emitting of function types, as well as their parsing and manipulation in various passes. The C and JS APIs and their tests also had to be updated to remove named function types.
This needs to be rebased, but is otherwise good to go (i.e. tests pass). |
Great! Starting to read now. Btw, I'd recommend running the emscripten test suite on this, if you haven't already. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work!
Some initial comments until tomorrow.
src/ir/ExpressionManipulator.cpp
Outdated
for (Index i = 0; i < curr->operands.size(); i++) { | ||
ret->operands.push_back(copy(curr->operands[i])); | ||
} | ||
ret->type = curr->type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed because makeCallIndirect
no longer receives the type
? I think perhaps it should still receive it. We don't want to risk people forgetting to set it after calling (make*
should return a fully-formed node), and in general, where the type can't be inferred from the inputs, we add the type as an extra input (e.g. makeLocalGet
, makeCall
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makeCallIndirect
performs finalization, which fills out the type
using the passed Signature
. The problem here is that if any of the operands is unreachable, the type
should also be unreachable, but since the operands are added after finalization we need to propagate that information manually.
The code would be cleaner if the operands were passed as part of the builder call so finalization can take care of this. I'll make that change.
src/ir/module-utils.h
Outdated
typedef std::unordered_map<Signature, size_t> Counts; | ||
ModuleUtils::ParallelFunctionAnalysis<Counts> analysis( | ||
wasm, [&](Function* func, Counts& counts) { | ||
if (func->imported()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we need imported function's types too?
src/ir/module-utils.h
Outdated
return; | ||
} | ||
struct TypeCounter : PostWalker<TypeCounter> { | ||
Module& wasm; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for wasm
in this entire struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(another set of comments)
src/passes/Print.cpp
Outdated
@@ -1865,18 +1880,18 @@ struct PrintSExpression : public OverriddenVisitor<PrintSExpression> { | |||
o << ')'; | |||
} | |||
// Module-level visitors | |||
void visitFunctionType(FunctionType* curr, Name* internalName = nullptr) { | |||
void visitSignature(Signature curr, Name* funcName = nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like a standard visitor, but it isn't, I think? maybe another name would make it less likely to confuse. like handleSignature
?
src/wasm/wasm-emscripten.cpp
Outdated
if (!sigs.insert(sig).second) { | ||
return; // sig is already in the set | ||
} | ||
Name name = std::string("dynCall_") + sig; | ||
Name name = std::string("dynCall_") + "TODO: generate sig string"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the status of this TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Embarrassingly skipped. Thankfully it's a trivial fix.
src/wasm/wasm.cpp
Outdated
bool Function::isVar(Index index) { | ||
auto base = getVarIndexBase(); | ||
// TODO: update me | ||
return base <= index /*&& index < base + vars.size()*/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the eventual update here should be to make that last part an assertion - it is invalid to ask avoid an invalid index.
test/anyref.wast.from-wast
Outdated
@@ -1,11 +1,11 @@ | |||
(module | |||
(type $FUNCSIG$aa (func (param anyref) (result anyref))) | |||
(type "anyref -> anyref" (func (param anyref) (result anyref))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is slightly unfortunate to print out a type that is not visibly used anywhere, like in this test (i like the change to not print the type on the function, though!)
can be as a followup though if it's not trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't mind having this section because it's easy to ignore if I want but sometimes it's nice to have an overview of all the function types in one place. But changing this would require a separate ParallelAnalysisPass, so lets defer that discussion to a followup.
Thanks for you comments so far! I have addressed all of them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me aside from the issue of the text emitted not being valid wat.
I viewed 25% of the tests and they look fine...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice simplifying changes!
src/asm2wasm.h
Outdated
if (builtin) { | ||
import->type = builtin->name; | ||
Signature builtin = getBuiltinSignature(import->module, import->base); | ||
if (builtin != Signature()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does an empty signature mean? Does that mean (none, none)? If that's the case, Type::id
field currently does not seem to be initialized with 0. Shouldn't we?
On second thought, do we need this condition? Can't we just do
import->sig = getBuiltinSignature(import->module, import->base);
This will include an empty signature case as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is rather ugly. Empty signature is indeed (none, none). I thought that was what the default constructor was doing, but I guess I don't know for sure. I'll change it to explicitly initialize params and results to none.
getBuiltinSignature
is meant to return an optional signature, and in the absence of std::optional
I was just using the default signature to mean the absence of a builtin signature. But this is an ugly hack because there presumably could at some point be a builtin JS function with signature (none, none). I will change this to take a pointer to the result signature and return a bool indicating whether the signature exists. This is a more idiomatic ugly hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I see. About the default constructor of Type
, wouldn't appending = 0;
to this line be sufficient?
src/asmjs/asm_v_wasm.cpp
Outdated
@@ -120,33 +126,6 @@ Type sigToType(char sig) { | |||
} | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need these getSig
and sigToType
functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSig
still used in wasm-emscripten.cpp, but it turns out that sigToType
could be deleted entirely.
// indices. Signatures are sorted in order of decreasing frequency to minize the | ||
// size of their collective encoding. Both a vector mapping indices to | ||
// signatures and a map mapping signatures to indices are produced. | ||
inline void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this inline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the bad reason that it is in a header, so the linker would yell at us if it were not inline because it is included into multiple compilation units. This is a terrible reason, but I feel that adding a new .cpp file for the implementations in this file should be a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Yeah that should be a separate PR if we decide to do that, but I don't think that's problem only for this file either (we have many header implementations throughout the codebase), so I'm not suggesting you do that or anything 😂
src/ir/module-utils.h
Outdated
// order by frequency then simplicity | ||
if (a.second != b.second) { | ||
return a.second > b.second; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no need for else
@@ -300,7 +298,7 @@ struct AutoDrop : public WalkerPass<ExpressionStackWalker<AutoDrop>> { | |||
void doWalkFunction(Function* curr) { | |||
ReFinalize().walkFunctionInModule(curr, getModule()); | |||
walk(curr->body); | |||
if (curr->result == none && curr->body->type.isConcrete()) { | |||
if (curr->sig.results == Type::none && curr->body->type.isConcrete()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, when do we prefix types with Type::
and when we don't? Are you gonna land #2434?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So your suggestion is basically wherever we add new code we append Type::
?
src/wasm/wasm-s-parser.cpp
Outdated
} | ||
|
||
Signature inlineSig{Type(params), Type(results)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: style issue, but we don't seem to use this {}
constructor call in Binaryen. (LLVM prevents it too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unforunately this line parses as a function definition if I use parens here 🙄 I guess I can fix it by using an =
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah then this should be fine, sorry.
FunctionType* fakeFunctionType; // just co call parseTypeUse | ||
i = parseTypeUse(s, i, fakeFunctionType, paramTypes, results); | ||
event->sig = Signature(Type(paramTypes), results); | ||
i = parseTypeUse(s, i, event->sig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
src/wasm/wasm-validator.cpp
Outdated
if (shouldBeTrue(curr->index < getFunction()->getNumLocals(), | ||
curr, | ||
"local.set index must be small enough")) { | ||
; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo?
shouldBeEqual(getFunction()->getLocalType(curr->index), | ||
curr->value->type, | ||
curr, | ||
"local.set type must match function"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why move these into if (shouldBeTrue(curr->index < getFunction()->getNumLocals() ...)
? It looks these can be checked independently from the local index failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it is now an assertion failure to call Function::getLocalType
with an out-of-bounds index.
@@ -411,9 +411,8 @@ Ref Wasm2JSBuilder::processWasm(Module* wasm, Name funcName) { | |||
asmFunc[3]->push_back(processFunction( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
params
and vars
does not seem to be used anywhere, so we can delete them (not this line but the line above. Github doesn't let me comment on that line...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch!
This allows the signatures to be referenced by index even though they weren't named, which is apparently allowed.
I'm currently battling bugs that are only exposed by running the Emscripten test suite (so it's a good thing I'm doing that!). I'll land this once I have those all fixed. |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.