Skip to content

Remove FunctionType #2510

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Dec 12, 2019
Merged

Remove FunctionType #2510

merged 18 commits into from
Dec 12, 2019

Conversation

tlively
Copy link
Member

@tlively tlively commented Dec 6, 2019

Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.

Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.

The C and JS APIs and their tests also had to be updated to remove
named function types.

Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.

Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.

The C and JS APIs and their tests also had to be updated to remove
named function types.
@tlively tlively requested review from kripken and aheejin December 6, 2019 03:07
@tlively
Copy link
Member Author

tlively commented Dec 6, 2019

This needs to be rebased, but is otherwise good to go (i.e. tests pass).

@kripken
Copy link
Member

kripken commented Dec 6, 2019

Great! Starting to read now.

Btw, I'd recommend running the emscripten test suite on this, if you haven't already.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work!

Some initial comments until tomorrow.

for (Index i = 0; i < curr->operands.size(); i++) {
ret->operands.push_back(copy(curr->operands[i]));
}
ret->type = curr->type;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed because makeCallIndirect no longer receives the type? I think perhaps it should still receive it. We don't want to risk people forgetting to set it after calling (make* should return a fully-formed node), and in general, where the type can't be inferred from the inputs, we add the type as an extra input (e.g. makeLocalGet, makeCall).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makeCallIndirect performs finalization, which fills out the type using the passed Signature. The problem here is that if any of the operands is unreachable, the type should also be unreachable, but since the operands are added after finalization we need to propagate that information manually.

The code would be cleaner if the operands were passed as part of the builder call so finalization can take care of this. I'll make that change.

typedef std::unordered_map<Signature, size_t> Counts;
ModuleUtils::ParallelFunctionAnalysis<Counts> analysis(
wasm, [&](Function* func, Counts& counts) {
if (func->imported()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need imported function's types too?

return;
}
struct TypeCounter : PostWalker<TypeCounter> {
Module& wasm;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for wasm in this entire struct?

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(another set of comments)

@@ -1865,18 +1880,18 @@ struct PrintSExpression : public OverriddenVisitor<PrintSExpression> {
o << ')';
}
// Module-level visitors
void visitFunctionType(FunctionType* curr, Name* internalName = nullptr) {
void visitSignature(Signature curr, Name* funcName = nullptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a standard visitor, but it isn't, I think? maybe another name would make it less likely to confuse. like handleSignature?

if (!sigs.insert(sig).second) {
return; // sig is already in the set
}
Name name = std::string("dynCall_") + sig;
Name name = std::string("dynCall_") + "TODO: generate sig string";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the status of this TODO?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Embarrassingly skipped. Thankfully it's a trivial fix.

bool Function::isVar(Index index) {
auto base = getVarIndexBase();
// TODO: update me
return base <= index /*&& index < base + vars.size()*/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the eventual update here should be to make that last part an assertion - it is invalid to ask avoid an invalid index.

@@ -1,11 +1,11 @@
(module
(type $FUNCSIG$aa (func (param anyref) (result anyref)))
(type "anyref -> anyref" (func (param anyref) (result anyref)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is slightly unfortunate to print out a type that is not visibly used anywhere, like in this test (i like the change to not print the type on the function, though!)

can be as a followup though if it's not trivial.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't mind having this section because it's easy to ignore if I want but sometimes it's nice to have an overview of all the function types in one place. But changing this would require a separate ParallelAnalysisPass, so lets defer that discussion to a followup.

@tlively
Copy link
Member Author

tlively commented Dec 6, 2019

Thanks for you comments so far! I have addressed all of them.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me aside from the issue of the text emitted not being valid wat.

I viewed 25% of the tests and they look fine...

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice simplifying changes!

src/asm2wasm.h Outdated
if (builtin) {
import->type = builtin->name;
Signature builtin = getBuiltinSignature(import->module, import->base);
if (builtin != Signature()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does an empty signature mean? Does that mean (none, none)? If that's the case, Type::id field currently does not seem to be initialized with 0. Shouldn't we?

On second thought, do we need this condition? Can't we just do

import->sig = getBuiltinSignature(import->module, import->base);

This will include an empty signature case as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is rather ugly. Empty signature is indeed (none, none). I thought that was what the default constructor was doing, but I guess I don't know for sure. I'll change it to explicitly initialize params and results to none.

getBuiltinSignature is meant to return an optional signature, and in the absence of std::optional I was just using the default signature to mean the absence of a builtin signature. But this is an ugly hack because there presumably could at some point be a builtin JS function with signature (none, none). I will change this to take a pointer to the result signature and return a bool indicating whether the signature exists. This is a more idiomatic ugly hack.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see. About the default constructor of Type, wouldn't appending = 0; to this line be sufficient?

@@ -120,33 +126,6 @@ Type sigToType(char sig) {
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need these getSig and sigToType functions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSig still used in wasm-emscripten.cpp, but it turns out that sigToType could be deleted entirely.

// indices. Signatures are sorted in order of decreasing frequency to minize the
// size of their collective encoding. Both a vector mapping indices to
// signatures and a map mapping signatures to indices are produced.
inline void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this inline?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the bad reason that it is in a header, so the linker would yell at us if it were not inline because it is included into multiple compilation units. This is a terrible reason, but I feel that adding a new .cpp file for the implementations in this file should be a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Yeah that should be a separate PR if we decide to do that, but I don't think that's problem only for this file either (we have many header implementations throughout the codebase), so I'm not suggesting you do that or anything 😂

// order by frequency then simplicity
if (a.second != b.second) {
return a.second > b.second;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no need for else

@@ -300,7 +298,7 @@ struct AutoDrop : public WalkerPass<ExpressionStackWalker<AutoDrop>> {
void doWalkFunction(Function* curr) {
ReFinalize().walkFunctionInModule(curr, getModule());
walk(curr->body);
if (curr->result == none && curr->body->type.isConcrete()) {
if (curr->sig.results == Type::none && curr->body->type.isConcrete()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, when do we prefix types with Type:: and when we don't? Are you gonna land #2434?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still like to land #2434, but since I started this before we came to consensus on that, I am just adding Type:: on any line that this PR changes anyway. That will make rebasing and landing #2434 easier afterwards.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So your suggestion is basically wherever we add new code we append Type::?

}

Signature inlineSig{Type(params), Type(results)};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: style issue, but we don't seem to use this {} constructor call in Binaryen. (LLVM prevents it too)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unforunately this line parses as a function definition if I use parens here 🙄 I guess I can fix it by using an =.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah then this should be fine, sorry.

FunctionType* fakeFunctionType; // just co call parseTypeUse
i = parseTypeUse(s, i, fakeFunctionType, paramTypes, results);
event->sig = Signature(Type(paramTypes), results);
i = parseTypeUse(s, i, event->sig);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

if (shouldBeTrue(curr->index < getFunction()->getNumLocals(),
curr,
"local.set index must be small enough")) {
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo?

shouldBeEqual(getFunction()->getLocalType(curr->index),
curr->value->type,
curr,
"local.set type must match function");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move these into if (shouldBeTrue(curr->index < getFunction()->getNumLocals() ...)? It looks these can be checked independently from the local index failure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is now an assertion failure to call Function::getLocalType with an out-of-bounds index.

@@ -411,9 +411,8 @@ Ref Wasm2JSBuilder::processWasm(Module* wasm, Name funcName) {
asmFunc[3]->push_back(processFunction(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

params and vars does not seem to be used anywhere, so we can delete them (not this line but the line above. Github doesn't let me comment on that line...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

@tlively
Copy link
Member Author

tlively commented Dec 11, 2019

I'm currently battling bugs that are only exposed by running the Emscripten test suite (so it's a good thing I'm doing that!). I'll land this once I have those all fixed.

@kripken kripken merged commit 759c485 into WebAssembly:master Dec 12, 2019
@tlively tlively deleted the function-types branch April 24, 2020 23:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants