Skip to content

Initial implementation of "Memory64" proposal #3130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 18, 2020

Conversation

aardappel
Copy link
Contributor

Also includes a lot of new spec tests that eventually need to go into the spec repo.

@aardappel aardappel requested review from kripken and sbc100 September 14, 2020 20:35
@aardappel
Copy link
Contributor Author

As I mentioned, this contains a lot of fixes to test scripts and such, which aren't strictly part of Memory64 of course, but were required for me to get all this up and running on Windows. Some could be pulled out into a PR to be merged before this one (since I can't test this functionality without them), but would be simpler to just merge them here.

@aardappel aardappel force-pushed the master branch 2 times, most recently from bf6d9b2 to 39a1262 Compare September 15, 2020 00:10
@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 15, 2020

Took a look at the emcc test error fwiw and found that it is happening at

theHost.type = binaryen.f64;
theHost.finalize();
assert(theHost.type === binaryen.i32);

indicating that calling finalize() on a Host expression representing a MemoryGrow operation does not update the expression's type based on its operands etc., but keeps potentially invalid types.

@aardappel
Copy link
Contributor Author

@dcodeIO thanks so much for that insight! I did change finalize to take an extra parameter, but at the time I thought this was ok because a) there were other types doing this and b) surely I'd get a compile error if it wasn't. How naive :P

@aardappel aardappel force-pushed the master branch 2 times, most recently from c9c9af3 to 1506bbd Compare September 15, 2020 16:24
@aardappel
Copy link
Contributor Author

Ok, changed the way Host::finalize works and is called to satisfy the tests, but it is very broken. What really needs to happen is for it to use the current pointer size, which depends on the current memory/module, which is not accessible from most of these functions calling finalize. We could stick an is64 bool in the visitors calling this function, e.g. BinaryenExpressionFinalize would need to take a module argument, changing the JS API etc.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprisingly small change (tests aside)! Even makes some things simpler.

lgtm % comments

@dcodeIO
Copy link
Contributor

dcodeIO commented Sep 15, 2020

I remember that there has been the idea to refactor Host into MemorySize and MemoryGrow, matching the other Memory* expressions. Not sure how much sense it makes in context, but if something needs to be changed anyway, perhaps refactoring these and also make them take the current memory size as an argument upon construction might be helpful, so all the information necessary to finalize is present?

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Some initial comments here.

I think it would be worth splitting out the unrelated windows test improvements. I'd be fine with them all in one big "fix windows" PR.

@aardappel
Copy link
Contributor Author

Ok, non-Memory64 related changes have been split into a separate PR, please review that one first, as it is required for me to be able to test this one: #3142

@aardappel aardappel force-pushed the master branch 3 times, most recently from 1f1a95e to df53f58 Compare September 17, 2020 19:12
@aardappel
Copy link
Contributor Author

Ok, rebased to not include the Windows/test changes, and most comments above resolved, please review.

@aardappel aardappel force-pushed the master branch 4 times, most recently from a3a7267 to 78edc1a Compare September 17, 2020 21:23
@tlively
Copy link
Member

tlively commented Sep 17, 2020

Ok, changed the way Host::finalize works and is called to satisfy the tests, but it is very broken. What really needs to happen is for it to use the current pointer size, which depends on the current memory/module, which is not accessible from most of these functions calling finalize. We could stick an is64 bool in the visitors calling this function, e.g. BinaryenExpressionFinalize would need to take a module argument, changing the JS API etc.

Why is the current memory or module not accessible from functions calling finalize? Where are these callers?

@aardappel
Copy link
Contributor Author

@tlively BinaryenExpressionFinalize is an example. It would need to take a BinaryenModuleRef arg, changing the JS interface etc.

@tlively
Copy link
Member

tlively commented Sep 17, 2020

That sounds fine to me. We will have to do something similar when we expose Poppy IR to the C/JS API because that changes block validation as well. Maybe we can introduce a ValidationConfig object in the C/JS API where we can put all this validation context.

@aardappel aardappel force-pushed the master branch 2 times, most recently from 29caea2 to e1398c7 Compare September 17, 2020 22:01
@aardappel
Copy link
Contributor Author

Omg, CI beast has been slain!

@tlively
Copy link
Member

tlively commented Sep 17, 2020

@aardappel can you push additional commits instead of force pushing? It makes it hard to figure out what has been changed from version to version.

@sbc100
Copy link
Member

sbc100 commented Sep 17, 2020

@aardappel can you push additional commits instead of force pushing? It makes it hard to figure out what has been changed from version to version.

I know you are not talking to me here but as someone who uses a "rebase early and squash often" approach in all my repos I would find this request hard.

I suppose I could still do what I want with my local branch and then every time I want to push a change to github I could just push the delta between what I have in my branch and what github currently has? i.e. have my work branch and the github branch be basically unrelated. I guess I could write a script for that.

@aardappel
Copy link
Contributor Author

@tlively they are often very tiny formatting changes and what not, and make it harder for me to see locally what's all in a PR.

@tlively
Copy link
Member

tlively commented Sep 17, 2020

Ok, I don't want to mess with your normal workflow too much. GitHub's "changes since last review" feature is great, but I wish it worked with with rebases. When I need to see everything that's in a PR I usually just do git diff master (or just look at GitHub).

@aardappel
Copy link
Contributor Author

@tlively I use a local git tool (Fork).

value64 + offset64 <= uint64_t(std::numeric_limits<int32_t>::max())) {
last->value = Literal(int32_t(value64 + offset64));
if (getModule()->memory.is64) {
last->value = Literal(int64_t(value64 + offset64));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the danger is still relevant in memory64, if it wraps doesn't it trap? (what does the memory64 spec do there?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we haven't decided on that yet, see: WebAssembly/memory64#3

I'd be happy to add any checks.. but for now doing the simplest thing seems fine to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't realize the state of the spec... sounds fine then, but maybe add a comment that this is in flux?

@@ -35,10 +35,11 @@ struct NameType {

class Builder {
MixedArena& allocator;
bool is64;

public:
Builder(MixedArena& allocator) : allocator(allocator) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first constructor looks dangerous, as it won't set is64. we should maybe remove that constructor entirely (separately from this PR). meanwhile, it would be good to at least check in makeConstPtr if we know the memory size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It now takes a module, and made sure all users of Builder always supply a valid module

@@ -35,10 +35,11 @@ struct NameType {

class Builder {
MixedArena& allocator;
bool is64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest storing a Module& here instead of is64. is64 duplicates info that might get stale, if we add a pass to convert from 64 to 32 or vice versa etc.

Or this could be a Module* defaulting to nullptr which would help with the issue from a few lines down.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All callers have a valid module, so wasn't necessary to make it a *

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be inconsistent with Poppy IR, where individual functions rather than the module are the source of truth for the extra validation context. I would have preferred taking the risk of stale data. Most builders are not long-lived, anyhow, and most passes will not be changing the validation context in any way. I just uploaded #3144, but I would actually prefer to keep the allocator and use an is64 bool. @kripken, wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, why would Poppy IR change the relationship of functions to the module?

Anyhow it's late on Friday, let's discuss this in depth on Monday?

@aardappel aardappel force-pushed the master branch 2 times, most recently from cc1b2f2 to a66377a Compare September 18, 2020 20:01
Also includes a lot of new spec tests that eventually need to go into the spec repo
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

I didn't go through each spec test manually. I figure we'll go through those carefully upstream anyhow. Also once we get fuzzing integration for wasm64 that should be a good check on them.

Builder(MixedArena& allocator) : allocator(allocator) {}
Builder(Module& wasm) : allocator(wasm.allocator) {}
Builder(MixedArena& allocator, Module& wasm)
: allocator(allocator), wasm(wasm) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a followup I think we can remove this constructor. The allocator should always be from the wasm, so there is no need to pass an allocator manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did see at least one use where it seemed that a single allocator was used for potentially processing multiple modules, and I didn't know how intentional that was, so I left it in.

@@ -486,6 +488,9 @@ class Builder {
ret->finalize();
return ret;
}
Const* makeConstPtr(uint64_t val) {
return makeConst(Literal::makeFromUInt64(val, wasm.memory.indexType));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this looks!

@@ -1070,6 +1068,9 @@ class MemorySize : public SpecificExpression<Expression::MemorySizeId> {
MemorySize() { type = Type::i32; }
MemorySize(MixedArena& allocator) : MemorySize() {}

Type ptrType = Type::i32;

void make64();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we don't like adding methods to these classes, to keep them minimalistic. But I can't think of a better pattern, and these are rarely-used anyhow, so sgtm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding an argument to the constructor, but that didn't work with the alloc template, so did this instead to keep it simple.

@aardappel aardappel merged commit 3b4cb93 into WebAssembly:master Sep 18, 2020
tlively added a commit to tlively/binaryen that referenced this pull request Sep 19, 2020
Builders gained a `Module` field in WebAssembly#3130 because they now require extra context
to properly finalize some Expressions. Since modules contain allocators, the old
allocator field on the builder became redundant after that change. This PR
removes the redundant allocator field.
tlively added a commit that referenced this pull request Sep 22, 2020
Builders gained a `Module` field in #3130 because they now require extra context
to properly finalize some Expressions. Since modules contain allocators, the old
allocator field on the builder became redundant after that change. This PR
removes the redundant allocator field.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants