Support types like strings in closures #104

alexcrichton · 2018-04-05T20:17:10Z

The first support for closures is being added in https://github.com/alexcrichton/wasm-bindgen/pull/101, but it's very bare-bones. Currently it doesn't allow working with arguments/return values that are any more interesting than the four fundamental wasm types themselves (and even then u64 won't work).

Let's take a look at an example:

#[wasm_bindgen]
extern {
    fn foo(a: &Fn(&str));
}

Here we're calling a JS function, foo, which takes a closure. This closure will have a stack lifetime (not persistent beyond this one function call). The closure also receives one argument, a string.

First up we need to generate a bunch of shims. The ones in Rust need to look like:

fn foo(a: &Fn(&str)) {
    extern fn shim(data: *const u8, a: u32) {
        let a = extract_string(a); // a bit more fancy than this, but mostly the same
        (*(data as *const &Fn(&str)))(a)
    }

   extern {
        fn foo_import(data: *const u8, shim: u32);
    }
    unsafe {
        foo_import(&a as *const &Fn(&str) as *const u8, shim as u32)
    }
}

This part is relatively straightforward and easy to do today with the construction of wasm-bindgen, heavily relying on the trait system. In other words all this functionality can be wrapped up in trait implementations that are easier to read/write/document than generated code.

The problem here comes when we want to generate the JS shims, and unfortunately comes back to the trait system. Somehow we need to instruct JS to generate a shim that takes the arguments passed here and generate a custom JS wrapper which wraps the rust function pointer passed to JS here.

Currently the way wasm-bindgen works is that it emits a JSON AST effectively for the wasm-bindgen CLI tool to later come, parse, and generate bindings for. This JSON AST is constructed at compile time by the #[wasm_bindgen] attribute. This is done by making a giant const and slapping #[wasm_custom_section] on that value.

The way this all works is a bit janky though and ends up preventing the closure case from working well. An import like above will have a list of arguments associated with it. We rely on the trait system to use an associated type projection to figure out what the actual argument is here. In other words we're building this giant JSON literal, and somewhere in the middle of it we're splicing in some content from a projection of a particular type (aka <&Fn(&str) as ToRefWasmBoundary>::DESCRIPTOR).

Unfortunately the "splicing operation" is currently fixed size, and is currently just 4 bytes. That means we have 4 bytes of space to encode everything about this function signature (aka a closure that takes a string as an argument). These 4 bytes also need to be valid utf8 when emitted (as the whole thing is interpreted as utf8 JSON later on). The number 4 here can change but it currently must be fixed.

So overall the problem is how, in the trait impl of ToRefWasmBoundary for Fn(A), do we manufacture a DESCRIPTOR? Concatenation isn't easy when we're concatenating projectsions of sub-types (aka A) as well. This is where I'm at a bit of a loss...

I'm not really sure how to proceed here, but help is always welcome!

The text was updated successfully, but these errors were encountered:

alexcrichton · 2018-04-05T20:24:49Z

One semi-crazy idea I've had is that we could generate programs instead of JSON descriptors. For example we could generate something which, when called, would yield a description of the export here.

I think we'd have a lot more flexibility in what we do (aka full runtime computation rather than being limited to simply what we can do with #[wasm_custom_section]). The crazy part is that we'd run these programs when we run wasm-bindgen. Afterwards we'd use wasm-gc to automatically strip the programs from the end binary.

The more I think on it the more I conclude that this may be the only real approach, but hoo boy would that be a heavyweight implementation of this.

This commit is a complete overhaul of how the `#[wasm_bindgen]` macro communicates type information to the CLI tool, and it's done in a somewhat... unconventional fashion. Today we've got a problem where the generated JS needs to understand the types of each function exported or imported. This understanding is what enables it to generate the appropriate JS wrappers and such. We want to, however, be quite flexible and extensible in types that are supported across the boundary, which means that internally we rely on the trait system to resolve what's what. Communicating the type information historically was done by creating a four byte "descriptor" and using associated type projections to communicate that to the CLI tool. Unfortunately four bytes isn't a lot of space to cram information like arguments to a generic function, tuple types, etc. In general this just wasn't flexible enough and the way custom references were treated was also already a bit of a hack. This commit takes a radical step of creating a **descriptor function** for each function imported/exported. The really crazy part is that the `wasm-bindgen` CLI tool now embeds a wasm interpreter and executes these functions when the CLI tool is invoked. By allowing arbitrary functions to get executed it's now *much* easier to inform `wasm-bindgen` about complicated structures of types. Rest assured though that all these descriptor functions are automatically unexported and gc'd away, so this should not have any impact on binary sizes A new internal trait, `WasmDescribe`, is added to represent a description of all types, sort of like a serialization of the structure of a type that `wasm-bindgen` can understand. This works by calling a special exported function with a `u32` value a bunch of times. This means that when we run a descriptor we effectively get a `Vec<u32>` in the `wasm-bindgen` CLI tool. This list of integers can then be parsed into a rich `enum` for the JS generation to work with. This commit currently only retains feature parity with the previous implementation. I hope to soon solve issues like #123, #104, and #111 with this support.

This commit adds support for closures with arguments like strings and such. In other words, closures passed to JS can now have the same suite of arguments as all functions that can be exported from Rust, as one might expect! At this time due to the way trait objects work closures still cannot use types with references like `&str`, but bare values like `String` or `ImportedType` should work just fine. Closes #104

alexcrichton added help wanted We could use some help fixing this issue! closures Adding more support for closures on the boundary labels Apr 5, 2018

alexcrichton mentioned this issue Apr 5, 2018

Initial support for closures #101

Merged

alexcrichton mentioned this issue Apr 12, 2018

Allow returning Vec<T> #111

Closed

alexcrichton mentioned this issue Apr 13, 2018

Overhaul how type information gets to the CLI #124

Merged

alexcrichton mentioned this issue Apr 14, 2018

Overhaul the conversion traits #129

Closed

alexcrichton mentioned this issue Apr 15, 2018

Support closures with "rich" arguments #132

Merged

alexcrichton closed this as completed in #132 Apr 16, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support types like strings in closures #104

Support types like strings in closures #104

alexcrichton commented Apr 5, 2018

alexcrichton commented Apr 5, 2018

Support types like strings in closures #104

Support types like strings in closures #104

Comments

alexcrichton commented Apr 5, 2018

alexcrichton commented Apr 5, 2018