Skip to content

Restructure web bindings #569

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

benlongo
Copy link

@benlongo benlongo commented Mar 6, 2020

This PR was initially motivated by #555.

Currently, the output of emscripten is wrapped with prefix.js and binding.js using --pre-js and --post-js code to create the module. This is nice because no extra dependencies are required. However this is not the intended use of these options (see bullet 2 emscripten-core/emscripten#6814), and the typescript declarations must be manually maintained. Additionally, not much control is provided on the output artifacts and the bindings must be written in javascript. Although it's a minor nitpick, since neither prefix.js or binding.js are syntactically valid js, it's awkward to add standard tooling such as linters.

In an attempt to remedy these issues, this PR makes the following changes:

  1. Utilize the EXPORT_ES6 emscripten option to create an es6 module.
  2. Combine prefix.js and binding.js into a single file that imports the generated emscripten module.
  3. Convert the combined bindings to TypeScript.
  4. rollup.js to build the final js file and automatically output the ts declarations.
  5. Add eslint and prettier to enforce formatting a code style

Converting the bindings to TypeScript and then generating the declarations uncovered several mistakes in the manually declared bindings, which is some evidence for the benefit of this approach (see also #498). However, this adds a significant number of deps and complicates the build process so it may not be worth it.

Using rollup also provides more flexibility in the output artifacts - it would be trivial to add separate minified output just for browsers and unminified output just for es6 modules.

Although all the tests are passing, I'm not positive this preserves all current behavior, hence the WIP. Also, a significant proportion of the diff is purely cosmetic changes resulting from prettier + eslint.

@maxbrunsfeld
Copy link
Contributor

maxbrunsfeld commented Mar 6, 2020

Thanks for calling out that issue about the intended use of --post-js and --pre-js. Using EXPORT_ES6 and MODULARIZE_INSTANCE seems like a very good change.

I'd prefer to deal with this separately from all of the the other changes that you've made on this branch though: the changes that deal with typescript, eslint, prettier, and all of code style. Would you be ok with backing those changes out of this branch? I'd love to merge a PR that just achieves the following:

  1. removing the use of --pre-js and --post-js
  2. combining binding.js and prefix.js into one source file
  3. Updates the build script to work with the new setup, as you have done

I'm still open to the typescript stuff, and potentially the prettier stuff in the future, but it's something I think we should tackle in a separate PR in order to reduce the risk.

@benlongo
Copy link
Author

benlongo commented Mar 6, 2020

I'd be happy to split up the PR into distinct stages - this one is definitely quite big and risky. Would you want the fixes to the existing typescript declarations also broken out into a separate PR?

@maxbrunsfeld
Copy link
Contributor

maxbrunsfeld commented Mar 6, 2020

Oh I didn't realize that you had fixed the declarations as well. My main concern is that this PR should just be about the emscripten build config.

For the typescript stuff, I'm not 100% sure yet.

Converting the bindings to TypeScript and then generating the declarations uncovered several mistakes in the manually declared bindings, which is some evidence for the benefit of this approach (see also #498)

However, this adds a significant number of deps and complicates the build process so it may not be worth it.

I think that these are both very good points, and well said. The build-time complexity was something I had been trying to avoid, but maybe it is worth it in order to make it easier to maintain the type declarations.

To me, it seems reasonable to keep the work that you've done on the TypeScript conversion, at least for now, rather than separating out the fixed declarations. After merging this PR (with just the emscripten changes), you could open a separate PR with the rest of your changes, and I'll feel a little more confident in evaluating the tradeoff of build complexity vs code maintainability.

Also, thanks a lot for exploring these issues; this is great. It makes sense that you wanted to explore multiple issues at a time, at least at first; hopefully it's not too onerous to separate it out after-the-fact.

@benlongo
Copy link
Author

benlongo commented Mar 6, 2020

To me, it seems reasonable to keep the work that you've done on the TypeScript conversion, at least for now, rather than separating out the fixed declarations. After merging this PR (with just the emscripten changes), you could open a separate PR with the rest of your changes, and I'll feel a little more confident in evaluating the tradeoff of build complexity vs code maintainability.

Agreed, I'll leave the TypeScript changes in a separate PR.

Also, thanks a lot for exploring these issues; this is great. It makes sense that you wanted to explore multiple issues at a time, at least at first; hopefully it's not too onerous to separate it out after-the-fact.

Yeah I went down many paths that didn't work at first and ended up here, but now that I have something that appears to work well it shouldn't be too much effort to go back and pull out the individual pieces.

@Menci
Copy link

Menci commented Mar 7, 2020

Good work. I think the API to provide WASM file location we wanted in #559 can be supported easily after this PR.

Emscripten can generate the glue code that exports a module with the newly added option -s MODULARIZE_INSTANCE=1. This behavior is just like the original behavior of web-tree-sitter module -- the glue code starts trying to load the WASM file right after the glue code's file is loaded. So we don't have a good way to let the user pass a Module.locateFile function to the glue code.

If we change the option to -s MODULARIZE=1, the generated glue code will export a factory function of the module, instead of the module instance. This function accepts an object paramater to receive options for the module instance and then start finding the WASM file. Therefore, we can call it in Parser.init(wasmPath) and pass a { locateFile: () => wasmPath } to it.

Not too much code will be changed -- just move the construction of the global promise to the Parser.init function, and assign the result of the module factory function to the global variable C.

@benlongo
Copy link
Author

benlongo commented Mar 7, 2020

This function accepts an object paramater to receive options for the module instance and then start finding the WASM file. Therefore, we can call it in Parser.init(wasmPath) and pass a { locateFile: () => wasmPath } to it.

Oh perfect, didn't know that was an option for that. I can integrate that into one of the PR stages.

@benlongo
Copy link
Author

benlongo commented Mar 7, 2020

Now that #570 is up, I'm gonna close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants