-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix for loading wasm files under Node.js and in browser when files we… #5368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for loading wasm files under Node.js and in browser when files we… #5368
Conversation
… wasm loading from NPM module (upstream PR: emscripten-core/emscripten#5368)
Some questions:
|
Here is the project I've mentioned: https://github.com/nazar-pc/supercop.wasm You can reproduce an issue by modifying |
This code now looks like it's going down the route that we wanted to avoid in PR #4942. That is, instead of having multiple locations in code that reference to |
I think having it in one place is a good idea overall, especially for some long ones like those needed in browser environment. I also think we should possibly implement common path in either PR (or even yet another one) and then rebase against that. Should I try to implement |
Can implement it in this PR, I think. Perhaps ask in the other PR if someone is working on it to be sure, but I don't think anyone is (last update there was over a month ago). |
Yeah, I got "distracted" from this area for the last month by other projects, so would be happy to see anyone implementing |
I think this can be called The ultimate fix. I've found that over time Emscripten collected a few options that were fixing this exact issue for different types of files, namely:
At least some of them (I've looked through some PRs that implemented those options) originated from the fact that people were moving files around and files stopped loading. In fact, almost always you want to load all of the files from the same directory where Now This have one major implication: I didn't deprecate Module documentation is also updated with new option (interestingly, not all of the mentioned options are mentioned there:)) This also makes #4942 unnecessary. |
Check this carefully, it might break some test with edge-cases, but shouldn't affect any real-world application. |
I think I like this, nice. Seems like a good solution to use My one concern is the reordering of priorities with that and the prefixes. One option might be to add a warning if they "collide". Another is not to allow setting both prefixes and For a change like this, we need input from @juj. He's on vacation right now, so this might need to wait a week or two. |
If they collide, more specific option wins, so
Again, this is still supported, in this case The only case when it might fail is if there is a test that deliberately defined both |
Using latest Emscripten version (with emscripten-core/emscripten#5368 on top) that leads to even smaller build size.
@juj, any objections here? |
ChangeLog.markdown
Outdated
@@ -14,6 +14,9 @@ Current trunk code | |||
- Emscripten: https://github.com/kripken/emscripten/compare/1.37.13...incoming | |||
- Emscripten-LLVM: https://github.com/kripken/emscripten-fastcomp/compare/1.37.13...incoming | |||
- Emscripten-Clang: https://github.com/kripken/emscripten-fastcomp-clang/compare/1.37.13...incoming | |||
- Added new html setting Module['scriptDirectory'] that allows customizing the URL where .wasm, .mem and some other files are located (defaults to the same location as .js file) | |||
- If Module['locateFile'] is not specified explicitly it will be defined automatically using Module['scriptDirectory'] | |||
- Module['locateFile'] priority comparing to other options decreased since it is now always defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The format that is used in the ChangeLog is to first have the changes, and then at the end of each changes to have the block
- To see a list of commits in the active development branch 'incoming', which have not yet been packaged in a release, see
- Emscripten: https://github.com/kripken/emscripten/compare/1.37.13...incoming
- Emscripten-LLVM: https://github.com/kripken/emscripten-fastcomp/compare/1.37.13...incoming
- Emscripten-Clang: https://github.com/kripken/emscripten-fastcomp-clang/compare/1.37.13...incoming
so these would go four lines up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Demoting
to set up gzip compression for all assets, and this new
I'm confused that you mention about deprecating these, but still changed to bump the priority of these to take over My understanding was that the consensus was that we don't ever want to load the |
Here is what was before: var memoryInitializer = '%s';
if (typeof Module['locateFile'] === 'function') {
memoryInitializer = Module['locateFile'](memoryInitializer);
} else if (Module['memoryInitializerPrefixURL']) {
memoryInitializer = Module['memoryInitializerPrefixURL'] + memoryInitializer;
} And here is what will be after proposed changes: var memoryInitializer = '%s';
if (Module['memoryInitializerPrefixURL']) {
memoryInitializer = Module['memoryInitializerPrefixURL'] + memoryInitializer;
} else {
memoryInitializer = Module['locateFile'](memoryInitializer);
} This is what I mean by changed priority. Since Yes, this is a breaking change for some edge cases. But it doesn't break the code you've instructed other developers with, unless they're also defining other options like |
For the edge cases, could we show an error if they occur? I don't think we expect |
Taking into account that with current implementation If there is no use case where, say, |
I think if we encounter any of those |
Sorry, I wasn't clear. I meant that if the user provides both |
Makes sense, will add corresponding checks. What about deprecating |
I don't think it's good to always require having In particular, the root problem here would still remain, that if user implements their own
This breaks in emunittest suite, where I implement
I don't think there is a need to deprecate to fix the issue. These are features that developers are using, and they are tested and there is no inherent problem with them, except than a couple of added bytes. I think we should focus to fix the root problem that relative URLs for these specific files should be treated relative to the script and not the CWD. I have posted PR #5484 as a suggestion to fix this. It changes the meaning of relative URLs from these built-in functions to be interpreted with respect to the main |
src/shell.js
Outdated
} else if (ENVIRONMENT_IS_WEB) { | ||
Module['scriptDirectory'] = document.currentScript.src.split('/').slice(0, -1).join('/') + '/'; | ||
} else if (ENVIRONMENT_IS_WORKER) { | ||
Module['scriptDirectory'] = self.location.href.split('/').slice(0, -1).join('/') + '/'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the web and in web workers, XHRs are already treated relative to the current location by default, we don't need to explicitly reinforce that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<!doctype html>
<script src="/xhr/test.js"></script>
/xhr/test.js:
var xhr = new XMLHttpRequest();
xhr.open('GET', 'test2', true);
xhr.send(null);
The call is made to http://test/test2
instead of http://test/xhr/test2
.
Also there is <base>
tag that have influence on relative paths. I think being explicit here is a good thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if someone uses <base>
, they explicitly want a different base path for any resources and that shouldn't get broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just clarified that the statement that was made is not true and there is a value in determining an absolute path here.
It is not required to be specified explicitly by user: https://github.com/kripken/emscripten/pull/5368/files#diff-8f35bc569987bab9a3691d986d90e75fR74
This is expected, user have to identify which environment is used and behave accordingly.
And why is this a problem? With proposed change most of the time user wouldn't need to touch this option in the first place. And if he/she ever does, I don't see any problem with specifying absolute path.
In modern applications there might be no single
This is exactly the same requirement as before. On server especially user should always infer absolute path (using
True, but they are not useful either once there is a universal |
I'd like to make this a separate comment to clarify about relative paths:
|
Added `ready()` method. New keywords added. Fresh build still uses emscripten-core/emscripten#5368
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, test_glgears_proxy_jstarget
works both in Firefox Stable and Nightly (at least here on Linux)
}; | ||
%(EXPORT_NAME)s = %(EXPORT_NAME)s.bind({ | ||
_currentScript: typeof document !== 'undefined' ? document.currentScript : undefined | ||
})%(instantiate)s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there is no corresponding test unfortunately. I did this after using earlier version of this PR in my projects. Not a perfect or bullet-proof solution, but it works more often than without it.
} else if (Module['memoryInitializerPrefixURL']) { | ||
memoryInitializer = Module['memoryInitializerPrefixURL'] + memoryInitializer; | ||
} | ||
memoryInitializer = Module['locateFile'] ? Module['locateFile'](memoryInitializer, '') : memoryInitializer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familiar with this part of the code base and it turns out that in tests Module
is not eventual module, but rather an object that real Module
will extend. Hence there is no Module._locateFile
at this point yet.
.. js:attribute:: Module.locateFile | ||
|
||
If set, this method will be called when the runtime needs to load a file, such as a ``.wasm`` WebAssembly file, ``.mem`` memory init file, or a file generated by the file packager. The function receives the URL, and should return the actual URL. This lets you host file packages or the ``.mem`` file etc. on a different location than the current directory (which is the default expectation), for example if you want to host them on a CDN. Note that ``locateFile`` is sort of a generalization of ``Module.filePackagePrefixURL``. | ||
If set, this method will be called when the runtime needs to load a file, such as a ``.wasm`` WebAssembly file, ``.mem`` memory init file, or a file generated by the file packager. The function receives the relative path to the file as configured in build process and a prefix (path to JavaScript file), and should return the actual URL. This lets you host file packages or the ``.mem`` file etc. on a different location than the directory of JavaScript file (which is the default expectation), for example if you want to host them on a CDN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there are some places where Module['locateFile']
is called, but Module
itself was not initialized yet. I'd like to see this re-designed this somehow, but not sure if it is possible.
I've added a note in latest commits about these cases to the degree I understand them.
src/shell.js
Outdated
#endif | ||
// `/` should be present at the end if `scriptDirectory` is not empty | ||
var scriptDirectory = ''; | ||
#if !SUPPORT_BASE64_EMBEDDING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't fully understand how it works when SUPPORT_BASE64_EMBEDDING
landed, but it seems like there were some guards added in relevant places, I'll try to remove this check and see if any tests fail.
src/shell.js
Outdated
#if !SUPPORT_BASE64_EMBEDDING | ||
#if ENVIRONMENT_MAY_BE_NODE | ||
if (ENVIRONMENT_IS_NODE) { | ||
scriptDirectory = __dirname + '/'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be, but I'd prefer to keep everything related to Module._locateFile
next to each other, this way it is easier to follow execution flow.
src/shell.js
Outdated
} | ||
#endif | ||
#endif // SUPPORT_BASE64_EMBEDDING | ||
Module._locateFile = function (path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can, applied this change in one of latest commits.
Now different test |
@nazar-pc testing under Electron in the renderer (web) process, I don't believe it thinks it's node.js, so it looks for the wasm file in the dir that holds the index.html file (and fails to find it). Any advice? Hoping to avoid having to set wasmBinary ... especially as I actually don't have any idea how I'd do that in a |
@paulshapiro, if you're building module just for Electron, then try |
@nazar-pc that's just the thing - our lib is consumed by people using all kinds of different platforms - web, Node.JS, React, etc. We've had a working emscripten build for ~4 yrs but it's getting modernized now. I was thinking this is the use-case for locateFile but my concern is that given there's a |
There was |
@nazar-pc I don't know if you'd want to include it in this PR, but the platform detection code in shell.js really needs an overhaul. Electron should either be it's own platform, or Node should be tested before the browser, or Node shouldn't do |
@nazar-pc @curiousdannii fwiw here's what I'd been using
|
About Electron, #6803 has a proposal for users replacing the "core JS" which is what detects and sets up the environment. That could provide a way to define a custom Electron core JS file. In the shorter term, if Electron is currently blocked, we could add an option to do something like the old |
@kripken understood, thanks. The reason I'm asking about it in this venue is that if the parent directory behavior for file location is contingent upon some set of environment detection functions which are internal to Module, then they're only being used indirectly to trigger certain expected loading functionality in emscripten. That loading behavior may not actually be well defined anymore given that Node can be run in so many contexts. Reproducing the same loading behavior would ostensibly be done by mapping the actual environment to the one which will have the desired loading behavior, but that causes complexity because the environment might not truly be only node. For that reason it leaves me to wonder if feature detection is not a more fitting method in the sense that a desired parent directory or locate file function could be passed in as desired, relieving emscripten of the burden to match environments. |
@paulshapiro, I think these kind of issues should be discussed separately. You can get current behavior of Emscripten after this PR by simply using If there are other issues to be fixed here, let me know and I'll be able to resolve them on weekend, otherwise I think PR is in pretty good shape. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will address those points on weekend
}; | ||
%(EXPORT_NAME)s = %(EXPORT_NAME)s.bind({ | ||
_currentScript: typeof document !== 'undefined' ? document.currentScript : undefined | ||
})%(instantiate)s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, my idea was to make it just work by default and it does. I'd like to avoid user to require doing anything special just to get this basic thing working, re-implementing locateFile
would be a big pain, especially since it would be necessary to get script's document.currentScript
from the outside.
I'm more inclined to write more tests here rather than dealing with it every time I use Emscripten.
src/shell.js
Outdated
#if !SUPPORT_BASE64_EMBEDDING | ||
#if ENVIRONMENT_MAY_BE_NODE | ||
if (ENVIRONMENT_IS_NODE) { | ||
scriptDirectory = __dirname + '/'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will refactor it.
…are moved somewhere and `-s MODULARIZE=1` is used at the same time
@kripken, addressed your last comments in 2 latest commits. |
Great, thanks @nazar-pc! Let's merge this in now. I have a few minor changes, mainly in docs, that I'll do in a followup PR to this one. I just had one question about the code (that can also wait for a followup PR, if we need one) - I see you handle a Thanks again for your patience in this PR - sorry it took this long to get merged. |
I believe there is already a test case for |
Hmm, that 'bind' call for setting _currentSource doesn't appear to do anything. With 1.38.10 with this patch, ogv.js's demuxer modules are broken because this._currentSource is null when checked. #6903 |
Ok, that was breaking because I was calling the module function with "new", which replaces the "this" objected provided by the "bind" with a new one with no properties on it. Fix is to follow current documentation and not use "new" with the module function (it's not a constructor, but an .... initializer?) |
I think this is related to #4542 and, probably, fixes it.
I have compiled a library with
emcc supercop.c #files -o supercop.js -O2 --closure 1 -s WASM=1
and got 2 files in the root of the project:supercop.js
andsupercop.wasm
, everything worked fine so far. However, when I published it to NPM effective path to WebAssembly file becomenode_modules/supercop.wasm/supercop.wasm
andsupercop.js
file was unable to load it.With this patch WebAssembly files that are located next to JavaScript files are loaded correctly both in Node.js and in browser.
I've run some tests and on the surface it looks fine, however, my 12-thread Core i7 is not enough to run all of the tests in any sane amount of time, so... Also I'm a bit terrified with number of tests being runned and do not quite understand how to add new tests. Hopefully someone can augment this PR with new tests.