Skip to content

4.0.0 regression: Module.postRun callbacks not being called #23420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sgbeal opened this issue Jan 15, 2025 · 13 comments
Closed

4.0.0 regression: Module.postRun callbacks not being called #23420

sgbeal opened this issue Jan 15, 2025 · 13 comments

Comments

@sgbeal
Copy link

sgbeal commented Jan 15, 2025

While attempting to get SQLite building with Emscripten 4.0.0 we've discovered that our --extern-post-js code is failing because initialization which has formerly been done via Module.postRun is no longer being run (or is perhaps being run later than it historically did and we can't get to that point because our --extern-post-js requires the prior ordering).

This is most easily demonstrated with some console debug output comparing 3.1.74 (the last 3.1 release) and 4.0.0:

image

The relevant part is the 3rd yellow line on the left. That line, from the start of a Module.postRun handler, is never hit in 4.0.0.

i've not found any relevant change mentioned in recent change logs.

Adding this silly workaround to our extern-post-js code, from the then() handler of calling the EXPORT_NAME-defined module init function, works around the problem:

      if( EmscriptenModule.postRun && EmscriptenModule.postRun.length ){
        /* Emscripten 4.0.0 does not run postRun(). */
        console.warn("Emscripten did not run postRun: running them now!");
        EmscriptenModule.postRun.shift()(EmscriptenModule);
      }

Which makes it now behave like 3.1.74 except for the later timing of postRun():

image

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2025

I believe this is known consequence of our transition to usage of async/await. See #23157 and the corresponding ChangeLog entry:

emscripten/ChangeLog.md

Lines 69 to 74 in 9aa4df6

- The factory function exposed in `-sMODULARIZE` mode is now marked as `async`
when `WASM_ASYNC_COMPILATION` is enabled (the default). This allows us to use
`await` during module creation. One side effect of this is that code in
`--post-js` files will now be delayed until after module creation and after
`main` runs. This matches the existing behaviour when using sync instantation
(`-sWASM_ASYNC_COMPILATION=0`) but is an observable difference. (#23157)

Basically, because we can now use async/await inside the module constructor factor function we can now await the internal createWasm function then allows the run() function to actually run stuff rather than deferring it.

The --post-js code has always appeared in the code after the run() function but previously the run() was being deferred (at least for users of async instantiation). Users who use synchronous instantiation have always had this behaviour. The difference here is that async/await allows us to do the same for async instantiation removing the observable difference between sync and async instantiation in this case.

Having said all that my understanding is that this applies to --post-js code and the --extern-post-js code is not effected by this because it runs outside of the module factory function.

Are you using --post-js at all? How/where are you adding your postRun?

@sgbeal
Copy link
Author

sgbeal commented Jan 15, 2025

I believe this is known consequence of our transition to usage of async/await.

That looks to fit the symptoms precisely. This breaks the late-stage init timing for us, but it's easy to work around now that we know about it.

Are you using --post-js at all? How/where are you adding your postRun?

We rely heavily on both --post-js and --extern-post-js. Our post-js code is in fact what adds the postRun:

$ cat post-js-header.js 
<docs snipped>
if(!Module.postRun) Module.postRun = [];
Module.postRun.push(function(Module/*the Emscripten-style module object*/){
  'use strict';
<EOF>

Our build process emits that, followed by a dozen-odd files, then the following block:

/* The current function scope was opened via post-js-header.js, which
   gets prepended to this at build-time. This file closes that
   scope. */
})/*postRun.push(...)*/;

That bundle then gets passed to --post-run.

Thank you for explaining the issue.

@sgbeal sgbeal closed this as completed Jan 15, 2025
@sgbeal
Copy link
Author

sgbeal commented Jan 15, 2025

Having said all that my understanding is that this applies to --post-js code and the --extern-post-js code is not effected by this because it runs outside of the module factory function.

BTW: that's likely why we're seeing this. We have to, from extern-post, replace the module-provided init function with our own (see #18071) and call the original init func ourself.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2025

Would it be possible to switch to --pre-js where registering postRunis always going to work?

@sgbeal
Copy link
Author

sgbeal commented Jan 15, 2025

Would it be possible to switch to --pre-js where registering postRun is always going to work?

Good question. Next question? ;)

IIRC (which i won't guaranty - it's been 2 years since i went down this rabbit hole), #18071 is the reason we cannot(?) do what we need to do in --pre-js. In short: we need to store some global state which is relevant for when the module init function (defined via EXPORT_NAME) is called by the client, but we cannot access the module init function from --pre-js because the EXPORT_NAME function is anonymous at that point:

var sqlite3InitModule = (() => {
  var _scriptName = typeof document != 'undefined' ? document.currentScript?.src : undefined;

To do what we want to do, we have to have access to the sqlite3InitModule object but in --pre-js it is still in the process of being assigned to, so is not available to us, and it has no name of its own, so we we cannot access it by name:

var xyz = (()=>{xyz.a = 1})()  // --> Uncaught TypeError: xyz is undefined

If that function had a well-defined name (as #18071 requests), we could (IIRC) move our extern-JS code into --pre-js, but it doesn't, so we have nowhere to attach our required state to. So we delay that part of our init until --extern-post-js, where we can see sqlite3InitModule.

That said: i'm going from memories 2+ years old and may well be misremembering.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2025

While the code in the pre-js file is included before the module is fully constructed the postRun function themselves do have full access to the module via its argument. e.g:

// This code is included before the module is full constructed.
Module.postRun.push((MyModule) => {
   // This code is run with the full-constructed module as its first argument
});

Not sure if this helps in your case.

@sgbeal
Copy link
Author

sgbeal commented Jan 15, 2025

// This code is included before the module is full constructed.

The Module object is, but we need to completely replace the function specified by EXPORT_NAME (see below). AFAIK, extern-js is the only place we have access to the function named by EXPORT_NAME. The --post-js code is run inside of the being-initialized EXPORT_NAME function, so does not have access to that function (because of #18071, hint, hint ;).

The reason we want the EXPORT_NAME object is so that we can replace it so that clients cannot see the Emscripten module object: the public API has no Emscripten visibility and we want the init function to work identically for hypthetical non-Emscripten frameworks. We achieve that by replacing the EXPORT_NAME function to call the Emscripten-defined EXPORT_NAME and then return the sqlite3 API object instead of the Emscripten module.

Relevant code from our --extern-js, noting that (A) sqlite3InitModule is the name passed to EXPORT_NAME and (B) the //#if and //#else parts are used by a custom code preprocessor:

  /**
     In order to hide the sqlite3InitModule()'s resulting
     Emscripten module from downstream clients (and simplify our
     documentation by being able to elide those details), we hide that
     function and expose a hand-written sqlite3InitModule() to return
     the sqlite3 object (most of the time).
  */
  const originalInit = sqlite3InitModule;
  if(!originalInit){
    throw new Error("Expecting globalThis.sqlite3InitModule to be defined by the Emscripten build.");
  }
  /**
     We need to add some state which our custom Module.locateFile()
     can see, but an Emscripten limitation currently prevents us from
     attaching it to the sqlite3InitModule function object:

     https://github.com/emscripten-core/emscripten/issues/18071

     The only(?) current workaround is to temporarily stash this state
     into the global scope and delete it when sqlite3InitModule()
     is called.
  */
  const initModuleState = globalThis.sqlite3InitModuleState = Object.assign(Object.create(null),{
   <...big snip...> });

  globalThis.sqlite3InitModule = function ff(...args){
    return originalInit(...args).then((EmscriptenModule)=>{
      if( EmscriptenModule.postRun && EmscriptenModule.postRun.length ){
        /* Emscripten 4.0.0 changes the order in which our Module.postRun handler
           runs. In 3.x postRun would have run by now, and our code relies
           heavily on that order, so we'll work around that difference here.

           https://github.com/emscripten-core/emscripten/issues/23420 */
        EmscriptenModule.postRun.shift()(EmscriptenModule);
      }
//#if wasmfs
      if('undefined'!==typeof WorkerGlobalScope &&
         EmscriptenModule['ENVIRONMENT_IS_PTHREAD']){
        /** Workaround for wasmfs-generated worker, which calls this
            routine from each individual thread and requires that its
            argument be returned. The conditional criteria above are
            fragile, based solely on inspection of the offending code,
            not public Emscripten details. */
        //console.warn("sqlite3InitModule() returning E-module.",EmscriptenModule);
        return EmscriptenModule;
      }
//#endif
      const s = EmscriptenModule.sqlite3;
      s.scriptInfo = initModuleState;
      //console.warn("sqlite3.scriptInfo =",s.scriptInfo);
      if(ff.__isUnderTest) s.__isUnderTest = true;
      const f = s.asyncPostInit;
      delete s.asyncPostInit;
      return f();
    }).catch((e)=>{
      console.error("Exception loading sqlite3 module:",e);
      throw e;
    });
  };
  globalThis.sqlite3InitModule.ready = originalInit.ready;

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2025

Indeed if you want somehow replace the EXPORT_NAME function (the module factory function) then --extern-post-js would indeed be the way to go.

But I though we were talking about --post-js and the use of Module.postRun within that --post-js code? Is that issue now solved?

@sgbeal
Copy link
Author

sgbeal commented Jan 15, 2025

Is that issue now solved?

It's worked around sufficiently, IMO: if our replacement init function sees that there are still postRun items in the queue, it runs them and removes them from the queue so Emscripten doesn't run them again. We only have one postRun entry, but it contains and initializes the whole library.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2025

Add to Module.postRun after the postRun function has run seems like a strange thing to do. See

emscripten/src/preamble.js

Lines 267 to 282 in 4c14f1f

function postRun() {
#if STACK_OVERFLOW_CHECK
checkStackCookie();
#endif
#if PTHREADS
if (ENVIRONMENT_IS_PTHREAD) return; // PThreads reuse the runtime from the main thread.
#endif
#if expectToReceiveOnModule('postRun')
if (Module['postRun']) {
if (typeof Module['postRun'] == 'function') Module['postRun'] = [Module['postRun']];
while (Module['postRun'].length) {
addOnPostRun(Module['postRun'].shift());
}
}
#endif
. I think that is what you are doing if you are adding to Module.postRun in your --post-js file.

A better solution here might just be execute the code directly since #23157 ensures (in most cases) that postRun as already been run by the time --post-js code runs. If you want to be extra carefull you could handle both cases:

if (runtimeInitialized) {
   // The module has already been initialized and postRun's have been processed already.  Just callFoo directly
   callFoo();
} else {
   // The module has not yet initialized and postRun has not yet been called
   Module.postRun.push(callFoo);
}

Or maybe even simpler just do:

// With modern version of emscripten we can expect the module to be initialized here.
assert(runtimeInitialized);
callFoo();

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2025

One reason I'm suggesting you avoid using Module.postRun after postRun is that I was thinking of making that into an error, so that other users don't run into this issue as you have.

@sgbeal
Copy link
Author

sgbeal commented Jan 16, 2025

Add to Module.postRun after the postRun function has run seems like a strange thing to do. See

i wasn't aware that postRun() had already run at that point. It's always "just worked" for us (until 4.0.0). Now that you mention it, though... we can probably remove the usage of Module.postRun altogether and defer that bootstrapping code until our overridden EXPORT_NAME is run, since we have to do that for 4.0 anyway.

assert(runtimeInitialized);

We can only do that from post-js, not extern-post-js, and we can't access the EXPORT_NAME symbol from post-js (#18071 again!), so we have to defer the final init of our module until extern-post, where we can override the EXPORT_NAME function. Even so... we use Module.postRun because it was a suitable fit at the time and didn't cause us any grief, but it now might make sense to move that into a function which we call from our overridden EXPORT_NAME (in export-post-js).

i'll need to tinker with that and figure out the ramifications.

One reason I'm suggesting you avoid using Module.postRun after postRun is that I was thinking of making that into an error, so that other users don't run into this issue as you have.

Since the Module.postRun array is consumed as it's traversed, running postRun() multiple times seems pretty harmless.

If we move our main lib-level init call into our overridden EXPORT_NAME(), that should give us the same effect as manually triggering postRun(), so i'll look into making that change. That would, in any case, eliminate yet one more dependency on the Module object, which would please me :) (no, it wouldn't - it would just move it around a bit).

@sbc100
Copy link
Collaborator

sbc100 commented Jan 16, 2025

One reason I'm suggesting you avoid using Module.postRun after postRun is that I was thinking of making that into an error, so that other users don't run into this issue as you have.

Since the Module.postRun array is consumed as it's traversed, running postRun() multiple times seems pretty harmless.

The problem I'd like to solve is folks adding to Module.postRun after runtime has been intialized. In that case the postRun that they add is silently ignored. I believe that is the original issue in the bug, and I'm hoping we can find a way to assert in that case.

sbc100 added a commit to sbc100/emscripten that referenced this issue Feb 8, 2025
We had some bug reports recently when we switched to using async/await
in MODULUARIZE mode.  One of the side effects was the `--post-js` code
then runs after module instantiation and startup, which means assignment
of properties like `onRuntimeInitialized` is too late if it happens in
`--post-js`.

This was always the case for sync instantiation too.

We now detect this too-late-assignment at abort with a useful error.

See emscripten-core#23626 and emscripten-core#23420
sbc100 added a commit to sbc100/emscripten that referenced this issue Feb 8, 2025
We had some bug reports recently when we switched to using async/await
in MODULUARIZE mode.  One of the side effects was the `--post-js` code
then runs after module instantiation and startup, which means assignment
of properties like `onRuntimeInitialized` is too late if it happens in
`--post-js`.

This was always the case for sync instantiation too.

We now detect this too-late-assignment at abort with a useful error.

See emscripten-core#23626 and emscripten-core#23420
sbc100 added a commit to sbc100/emscripten that referenced this issue Feb 8, 2025
We had some bug reports recently when we switched to using async/await
in MODULUARIZE mode.  One of the side effects was the `--post-js` code
then runs after module instantiation and startup, which means assignment
of properties like `onRuntimeInitialized` is too late if it happens in
`--post-js`.

This was always the case for sync instantiation too.

We now detect this too-late-assignment at abort with a useful error.

See emscripten-core#23626 and emscripten-core#23420
sbc100 added a commit that referenced this issue Feb 11, 2025
We had some bug reports recently when we switched to using async/await
in `MODULUARIZE` mode.  One of the side effects was the `--post-js` code
then runs after module instantiation and startup, which means assignment
of properties like `onRuntimeInitialized` is too late if it happens in
`--post-js`.

This was always the case for sync instantiation mode.

When assertions are enabled, we now detect this too-late-assignment and
abort
with a useful error.

Code size regression in this PR, as expected, only for debug builds.

See #23626 and #23420
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants