Support multiple workers for NODEFS /wordpress mounts #2231

brandonpayton · 2025-06-04T03:58:59Z

Motivation for the change, related issues

We want to support concurrent php-wasm workers, but concurrent workers can corrupt the SQLite DB without file locking.

Emscripten's libc provides dummy fcntl() and flock() lock implementations, so the API calls succeed without any locking taking place.

The PR implements:

Advisory file locking with custom fcntl() and flock() functions.
- Currently this is JSPI-only, but we plan to implement broader support in a follow-up PR.
An experimental multiple php-wasm worker feature that can be enabled when a real, shared directory is mounted as the /wordpress dir.

Implementation details

This is a big PR that:

Adds experimental support for multiple php-wasm workers
- Adds an --experimentalMultiWorker arg
- Can be used when JSPI is enabled and a real FS dir is mounted as the /wordpress directory.
- Default worker count is CPU_COUNT - 1.
- Specific number can be passed like --experimentalMultiWorker=7.
- Requests are naively routed to the worker with the fewest in-progress requests.
- Emscripten mods:
  - We override NODEFS.createNode() to add an isSharedFS flag to all NODEFS nodes. This way we can tell whether file-locking is needed and possible for an FS node, even if wrapped with PROXYFS.
  - We override FS.hashAddNode() to skip caching of isSharedFS nodes. Otherwise, multiple workers will have separate NODEFS caches and can have different, conflicting views of the underlying FS. For example, one may believe a file exists based on its cache, even though the underlying file was deleted by another worker.
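The worker-count default and the "fewest in-progress requests" routing described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `resolveWorkerCount`, `WorkerSlot`, and `LeastBusyRouter` are hypothetical names, and the flag is modeled as `true` for a bare `--experimentalMultiWorker` or a number for `--experimentalMultiWorker=7`.

```typescript
// Hypothetical helper: turn the flag value into a worker count.
// `true` models a bare flag, a number models an explicit count,
// and `false` models the flag being absent (single worker).
function resolveWorkerCount(flag: boolean | number, cpuCount: number): number {
  if (flag === false) return 1;
  if (typeof flag === "number") return Math.max(1, Math.floor(flag));
  // Default described in the PR: CPU_COUNT - 1 (never below one worker).
  return Math.max(1, cpuCount - 1);
}

interface WorkerSlot {
  id: number;
  inProgress: number;
}

class LeastBusyRouter {
  private readonly slots: WorkerSlot[];

  constructor(workerCount: number) {
    if (workerCount < 1) throw new Error("At least one worker is required");
    this.slots = Array.from({ length: workerCount }, (_, id) => ({ id, inProgress: 0 }));
  }

  // Naive selection: the worker with the fewest in-progress requests.
  // Ties go to the worker that appears first.
  pick(): WorkerSlot {
    return this.slots.reduce((best, slot) =>
      slot.inProgress < best.inProgress ? slot : best
    );
  }

  // Track the request for its whole lifetime so the counters stay accurate
  // even when the handler throws.
  async dispatch<T>(handler: (workerId: number) => Promise<T>): Promise<T> {
    const slot = this.pick();
    slot.inProgress++;
    try {
      return await handler(slot.id);
    } finally {
      slot.inProgress--;
    }
  }
}
```

A caller would wrap each incoming request in `router.dispatch((workerId) => ...)`. The approach ignores request cost, which is why the description calls it naive.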
Supports both whole-file locks via flock() and byte-range locks via fcntl().
- Our libsqlite3 build uses fcntl() for locking.
- WordPress itself uses the PHP flock() function a small amount (and that function unsurprisingly appears to connect to the platform flock() implementation)
- Before granting a lock to php-wasm, we make sure we hold a native file lock (with the host OS or platform) that is sufficient for the requested lock. We require an exclusive native lock in order to grant any exclusive php-wasm locks. If there are only shared php-wasm locks, we require a native shared lock.
- If we cannot obtain a comparable native lock, we deny the php-wasm lock.
- When a lock is released, we review the remaining php-wasm locks and downgrade the native file lock if its current level is no longer needed. For example, if we hold an exclusive whole-file lock at the OS level but only shared php-wasm locks remain, we downgrade the OS lock to shared.
- When a file descriptor is closed, all its locks are released.
- When a PHP request exits, all its locks are released.
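The relationship between php-wasm locks and the native lock can be sketched as below. This is a simplified model under stated assumptions, not the PR's `FileLockManager`: it covers whole-file locks only (no byte ranges), `FileLockState` and `requiredNativeLevel` are hypothetical names, and `tryNativeLock` stands in for whatever call takes or changes the real OS lock.

```typescript
type PhpLockType = "shared" | "exclusive";
type NativeLevel = "unlocked" | "shared" | "exclusive";

interface HeldLock {
  ownerId: number;
  type: PhpLockType;
}

// The native lock level needed to back a set of php-wasm locks:
// any exclusive lock requires an exclusive native lock,
// shared-only locks require a shared native lock.
function requiredNativeLevel(locks: HeldLock[]): NativeLevel {
  if (locks.some((lock) => lock.type === "exclusive")) return "exclusive";
  return locks.length > 0 ? "shared" : "unlocked";
}

class FileLockState {
  nativeLevel: NativeLevel = "unlocked";
  private locks: HeldLock[] = [];
  // Assumption: returns true when the OS lock was moved to `level`.
  private readonly tryNativeLock: (level: NativeLevel) => boolean;

  constructor(tryNativeLock: (level: NativeLevel) => boolean) {
    this.tryNativeLock = tryNativeLock;
  }

  request(ownerId: number, type: PhpLockType): boolean {
    // An exclusive lock conflicts with any lock held by another owner.
    const conflict = this.locks.some(
      (lock) =>
        lock.ownerId !== ownerId &&
        (type === "exclusive" || lock.type === "exclusive")
    );
    if (conflict) return false;

    const candidate = [
      ...this.locks.filter((lock) => lock.ownerId !== ownerId),
      { ownerId, type },
    ];
    const needed = requiredNativeLevel(candidate);
    // Deny the php-wasm lock if a comparable native lock cannot be obtained.
    if (needed !== this.nativeLevel && !this.tryNativeLock(needed)) return false;

    this.locks = candidate;
    this.nativeLevel = needed;
    return true;
  }

  release(ownerId: number): void {
    this.locks = this.locks.filter((lock) => lock.ownerId !== ownerId);
    const needed = requiredNativeLevel(this.locks);
    if (needed !== this.nativeLevel) {
      // Downgrade (or drop) the native lock once its level is no longer needed.
      // This sketch assumes downgrades always succeed.
      this.tryNativeLock(needed);
      this.nativeLevel = needed;
    }
  }
}
```

In this model, releasing on file-descriptor close or request exit would amount to calling `release()` for every owner tied to that descriptor or request.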
Implements a getpid() override that actually returns the process ID we assign via the @php-wasm/node loader. (Emscripten's getpid() always returns 42 😅)
Adds an --experimentalTrace arg that enables detailed tracing messages. Currently, these messages are just for logging, but as long as there are no issues with the trace facility, we could add others.
- In JS, the trace function in Emscripten JS library is a printf style function, js_wasm_trace(format, ...args). The purpose of using the printf style is that no formatting has to take place when js_wasm_trace() is called unless tracing is really enabled.
- In C, the trace function in php_wasm.c is wasm_trace(). It relays its messages to js_wasm_trace().
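The benefit of the printf style is that the format string and arguments are passed through untouched, and formatting work happens only when tracing is on. A minimal sketch of that idea follows. It is not the real `js_wasm_trace()`: `sketchTrace`, `setTraceEnabled`, `formatTrace`, and the in-memory `traceOutput` sink are hypothetical, and only `%s`, `%d`, and `%%` are handled.

```typescript
let traceEnabled = false;
const traceOutput: string[] = [];

function setTraceEnabled(enabled: boolean): void {
  traceEnabled = enabled;
}

// Tiny printf subset: %s (string), %d (integer), %% (literal percent sign).
function formatTrace(format: string, args: unknown[]): string {
  let next = 0;
  return format.replace(/%[sd%]/g, (spec) => {
    if (spec === "%%") return "%";
    if (next >= args.length) return spec; // leave unmatched specifiers as-is
    const arg = args[next++];
    return spec === "%d" ? String(Math.trunc(Number(arg))) : String(arg);
  });
}

function sketchTrace(format: string, ...args: unknown[]): void {
  // Early return: no string formatting happens unless tracing is enabled.
  if (!traceEnabled) return;
  traceOutput.push(formatTrace(format, args));
}
```

With tracing disabled, a call such as `sketchTrace("fcntl(fd=%d, cmd=%s)", fd, cmd)` costs little more than the function call itself.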

Testing Instructions (or ideally a Blueprint)

CI
Manual test
- Try all major CLI commands. Bonus: Convert to automated test.
- TODO: instructions to run in multi-worker mode

Adjusts a slight mistake in the condition:

```ts
// This returns true if activeSite is undefined since
// undefined is different than "none"
activeSite?.metadata.storage !== 'none'
```

The goal is to only return `true` from selectSitesLoaded() if we have an active, non-temporary site.

…sions (#83)

## Motivation for the change, related issues

Our changelog workflow is currently broken because it is missing extra GitHub token secrets. If we add these secrets, we will have to update them occasionally. I think we may be able to do without and would like to try that.

In addition, we need to backfill changelog entries while omitting links to PRs (since we cannot link to our private PRs).

## Implementation details

This PR:

- Updates our changelog update workflow to persist credentials in the local git config and attempts to follow the example of pushing a commit after actions/checkout: https://github.com/actions/checkout?tab=readme-ov-file#push-a-commit-using-the-built-in-token
- Omits PR links from new changelog entries unless they point to github.com/WordPress/wordpress-playground
- Backfills changelog entries for v1.0.25 to v1.0.29.

## Testing Instructions (or ideally a Blueprint)

- Temporarily disable protections on running the workflow
- Make the workflow target the PR branch
- Manually run the workflow and see if it commits a changelog update to the PR branch
- If successful, remove the changelog commit from the PR branch
- Re-enable protections
- Re-enable checking out trunk only
- Merge

## Motivation for the change, related issues

Testing Playground CLI with bun (via `npx nx dev playground-cli`) is much faster than building and running via node (via `npx nx start playground-cli`). This PR adds the option to run `@php-wasm/cli` the same way.

## Implementation details

This PR adds a `dev` target to the php-wasm-cli project. The `dev` target runs `@php-wasm/cli` using `bun --watch`.

## Testing Instructions (or ideally a Blueprint)

- CI
- Manually try the new target with `npx nx dev php-wasm-cli "-r 'echo \"huzzah\n\";'"`
  - Note: For some reason, extra args need to be quoted because they are not escaped when forwarded, at least with our current version of nx and nx:run-commands.


packages/playground/test-built-npm-packages/commonjs-and-jest/tests/wp.spec.ts

brandonpayton · 2025-06-21T00:55:59Z

Would it be at all possible to port it as a C library or so? Or is fs-ext our only option? Either way, with typescript loader for node we're good.

I think the os-lock package will do fine. It doesn't crash Bun and does real OS locking on *nix and Windows. Its API is promise-based, so I'll need to adjust the declared FileLockManager interface to make its methods async. Should be no problem.

The os-lock package seemed to work, but I started seeing failures and stuttering behavior after switching to it. I also discovered that it resolved the Bun crash but then exposed another issue:
Bun's magic doesn't include resolving Vite ?worker&url imports, and those imports are the only way I've been able to keep Vite from inlining the worker script as a data URI starting with data:video/mp2t; (the .ts file extension is also associated with being an "MPEG transport stream" 🤷‍♂️ ).

So I just punted and switched test-built-npm-packages from Bun to Node. It looks like Playground CLI will not be Bun-friendly, at least for the near future (cc @bgrgicak).

Because no changes were required, the lock manager API remains synchronous (not promise-based) for now.

If we want folks to be able to run production builds of Playground CLI with Bun, it probably wouldn't be too hard to roll our own addon if we can't find a suitable fs-ext alternative that runs on Bun. The addon is a simple lock/unlock passthrough to platform APIs.

brandonpayton · 2025-06-21T06:28:15Z

I need to step away for the day but plan to resume in the morning. Will see what kind of compromises we can make to keep the big merges moving.

brandonpayton · 2025-06-21T19:54:00Z

I'm making some progress with the test-built-npm-packages tests.

I've been able to run the CommonJS tests with the CLI server cleaning up without hanging, but the ES module tests appear to conflict with vitest and tinypool. IIUC, when a Playground worker is terminated, tinypool detects this as an unexpected exit.

To avoid this issue and complexity, I'm working on just using the Node.js test framework which should be simpler and contain fewer surprises. It is working for a single PHP version but crashes when trying multiple PHP versions.

Maybe testing built packages with ES modules for a single PHP version is good enough to merge this PR, especially since the CommonJS tests are testing all supported PHP versions. We could continue debugging this afterward.

Will push my changes after a bit more troubleshooting.

brandonpayton · 2025-06-22T00:19:39Z

The built npm package tests are passing because I switched the ES module tests to a manual test runner that runs one test per process. Without that, the second invocation of runCLI() crashes the test process in both Vitest (regardless of configured pool type) and the builtin Node test runner.

Using a manual test runner with a one-test-per-process approach seems a bit silly to me. But it works around the issue of Workers conflicting with the test runners.
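For illustration, a one-test-per-process runner can be as small as the sketch below. It is not the runner script from this PR: `runIsolated` and `IsolatedResult` are hypothetical names, and it assumes each test file can be run on its own with the built-in Node.js test runner (`node --test <file>`).

```typescript
import { spawnSync } from "node:child_process";

interface IsolatedResult {
  file: string;
  passed: boolean;
}

// Run every test file in a fresh Node.js process so that worker threads
// started by one test cannot interfere with the next one.
function runIsolated(
  files: string[],
  nodeArgs: string[] = ["--test"]
): IsolatedResult[] {
  return files.map((file) => {
    const result = spawnSync(process.execPath, [...nodeArgs, file], {
      stdio: "inherit",
    });
    // A non-zero exit code (or a crash, where status is null) counts as a failure.
    return { file, passed: result.status === 0 };
  });
}
```

A wrapper script could call `runIsolated()` with the list of spec files and set `process.exitCode = 1` if any result has `passed === false`. The cost is one process start per test, which is the "a bit silly" part.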

What is left:

1. Some very recent bug is breaking multi-worker setup (or maybe just my test script)
   - I will track this down first.
   - We can also add a multi-worker test to the "unbuilt Playground CLI tests" and maybe to the built package tests. But these could be done in a follow-up PR.
2. The asyncify sqlite3 tests that are consistently failing
3. The Playground CLI automated tests that are currently disabled entirely because Vitest doesn't work well with the embedded Playground workers.

@adamziel, if you are still interested in helping with this PR, 2 and 3 are up for grabs at the moment. We could actually fix these things or punt for a short time to enable merging XDebug and Blueprints v2 for Playground.

brandonpayton · 2025-06-22T00:22:35Z

I also haven't started looking at review comments because I've been digging into test failures. IIRC @adamziel said they weren't blockers, but I still intend to address them, even if in a follow-up PR.


brandonpayton · 2025-06-22T05:53:13Z

Recent work:

To address issues in the test-built-npm-packages tests:
- Added an async disposal chain from RunCLIServer -> PHPWorker -> PHPRequestHandler -> PHPProcessManager
- Reworked the ES module tests to run a separate process per test.
  - Uses a custom runner script that invokes the tests using the Node.js builtin test framework
  - No test cases were passing with Vitest. There appeared to be errors related to Vitest's use of tinypool, regardless of what pool type was selected.
  - I don't like the custom runner, but it appears to work so we don't have to give up the tests while merging this, XDebug, and the Blueprints v2 PRs.
  - We could do more debugging to find the issue with repeated calls to runCLI() in the same process, but I suspect we have more important things to work on. The most annoying thing about the issue is that it makes deeper unit testing harder. But we could write more scripts to do integration testing using the actual CLI program rather than the runCLI() function.
Extended the unbuilt-Playground-CLI tests to test single-worker Asyncify, single-worker JSPI, and multi-worker JSPI runs.
- It seems like we might have some kind of race going on with multi-worker init, and @adamziel's comment about where we might use bootPHPRequestHandler() instead of bootWordPress() may be a hint about the reason.
- That said, the unbuilt CLI tests run the same multi-worker invocation twice and pass, so I'm not sure what may be going on yet.
Started skipping the asyncify-sqlite3 tests because they appear to be leading to GH Actions crashes.
Left the cli-run.spec.ts tests disabled. There appear to be conflicts between Vitest and the PHP worker threads. We'll need to find a solution to re-enable these.
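The async disposal chain mentioned above (RunCLIServer -> PHPWorker -> PHPRequestHandler -> PHPProcessManager) can be pictured with the sketch below. It is a model only: `AsyncTeardown`, `ChainLink`, and `buildChain` are hypothetical, the real classes do far more, and the choice to tear down the inner layer before the outer one finishes is an assumption rather than something the PR states.

```typescript
interface AsyncTeardown {
  dispose(): Promise<void>;
}

class ChainLink implements AsyncTeardown {
  private disposed = false;
  private readonly name: string;
  private readonly log: string[];
  private readonly inner?: AsyncTeardown;

  constructor(name: string, log: string[], inner?: AsyncTeardown) {
    this.name = name;
    this.log = log;
    this.inner = inner;
  }

  async dispose(): Promise<void> {
    // Idempotent: disposing twice must not tear anything down twice.
    if (this.disposed) return;
    this.disposed = true;
    // Wait for the layer below before this layer reports completion.
    if (this.inner) await this.inner.dispose();
    this.log.push(this.name);
  }
}

// Build the chain from the innermost layer outwards,
// using the layer names from the comment above.
function buildChain(log: string[]): ChainLink {
  const processManager = new ChainLink("PHPProcessManager", log);
  const requestHandler = new ChainLink("PHPRequestHandler", log, processManager);
  const worker = new ChainLink("PHPWorker", log, requestHandler);
  return new ChainLink("RunCLIServer", log, worker);
}
```

Awaiting `buildChain(log).dispose()` in a test teardown gives every layer a chance to finish cleanup before the process is expected to exit.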

@adamziel I think we are in a place where we could merge this and then create a follow-up PR to address review comments. I also have an idea about what might be leading to Asyncify fd_close() crashes, so we can see about that as well. What do you think? Should we merge this, merge XDebug and Blueprints v2 support, and refine after?

brandonpayton · 2025-06-22T06:01:15Z

Edited the previous comment to add this line:

Left the cli-run.spec.ts tests disabled. There appear to be conflicts between Vitest and the PHP worker threads. We'll need to find a solution to re-enable these.

adamziel · 2025-06-22T06:09:26Z

Let's merge :)

adamziel · 2025-06-22T06:12:18Z

Also, if vitest is so problematic, we could move to another library entirely (in a follow-up pr)

@brandonpayton

## Motivation for the change, related issues

#2231 overrides `FS.hashAddNode` with `function hashAddNodeIfNotSharedFS(node)` where additional logic applies if `is_shared_fs_node(node)` is true. Only NODEFS nodes were supposed to be considered as coming from a shared fs. Unfortunately, the internal logic of `is_shared_fs_node()` also returned true for MEMFS nodes. This caused a FS error 44 for the following operation where `runtime2` attempts to create a directory in a `/wordpress` directory mounted from `runtime1`:

```ts
import { loadNodeRuntime } from "@php-wasm/node";
import { getLoadedRuntime } from "@php-wasm/universal";

const opts = { emscriptenOptions: { ENV: { DOCROOT: '/wordpress' } } };

const runtime1 = getLoadedRuntime(await loadNodeRuntime('8.3', opts));
runtime1.FS.mkdir("/wordpress");

const runtime2 = getLoadedRuntime(await loadNodeRuntime('8.3', opts));
runtime2.FS.mkdir("/wordpress");
runtime2.FS.mount(
	runtime2.PROXYFS,
	{ root: '/wordpress', fs: runtime1.FS },
	'/wordpress'
);

// This works:
// runtime1.FS.mkdir("/wordpress/wp-content");

// This doesn't:
runtime2.FS.mkdir("/wordpress/wp-content");
```

Specifically, the FS error 44 was triggered inside `is_shared_fs_node()` when calling NODEFS operations on these non-NODEFS nodes.

## Implementation details

Adds a check confirming the shared node comes from NODEFS.

## Testing Instructions (or ideally a Blueprint)

* Confirm the reproduction above works without errors.
* Once #2285 lands, we'll be able to add a unit test

cc @brandonpayton

brandonpayton

I grabbed the unresolved concerns from this PR and added them to the follow up issue here:
#2293

Planning to work on that issue next.

packages/php-wasm/node/src/test/php-asyncify-sqlite3.spec.ts

packages/php-wasm/node/src/lib/file-lock-manager-for-node.ts

packages/php-wasm/node/src/test/file-lock-manager-for-node.spec.ts

packages/php-wasm/node/src/lib/file-lock-manager-for-node.ts

packages/php-wasm/compile/php/php_wasm.c

packages/php-wasm/compile/php/phpwasm-emscripten-library-file-locking-for-node.js

packages/playground/cli/src/cli.ts

packages/playground/cli/src/test/cli-run.spec.ts

packages/php-wasm/node/src/test/file-lock-manager-for-node.spec.ts

brandonpayton and others added 30 commits March 12, 2025 14:33

Begin shared/writer file locking implementation

fc589cb

Implement F_SETLK using cross-platform flock()

edf4ed9

Call builtin fcntl implementation from php_wasm.c

203b1ec

Simplify async syscall

968aaaa

Remove unnecessary stdarg.h include

8601c32

WIP: Stub a FileLockManager that will later be shared among workers

87238a3

Fix system call varargs reading

e00e617

Avoid type error with default switch case

f485a8a

Merge branch 'trunk' into add-fcntl-for-nodejs

1288e81

Move comlink API helpers from php-wasm web to universal

feff197

Also add support for Node.js workers

Fix comlink API helpers to work for both node and web

72bb46c

Fix php-wasm/web exports of comlink API helpers

1de9f84

Move to PHP worker thread

2448d15

Remove unused import

24c9112

Adjust consumeAPI signature to support exposing API from parent to worker

7fc9c9f

Make file lock manager node-specific for now

334a6c5

Bump comlink dep to try to resolve apparent listener leak

1d9f94a

Allow exposing API to specific endpoint

ad26310

Stop exporting non-existing types from php-wasm/universal

13b040a

Include package-lock.json from comlink upgrade

0daddcc

Expose file lock manager to worker

38f4455

Hack together a boot method for secondary workers

f4e6f18

Load balance requests between multiple workers

9319e9b

Add note about broken secondary worker request routing

6a3d9cf

Remove import of nonexistent type

d1e82d2

Remove some cruft and a pointless comment

433bcc3

Add a couple of notes

80ebb33


brandonpayton added 5 commits June 21, 2025 00:20

Make PHPWorker asyncDisposable and use in testing built npm packages

dabf3da

Fix remote Playground disposal

5214b63

Fix let-should-be-const lint error

666c2c8

Avoid undefined-null-to-object error in PHP.exit

a4a116d

Terminate all workers when CLI disposed

d8a75c3

Fix test-built-npm-packages tests

8d515dc

brandonpayton added 6 commits June 22, 2025 00:06

Only log resolved WP release if downloading WP

e948fd1

Test unbuilt multi-worker

a886826

Skip the sqlite-asyncify tests until we solve unexpected GH Actions termination

04fd547

Fix Node version check in unbuilt-jspi target

26dc841

Restore package install in unbuilt CLI test

f51eb69

Merge branch 'trunk' into add-fcntl-for-nodejs

463f820

brandonpayton merged commit ff727fb into trunk Jun 22, 2025
73 of 75 checks passed

brandonpayton deleted the add-fcntl-for-nodejs branch June 22, 2025 17:07

brandonpayton mentioned this pull request Jun 22, 2025

Follow-up on "Support multiple workers for NODEFS /wordpress mounts" #2293

Open

24 tasks

adamziel mentioned this pull request Jun 24, 2025

PHP Node: Only consider NODEFS to be a shared filesystem #2300

Merged

brandonpayton mentioned this pull request Jun 27, 2025

Relay correct source paths in DWARF debug info #2303

Merged


adamziel mentioned this pull request Jul 5, 2025

Should we support inter-process file locking? emscripten-core/emscripten#23697

Open
