-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Use the NODEFS filesystem in statfs when available #23912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the NODEFS filesystem in statfs when available #23912
Conversation
Can you give an example of the behavior that you are trying to fix? |
Right, if you could update #if NODEFS || NODERAWFS
// In nodefs and noderawfs assert that the values returned are correct.
// In wasmfs and memfs, `fstatfs` is a stub so maybe the values there are expected to be wrong?
#endif |
I'm running into this issue in WordPress Playground. When PHP calls
I would love to add a test, but I'm not sure how to do it in Emscripten. In the test each FS will return the default values. I checked it by adding To properly test NODEFS and NODERAWFS I would need to have real OS data in the response, but I'm not sure how test that.
The issue doesn't come up in NODERAWFS because when used it's the default FS, unlike NODEFS which isn't the default. |
I see |
I definitely need to research this more. |
I will get back to this PR next week. After I have the test, I will explore how it could be improved. |
src/lib/libfs.js
Outdated
// file system which doesn't have a statfs function. | ||
// To ensure statfsNode has access to the statfs function, we pass in | ||
// the NODEFS object. | ||
if (this.filesystems.NODEFS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
statfs
should figure out which filesystem the path is in and then return the filesystem info for that filesystem.
i.e. if path
is in the MEMFS you should get the memfs info. If path
is a NODEFS you should get the node FS info.
The presense of NODEFS alone should not be a factor, right? The question should be "is path part of a node FS mount" I think,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this for a week and I'm not sure how to handle it, so let me start with our usecase and why the above approach doesn't work for us.
WordPress Playground mounts all files into MEMFS because some files like the WordPress code are being downloaded from remote URL while a plugin might be imported from a local folder. So, the resulting codebase that we are running inside Emscription is composed out of multiple data sources.
Curently when the code asks for available disk space using statfs("/")
it get's the default values because /
is a MEMFS directory.
I was able to work around this by overriding statfs
during onRuntimeInitialized
.
PHPRuntime.FS.root.node_ops = {
...PHPRuntime.FS.root.node_ops,
statfs: PHPRuntime.FS.filesystems.NODEFS.node_ops.statfs,
};
PHPRuntime.FS.root.mount.opts.root = '/';
Checking what FS the path belongs to, makes sense overall, but in some cases like the above it would be great if we could override it and force a FS to be used.
Specifically for statfs
this makes sense to force NODEFS, because it only works in NODEFS (and NODERAWFS).If the requested path is available in that NODEFS it would be reasonable to override MEMFS and return the result from NODEFS.node_ops.statfs.statfs
.
What do you think? How would you approach our usecase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curently when the code asks for available disk space using
statfs("/")
it get's the default values because/
is a MEMFS directory.
Which code is doing this? Asking for the available disk space by using statfs("/")
seem odd since normally the root FS has very little space in it right? Where is that code that is doing this? Why is it using /
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But won't the statfs
from node give very different results if you stat /
vs /tmp
vs /home
, for example. What is the code in question interested in exactly?
} catch (e) { | ||
// Older versions of Node don't support statfsSync | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is only needed for the one test I would just do this in fstatfs.c
itself. You can just do it in an EM_ASM or EM_JS block on startup?
I found a way to work around this issue in our project. The root cause of the issue is that if I run To work around this I used the The resulting solution looks similar to this:
|
The way Emscripten handles the root path ( It would be reasonable to always force the use of |
But why you expect The root fs is normally not very big, and so its not a good proxy for how much free space there is, for example, in writable locations such as |
This might be specific to our use case. Playground allows you to run WordPress sites in WASM and it needs to work just as if you installed Apache/Nginx on your machine. This specific case could be resolved by us using NODERAWFS, but we need MEMFS for some Playground features, so NODERAWFS isn't an option for us.
In our case, |
But you are aware that it returns the info on the root partition only right? This partition is not normally writable by normal users whos data often lives in a separate |
How do you mean info on the root partition only? From what I understand fs.statfsSync returns the disk space of the underlying volume, so both /home and /tmp will return the same result in case they are on the same volume. |
Right, but for most/many UNIX/linux machines the root volume |
That's ok. This is why my initial suggestion was to use the NODEFS |
## Motivation for the change, related issues This PR adds `statfs` support to the Emscripten root file system to enable PHP-wasm Node to get real filesystem stats for the `/` path. Currently, in PHP-wasm Node, PHP functions like `disk_total_space('/')` return the default hardcoded value for MEMFS instead of the actual disk space. This happens because Emscripten automatically detects the filesystem for a given path, and because the root path always uses the MEMFS filesystem, Emscripten will use the MEMFS `statfs` implementation. PHP-wasm Node includes two Emscripten filesystems, MEMFS and NODEFS. MEMFS doesn't have access to the operating system's filesystem, so it can only return a hardcoded value. NODEFS has access to the OS filesystem through the `fs` module, and specifically for `statfs` it uses `fs.statfsSync` to get real data. ## Implementation details This PR uses Emscripten's `onRuntimeInitialized` event to override the `statfs` implementation in the Emscripten root FS with the NODEFS implementation of `statfs`. It also defines the `FS.root.mount.opts.root` path as `.` to ensure the root node has a path value defined. Otherwise when we call `FS.statfs('/')` it would use the root node's root path which is `undefined` and this would throw an error because it would call `fs.statfsSync(undefined)` in the NODEFS implementation of `statfs`. I explored fixing this [issue in Emscripten](emscripten-core/emscripten#23912), but I wasn't able to find a working solution there, so I ended up patching it in Playground. ## Testing Instructions (or ideally a Blueprint) - CI
🚧 WIP
statfs
always returns default values with NODEFS, because the default file system is MEMFS which doesn't supportstatfs
.In this PR we detect if NODEFS is available and use it as the node when calling
statfsNode
.This ensures
statfsNode
has access to NODEFS and the requested path.statfs
is only supported in NODEFS and NODERAWFS, but this change won't affect NODERAWFS because when used it's the default file system and it has it's own implementation of statfsNode.Testing instructions
I tested this by running
./test/runner other.test_unistd_fstatfs
, but that only ensures we are getting values, the test doesn't check for real values.