-
Notifications
You must be signed in to change notification settings - Fork 37
Return file creation times on Linux (using statx) #248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Rust's standard library returns file creation times on Linux (at least with glibc), and now rust-std will too. File creation times are available on Linux via the `statx` syscall (not `fstatat`), introduced in kernel 4.11, which was released in April 2017. The Rust standard library already uses `statx` on Linux with glibc. Making cap-std support file creation times on Linux required two changes: 1. `File::metadata()` uses the standard library (which uses `statx`), but this wouldn't ever copy over the created field on Linux. Now, any time the created field is set in the std Metadata struct, it's also set in the cap-primitives Metadata, regardless of platform. 2. `stat_unchecked` is used in several places, including fetching DirEntry metadata. Before, it called `fstatat` directly. Now, it calls `statx` on Linux when available, and it falls back to `fstatat` otherwise. Fortunately, Dan Gohman (@sunfishcode) had already added a method to convert `statx` results in commit d1fa735 (PR bytecodealliance#105) in 2020. This commit also adds a new test to make sure file creation times are set in cap-std Metadata if they are set in std Metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for working on this!
tests/metadata-ext.rs
Outdated
let std_supports_created = matches!( | ||
a.into_std().metadata(), | ||
Ok(m) if m.created().is_ok(), | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using the modified()
timestamp for expected
, could you use the created()
timestamp for expected
? And for the directories, could we always call std::fs::metadata
so that we always get the created()
timestamp from std? It seems like that should eliminate the need for a tolerance, because the creation time from cap-std should be exactly the same as the creation time from std.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that makes more sense than what I had done. Updated.
tests/metadata-ext.rs
Outdated
@@ -83,3 +83,55 @@ fn test_metadata_ext() { | |||
); | |||
} | |||
} | |||
|
|||
#[test] | |||
fn test_metadata_ext_created() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test in the right / a reasonable place? I wasn't sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good question. metadata-ext.rs is for MetadataExt
APIs, but the created()
field is a regular Metadata
API. This can go in fs_additional.rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
); | ||
match statx_result { | ||
Ok(statx) => return Ok(MetadataExt::from_rustix_statx(statx)), | ||
Err(rustix::io::Error::NOSYS) => HAS_STATX.store(false, Ordering::Relaxed), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw a comment in open_impl.rs
about seccomp on Android making ENOSYS a bad idea. I don't think that applies here yet, but it might later. Is there a better approach? Should I place a comment on this landmine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust's own standard library calls statx
without doing an explicit version check, so we're ok on Androidd here.
Looking into it though, I notice that there is a potential problem with statx
returning EPERM
on old versions of Docker; std does a special check to avoid this problem. If you want to implement that same trick here, that'd be great, though since rustix also needs to do this, it's also fine to leave this code as-is for now, and I'll handle this when I update rustix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. It seems like if we merge this PR now, it'd break metadata for not-very old versions of Docker, so that's probably bad.
Why/where does rustix need this logic? I thought it just wrapped statx
and doesn't really care whether or not statx
succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rustix uses statx
in its implementation of rustix::fs::stat
. Rustix usually doesn't do that kind of emulation, but in this case, it's motivated by a desire to be y2038-ready on platforms which support that. On Linux, that means using statx
instead of the native stat
family of syscalls whenever possible. So if you want to fix this now, that'd be great, but I am also happy to fix this myself at the same time that I fix rustix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't see that earlier because I was searching through an older version, oops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just simulated the seccomp statx -> EPERM behavior on my system using this Python script (as root):
import errno
import os
import seccomp
f = seccomp.SyscallFilter(seccomp.ALLOW)
f.add_rule(seccomp.ERRNO(errno.EPERM), "statx")
f.load()
os.execl(
"target/debug/deps/fs_additional-d7e3e72f4c824747",
"fs_additional",
"--nocapture",
"--test-threads=1",
"metadata_vs_std_fs",
)
It's not quite as accurate as getting an old Docker version, but I'd rather not do that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated stat_unchecked
to handle EPERM
. I couldn't do the same EFAULT
trick as Rust std
uses (I think it'd require unsafe), but I pushed the test off the critical path. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm considering adding something in Rustix to allow or the null-pointer EFAULT
trick, but for now the code you have here looks good.
@@ -3,6 +3,14 @@ use rustix::fs::{statat, AtFlags}; | |||
use std::path::Path; | |||
use std::{fs, io}; | |||
|
|||
// TODO: update all these to | |||
// #[cfg(any(target_os = "android", target_os = "linux"))] | |||
// once we're on restix >= v0.34.3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea when the rustix version will be updated? I don't have a sense of how big that is, so I didn't want to add it to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The biggest thing for cap-std is that there are some semver incompatibilities between versions of io-lifetimes, which is tracking changes from I/O safety stabilization. Fortunately, the FCP looks to be almost done. Once it's done, I plan to do an io-lifetimes release and then a rustix 0.35 release, and then update cap-std and various other things.
|
||
let std_dir = check!(dir.into_std_file().metadata()); | ||
let std_file = check!(file.into_std().metadata()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is much nicer now, I think. However, it's a little funny as is. It does all this setup to extract metadata in various ways from cap_std and std, but then it only compares the created times of the metadata and nothing else. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one hand, creation times have an interesting enough implementation landscape that I feel it's justified to have a fair amount of test setup just for that one purpose, so this seems fine. On the other, since we have all this setup here, it'd be nice to expand this to test all of Metadata
's accessors to make sure they all match. accessed()
might need to be a >=
check instead of equality because files can get spurious access-time updates.
I don't want to bog you down if you're working on a project which just needs the creation times fix. So I'll leave it up to you for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I expanded this out for all of the accessible Metadata (with only Unix extensions for now).
// The standard library returns file format bits with `mode()`, whereas | ||
// cap-std currently doesn't. | ||
assert_eq!(std.permissions().mode() & 0o7777, cap.permissions().mode()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was unexpected and a bit surprising to me. Is it intentional? If so, is that documented somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cap-std's intention to match std's behavior at least, so it looks like a bug in cap-std. I filed #249 to track this.
// The access times might be a little different due to either our own | ||
// or concurrent accesses. | ||
const ACCESS_TOLERANCE_SEC: u32 = 60; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't love this but didn't want to assume we made the cap or std access first, and I wanted to check the access times weren't wildly far into the future either. I guess it's pretty easy to find and update later if it causes trouble.
I had used `cfg!(unix)` instead of `#[cfg(unix)]` in 2eca023, which doesn't conditionally compile out the Unix-specific code.
Thanks for all your guidance with this, @sunfishcode, especially since there's a lot of relevant context that I wasn't aware of when I started. I think the code improved with each iteration, and it's nice that the expanded test caught another issue right away. Cheers. |
This is now released in cap-std 0.25. |
Rust's standard library returns file creation times on Linux (at least with glibc), and now cap-std will too.
File creation times are available on Linux via the
statx
syscall (notfstatat
), introduced in kernel 4.11, which was released in April 2017. The Rust standard library already usesstatx
on Linux with glibc.Making cap-std support file creation times on Linux required two changes:
File::metadata()
uses the standard library (which usesstatx
), but this wouldn't ever copy over the created field on Linux. Now, any time the created field is set in the std Metadata struct, it's also set in the cap-primitives Metadata, regardless of platform.stat_unchecked
is used in several places, including fetching DirEntry metadata. Before, it calledfstatat
directly. Now, it callsstatx
on Linux when available, and it falls back tofstatat
otherwise. Fortunately, Dan Gohman (@sunfishcode) had already added a method to convertstatx
results in commit d1fa735 (PR Fixstat
when the last path component is a symlink to..
. #105) in 2020.This commit also adds a new test to make sure file creation times are set in cap-std Metadata if they are set in std Metadata.