Conversation
Ooh, this will have to be rebased on top of #1. I am happy to do that! But since it will involve changing stuff sort of in the middle of the series, I'm going to wait until you review what's here, and then if you have any feedback, I can handle both the rebase and the feedback at the same time.

(After reading over that code, I have some stuff that will make your life easier over there too. But again, I'll wait for that.)
anonrig left a comment:

Amazing work, thank you for your feedback. I had some questions, but I'm very happy to see this. One of my last questions: why does the usage of .context only exist in the cli crate and not in the other crates? Should they share the same behavior?
I replied to a few things but I have to run; I'll put more later and update the PR as well.

Left more comments. I am going to rebase and add more commentary on the stuff from #1 tomorrow.

I love open source!
Previously, this code said "Hey, I will take anything that can be
converted into a &Path," and now it says "Hey, I can take anything that
can be turned into a PathBuf."

The former is not *wrong*, but it is a bit roundabout. You want to store
a PathBuf, so just take anything that can be turned into a PathBuf. Not
only is this a little simpler, but it can also save you an allocation,
at least conceptually; I don't know if rustc is smart enough to remove
the allocation in code like this:
```rust
let path = PathBuf::from("/etc/passwd");
let manager = RegistryManager::new(&path);
```
With the previous version of the code, this would convert a PathBuf into
a &Path by taking a pointer to it, and then create a whole new PathBuf
out of it by allocating. This way, we can still accept the same inputs,
but if we already have a PathBuf, we skip the round trip through a
pointer. In other words, you could pass `path` directly in the code
above, whereas you couldn't with the original version.
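To make the difference concrete, here's a minimal sketch of the two signatures; the struct body and method names are illustrative, not the project's actual definitions:

```rust
use std::path::{Path, PathBuf};

struct RegistryManager {
    path: PathBuf,
}

impl RegistryManager {
    // The previous shape (sketch): borrow as &Path, then allocate a
    // fresh PathBuf from that borrow, even if the caller had one.
    fn new_borrowing(path: impl AsRef<Path>) -> Self {
        Self { path: path.as_ref().to_path_buf() }
    }

    // The new shape (sketch): take anything convertible into a PathBuf;
    // an owned PathBuf moves straight in with no new allocation.
    fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into() }
    }
}
```

With the new shape, `RegistryManager::new(path)` consumes the PathBuf directly, and `RegistryManager::new("/etc/passwd")` still works because `&str` converts into `PathBuf`.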
The *only* thing &String can meaningfully do over &str is examine the capacity of the underlying string, and you aren't doing that. This signature is more general, and more idiomatic.
This code works via a feature called "Deref coercion." Basically: if T implements Deref<Target = U>, then a &T will be coerced into a &U where a &U is expected. This means we can pass a &String to a function expecting a &str, and it will Just Work, because String implements Deref<Target = str>. This is also why the previous commit works and is more general; maybe I should have said this there!
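A tiny self-contained illustration of that coercion:

```rust
// Takes the more general &str; callers holding a String can still use it.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let owned = String::from("hello");
    // &String coerces to &str because String: Deref<Target = str>.
    println!("{}", shout(&owned));
    // A string literal is already a &str, so it works directly too.
    println!("{}", shout("world"));
}
```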
Format strings are a relatively new addition to Rust, so that's probably why you didn't find them. Also, when creating a String from a string literal, the standard style is to use .to_string, and there's a pretty significant minority style (which I personally prefer) of using String::from. But given that you're doing concatenation, this is what I would personally consider idiomatic Rust post-1.58.
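For example, here's the post-1.58 captured-identifier form alongside the older styles:

```rust
fn main() {
    let name = "pacquet";
    // Concatenation: works, but gets noisy as the pieces multiply.
    let a = "hello ".to_string() + name;
    // Pre-1.58 format string: positional placeholder.
    let b = format!("hello {}", name);
    // Post-1.58: the identifier is captured right inside the braces.
    let c = format!("hello {name}");
    assert_eq!(a, b);
    assert_eq!(b, c);
}
```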
When working with filesystems, especially, code that vaguely looks like
this:
```
if !a_file.exists() {
    do_something_to_create_file()
}
```
has a bug, of varying degrees of severity. This is called "Time-of-check
to time-of-use," and the problem is roughly:
```
if !a_file.exists() {
    // What happens if we get here, but before
    // do_something_to_create_file is called, we
    // swap to another thread, and another process creates
    // a_file, possibly by doing something fancy with symlinks?
    do_something_to_create_file()
}
```
I don't think there's an actual problem here, but it's better just to
"confidently" call these functions rather than try to get fancy. That
way the creation is "atomic," if you will.
For more: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
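For directories, for example, the "confident" call looks like this minimal sketch (the path is illustrative):

```rust
use std::fs;

fn main() -> std::io::Result<()> {
    // No exists() check first: create_dir_all succeeds whether or not
    // the directory is already there, so there's no window between the
    // check and the use for another process to sneak into.
    fs::create_dir_all("cache/packages")?;
    Ok(())
}
```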
Now, the commit line says "partially" because
```
if !extract_destination.exists() {
let _ = download_and_extract(
```
This is doing the same thing, but modifying all of that code to also be
okay with being called this way is a larger lift, and you don't have any
tests, so I am nervous about breaking something :)
You're already returning Result, so it's best not to panic. I decided to try to keep this in the same style you currently have, with a String living in the error type, but you could also just stash the io::Error in there too. Also, I didn't make this change in this diff because it is up to you, but I would not make a new module for the error; I would put it in lib.rs. You're not really buying anything by moving it in there, just making yourself import it when you don't have to. Up to you though!
You were even mapping these to a Result, only to call unwrap on them!
You know the drill by now :)
As your application grows here, I think you'll come to appreciate
anyhow. A common suggestion is "thiserror for libraries, anyhow for
applications," and so we're following that. You're effectively putting
main in the library, which isn't bad, but it's why this is not
*strictly* following the letter of that advice, but instead, the spirit.
.context will give you output that looks like this:
```
Error: current directory should exist
Caused by:
No such file or directory (os error 2)
```
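For reference, a minimal sketch of the call that produces output in that shape, using the message from the example above:

```rust
use anyhow::Context;

fn main() -> anyhow::Result<()> {
    // .context attaches the human-readable line; the underlying
    // io::Error becomes the "Caused by:" entry in anyhow's report.
    let cwd = std::env::current_dir().context("current directory should exist")?;
    println!("{}", cwd.display());
    Ok(())
}
```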
I also tweaked the error messages, since they assumed that the directory
not existing is the only possible issue. For example, maybe the problem
is permissions, in which case the error would look like this:
```
Error: current directory should exist
Caused by:
Permission denied (os error 13)
```
which is uh, wrong. So as of this commit, this will instead say
```
Error: problem fetching current directory
Caused by:
Permission denied (os error 13)
```
or
```
Error: problem fetching current directory
Caused by:
No such file or directory (os error 2)
```
The other two messages are already pretty good, I'd say.
I think this looks nicer. 100% subjective, of course :)
Okay, so this is, in my mind, slightly better error handling. Instead of erasing everything to a String, you're actually holding onto the various error types. This would allow your users to handle errors in a more granular way if they wished. I struggled a bit with naming here. Network IO is still IO, so these names feel a bit redundant. However, I suspect separating the two does make sense, as what you'd do to fix them is different. And merging them into one grand IO error is actually a huge hassle and not helpful for users. So I kept it this way. It's up to taste, really.
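A hedged sketch of that shape; the enum and variant names are illustrative, not the crate's exact definitions:

```rust
use thiserror::Error;

#[derive(Error, Debug)]
pub enum TarballError {
    // The underlying error types are kept, not flattened into a String,
    // so callers can match on them and react differently to each.
    #[error("network error while fetching the tarball")]
    Network(#[source] reqwest::Error),
    #[error("io error while writing the tarball")]
    Io(#[source] std::io::Error),
}
```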
As written, this would look stuff up twice; instead, let's do that once.
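The general pattern, as a sketch (the map and key are stand-ins for whatever the real code consults):

```rust
use std::collections::HashMap;

fn main() {
    let map = HashMap::from([("serde", "1.0")]);
    // Two lookups: contains_key hashes and walks the map, then get
    // does the exact same work again.
    if map.contains_key("serde") {
        println!("{}", map.get("serde").unwrap());
    }
    // One lookup: get already tells us whether the key was present.
    if let Some(version) = map.get("serde") {
        println!("{version}");
    }
}
```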
We don't need an owned version of this to call this method.
Given that you're exporting this enum, you have a choice to make:

1. never add another variant unless you want to increase your semver major version, or
2. make it a non-exhaustive enum.

The latter is a good idea for errors, generally.
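The attribute itself is one line on the sketch enum from above:

```rust
// #[non_exhaustive] means downstream crates must include a wildcard arm
// when matching, so adding a variant later is not a breaking change.
#[non_exhaustive]
pub enum TarballError {
    Network(reqwest::Error), // variants illustrative, as before
    Io(std::io::Error),
}
```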
So, because we do some work in the conversion of these error messages, *technically* this way is slightly more efficient. You see, with `or`, we unconditionally run the "generate the error" code. But with `or_else`, we only invoke that code if we actually hit the error. So in effect, this moves the allocation to the error path only, which is nice.
Turns out that now that we're using `or_else`, clippy recognizes that there's a better method on Result, `map_err`, for this task.
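The progression in miniature, with a hypothetical parse as the fallible step:

```rust
fn read_port(s: &str) -> Result<u16, String> {
    // Step 1: `or` builds the error String eagerly, on every call:
    //     s.parse().or(Err(format!("bad port: {s}")))
    // Step 2: `or_else` defers the allocation to the error path:
    //     s.parse().or_else(|_| Err(format!("bad port: {s}")))
    // Step 3: clippy points out map_err is the purpose-built method:
    s.parse().map_err(|_| format!("bad port: {s}"))
}

fn main() {
    assert_eq!(read_port("8080"), Ok(8080));
    assert!(read_port("not-a-port").is_err());
}
```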
I really prefer using at least unwrap here. If this never happens, they're equivalent, but if it does happen, with the old code, you silently continue, and with the new code, you blow up.
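In sketch form (the file name is a stand-in):

```rust
use std::fs;

fn main() {
    // Old shape: the Result is discarded, so on failure we silently
    // continue as if the write had happened.
    let _ = fs::write("registry-cache.json", "{}");
    // New shape: identical when it succeeds, but a loud panic carrying
    // the io::Error when it doesn't.
    fs::write("registry-cache.json", "{}").unwrap();
}
```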
Force-pushed from 2643ef5 to 9387919.
Now that we've implemented From, this is no longer needed. Nice catch @anonrig!
I have rebased and made sure each commit builds and passes.
They should not. The answer is kind of long and philosophical, sorry about that. context is in anyhow: "anyhow for binaries, thiserror for libraries". But why is that the advice? Binaries tend to catch errors, and libraries tend to produce errors. Everything else derives from this.

As a library, it's our job to sort out the error API and then present exactly that. So you figure out the broader structure of your errors, and then internally convert things into those errors where appropriate, and return them. We're passing these things around, so we care about efficiency. This is what thiserror is for.

As a binary, we take those errors, but generally either do something right away, or throw the error upstream until we print out some information about it. So we tend to care about whether we can do something other than blow up at an error, and, if we do decide to blow up, about a way to figure out where the error came from within our application's code. Also, because we are blowing up, we don't really care about efficiency; this happens once and is a cold path. This is what anyhow is for.

Does all of that make sense? It's a vibe thing on some level, for sure.
thiserror provides a way to generate from implementations. This commit demonstrates using this feature. Nice catch, @weihanglo!
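Concretely, swapping #[source] for #[from] on the sketch enum from earlier generates the From impls, so `?` converts automatically (names still illustrative):

```rust
use thiserror::Error;

#[derive(Error, Debug)]
#[non_exhaustive]
pub enum TarballError {
    #[error("network error while fetching the tarball")]
    Network(#[from] reqwest::Error),
    #[error("io error while writing the tarball")]
    Io(#[from] std::io::Error),
}

// With From derived, `?` does the conversion; no manual map_err needed.
fn read_cached(path: &std::path::Path) -> Result<Vec<u8>, TarballError> {
    Ok(std::fs::read(path)?) // io::Error -> TarballError::Io via From
}
```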
Force-pushed from 521816a to f8f2d93.
I do not know how you want to configure the tool, so if you could either add it or let me know, that would get this green and ready to go. :)

I think this is ready to go! Thanks!

From a Rust newbie, this was enlightening; great work @steveklabnik! 😁 Also congrats to @anonrig on the project. 👍

Echoing what others said! This PR and its explanations were great, @steveklabnik, and thanks @anonrig for sharing the repo so it could happen here! I did a stream today where I walked through the PR and discussed the changes in each commit. Basically just a video walkthrough of this PR, if anyone is interested:
Two perf fixes from investigations/pacquet-macos-perf.md that both collapse redundant passes over the tarball bytes.

1. Stream the HTTP body through the gzip decoder (doc fix #4). Replace `response.bytes().await` + `decompress_gzip` (sync zune-inflate) + separate `Integrity::check(&buffer)` with a streaming pipeline: `bytes_stream()` teed through `ssri::IntegrityChecker` into `async_compression::tokio::write::GzipDecoder`. This removes one full pass over the compressed body (no separate `.check()`), never holds compressed + decompressed buffers at the same time, and lets decompression overlap with network awaits. Integrity is still verified end-to-end before any CAFS blob is written, so a mismatch aborts the install with the same `TarballError::Checksum` behavior as before. (A sketch of this topology follows below.)

2. Interleave SHA-512 with the per-entry read (doc fix #2). Add `StoreDir::write_cas_file_prehashed`; the tar extraction loop reads + hashes 64 KiB chunks in one pass instead of `read_to_end` followed by a separate `Sha512::digest(buffer)` inside `write_cas_file`. The buffer still has to exist (the CAFS filename is hash-derived), but two passes over the decompressed entry bytes collapse into one. Mirrors pnpm's `parseTarball` + `addBufferToCafs` in `store/cafs/src/addFilesFromTarball.ts`.

Why we diverge from pnpm on (1): upstream `fetching/tarball-fetcher/src/remoteTarballFetcher.ts` + `worker/src/start.ts` + `store/cafs/src/addFilesFromTarball.ts` buffers the whole tarball into a Node `SharedArrayBuffer` and runs `crypto.hash` + `zlib.gunzipSync` sequentially. Observable behavior here is identical (same integrity errors on mismatch, same decompressed bytes, same CAFS layout, same `index.db` rows), so per the "Internal performance divergence is allowed" clause added in the preceding commit, the streaming topology is a pacquet-specific implementation-detail optimization rather than a shape port.

Dep changes: swap `zune-inflate` for `async-compression` (gzip + tokio features) + `futures-util`; enable reqwest's `stream` feature; add `sha2` to the tarball crate.

Tests: existing `packages_under_orgs_should_work` (real-network end-to-end) and `should_throw_error_on_checksum_mismatch` both pass unchanged, proving the streaming pipeline produces the same decompressed bytes and fails integrity checks the same way.
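A hedged sketch of the tee topology described in (1); the function shape and names are illustrative, not pacquet's actual code, and it assumes reqwest's `stream` feature is enabled:

```rust
use async_compression::tokio::write::GzipDecoder;
use futures_util::StreamExt;
use tokio::io::AsyncWriteExt;

async fn fetch_and_verify(
    response: reqwest::Response,
    integrity: ssri::Integrity,
) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    let mut checker = ssri::IntegrityChecker::new(integrity);
    let mut decoder = GzipDecoder::new(Vec::new()); // decompressed bytes land here
    let mut body = response.bytes_stream();
    while let Some(chunk) = body.next().await {
        let chunk = chunk?;
        checker.input(&chunk);            // hash the compressed bytes...
        decoder.write_all(&chunk).await?; // ...and decompress, in one pass
    }
    decoder.shutdown().await?;
    checker.result()?; // integrity verified before anything touches the store
    Ok(decoder.into_inner())
}
```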
Fix #2 from `investigations/pacquet-macos-perf.md` turned out to regress wall-time at 0ms loopback by ~2% across three successive CI runs on #278, even after controlling for hyperfine-ordering bias (the previous commit swapped `HEAD main` → `main HEAD`; main stayed ahead by 2%, so the bias hypothesis explains only about half the gap).

Fix #2 is the most likely culprit: it hits every tar entry (tens of thousands across the 1352-snapshot fixture) and replaces an optimized `Vec::read_to_end` + single `Sha512::digest` with a chunked read loop that does per-chunk `try_reserve`, `hasher.update`, and `extend_from_slice`. The streaming per-chunk branches add overhead that the saved second pass doesn't recover at the scale of small npm entries (a typical entry body is a few KiB, way below where a second scan of the already-hot cache starts mattering).

Revert to the pre-PR shape on the entry loop:

```
let mut buffer = Vec::new();
buffer.try_reserve(prealloc_hint).map_err(...)?;
entry.read_to_end(&mut buffer).map_err(...)?;
let (file_path, file_hash) = store_dir
    .write_cas_file(&buffer, file_is_executable)
    .map_err(...)?;
```

Remove `write_cas_file_prehashed` and the private `write_cas_file_unchecked` helper from `StoreDir`; nothing calls them now. Also drop the two prehashed-specific tests (`…_parity_with_write_cas_file` and `…_debug_asserts_hash_matches_buffer`) and the tarball crate's `sha2` dep, since the one-shot `write_cas_file` on `pacquet-store-dir` is once again the only consumer of SHA-512.

What's still left in the PR after this revert:

* compressed-side tee-hash: `bytes_stream()` + per-chunk `IntegrityChecker::input` eliminates the separate `Integrity::check(&buffer)` pass after buffering. The chunk count here is tiny (a handful per tarball), so per-chunk overhead is negligible, and the payoff (skipping one full SHA-512 over the compressed body) scales with tarball size.
* integrity check ordering before decompression, matching pnpm's `crypto.hash` → `gunzipSync` sequence.
* AGENTS.md's "internal perf divergence is allowed" clause.

The next CI bench will tell us whether the remaining tee-hash alone is clean or whether the regression persists and the PR should close.
- Extract a `snapshot_cache_key` helper for the `(integrity, pkg_id)` → cache-key derivation. Both call sites (the upfront `cache_keys` build and the warm/cold partition loop) now go through the same helper, so a future change to the resolution-type handling or key shape stays in one place. A drift between the two loops would silently misclassify warm entries as cold and quietly halve install speed.
- Document the intentional `.max(4)` floor in `configure_rayon_pool` (see the sketch below). Even on a quota-limited 1-2-CPU runner, dropping below 4 puts rayon back into the regime where one thread blocks on `clonefile` while the next ready snapshot can't start; the metadata-journal bottleneck doesn't shrink with quota, so a small intentional oversubscription is the better trade. Both the function-level doc and an inline comment by the `.max(4)` call explain the choice.
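As promised in the second bullet, a hedged sketch of the floor; the real `configure_rayon_pool` may differ in its details:

```rust
// Sketch only: assumes rayon's global pool is configured once at startup.
fn configure_rayon_pool() -> Result<(), rayon::ThreadPoolBuildError> {
    let threads = std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
        // Floor of 4: even on a 1-2 CPU quota, one thread blocked on
        // clonefile must not prevent the next ready snapshot from starting.
        .max(4);
    rayon::ThreadPoolBuilder::new()
        .num_threads(threads)
        .build_global()
}
```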
#1 (parse_shebang args leading whitespace): `splitn(2, [' ', '\t'])` discarded the whitespace separator, so an upstream-format `#!/bin/sh -e` parsed to `args="-e"`, and the rendered shim text emitted one space between prog and the first flag where upstream emits two. Replace `splitn` with a manual `find([' ', '\t'])` + `split_at` that keeps the separator in the args slice, matching upstream's regex group `(.*)$` (see the sketch below). Update `parses_direct_shebang` and `parses_env_dash_s_shebang` to assert the leading-space form so the regression direction is pinned.

#2 (idempotent skip checked only `.sh` but skipped writes for all three): if a previous install wrote `.sh` correctly but `.cmd`/`.ps1` were missing (older pacquet, partial-write crash), `already_correct` short-circuited and the upgrade path never repaired the siblings. Gate `already_correct` on the `.sh` marker AND the existence of both `.cmd` and `.ps1`. Add `link_bins_rewrites_when_only_sh_flavor_exists`, which deletes the two siblings and verifies that a second `link_bins` pass re-creates them. Verified by temporarily re-breaking the gate and confirming the new test fails, then reverting.

Findings 3, 4, 5, 9 are pending review acceptance; they'll land in a follow-up commit.
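A hedged sketch of the separator-preserving split for #1; the helper name and surrounding shape are illustrative:

```rust
// Split "<prog><ws><args>" so args keep their leading whitespace,
// mirroring the upstream regex group `(.*)$`.
fn split_shebang(rest: &str) -> (&str, &str) {
    match rest.find([' ', '\t']) {
        Some(idx) => rest.split_at(idx), // separator stays in the args slice
        None => (rest, ""),
    }
}

fn main() {
    // "#!/bin/sh -e": prog "/bin/sh", args " -e" (leading space preserved,
    // so the rendered shim emits two spaces, matching upstream).
    assert_eq!(split_shebang("/bin/sh -e"), ("/bin/sh", " -e"));
}
```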
Okay! So! Here I am!
The very first thing I want to say is, this code was already really great. If you were a random beginner who asked me to look at their code, I would say "great job" and not offer any of this feedback unless they were looking to actively level up at Rust. And while we don't know each other well, I do know that you're an experienced systems programmer, and looking to level up your Rust.
So I went a little extra. Sorry not sorry.
Each of these commits tells a story, in order. I left detailed commit messages. Some code gets re-written multiple times, but each step should be instructive, so if you only look at the final diff, you won't learn everything. Please take what you want, and throw away what you don't. And I am happy to discuss anything about any of this in detail.
This looks like a fun project!