
Some feedback #2

Merged
anonrig merged 23 commits into pnpm:main from steveklabnik:suggestions
Jul 13, 2023
Conversation

@steveklabnik
Collaborator

Okay! So! Here I am!

The very first thing I want to say is, this code was already really great. If you were a random beginner who asked me to look at their code, I would say "great job" and not offer any of this feedback unless they were looking to actively level up at Rust. And while we don't know each other well, I do know that you're an experienced systems programmer, and looking to level up your Rust.

So I went a little extra. Sorry not sorry.

Each of these commits tells a story, in order. I left detailed commit messages. Some code gets re-written multiple times, but each step should be instructive, so if you only look at the final diff, you won't learn everything. Please take what you want, and throw away what you don't. And I am happy to discuss anything about any of this in detail.

This looks like a fun project!

@steveklabnik
Collaborator Author

Ooh, this will have to be rebased on top of #1. I am happy to do that! But since it will involve changing stuff sort of in the middle of the series, I'm going to wait until you review what's here, and then if you have any feedback, I can handle both the rebase and the feedback at the same time.

@steveklabnik
Collaborator Author

(After reading over that code, I have some stuff that will make your life easier over there too. But again, I'll wait for that.)

Member

@anonrig anonrig left a comment


Amazing work. Thank you for your feedback. I had some questions, but I'm very happy to see this. One of my last questions: why does .context only exist in cli and not in the other crates? Should they share the same behavior?

Comment thread crates/registry/src/lib.rs Outdated
Comment thread crates/registry/src/lib.rs
Comment thread crates/registry/src/lib.rs Outdated
Comment thread crates/registry/src/error.rs Outdated
Comment thread crates/registry/src/lib.rs
Comment thread crates/cli/src/main.rs
Comment thread crates/cli/src/lib.rs
Comment thread crates/tarball/src/lib.rs Outdated
Comment thread crates/tarball/src/lib.rs Outdated
Comment thread crates/tarball/src/lib.rs
@steveklabnik
Collaborator Author

I replied to a few things but I have to run; I'll put more later and update the PR as well.

@steveklabnik
Collaborator Author

Left more comments, I am going to rebase and add more commentary on the stuff from #1 tomorrow.

@rluvaton

I love open source!

Comment thread crates/registry/src/error.rs Outdated
Previously, this code said "Hey, I will take anything that can be
converted into a &Path," and now it says "Hey I can take anything that
can be turned into a PathBuf."

The former is not *wrong*, but it is a bit roundabout. You want to
store a PathBuf, so just take anything that can be turned into a
PathBuf. Not only is this a little simpler, it can also save you an
allocation, at least conceptually; I don't know if rustc is smart
enough to remove the allocation in code like this:

```rust
let path = PathBuf::from("/etc/passwd");
let manager = RegistryManager::new(&path);
```

With the previous version of the code, this would convert the PathBuf
into a &Path by taking a pointer to it, and then create a whole new
PathBuf out of it, by allocating. This way, we can still accept the
same inputs, but if we already have a PathBuf, we skip the roundtrip
through a pointer. In other words, you can pass `path` directly in the
code above, whereas you couldn't with the original version.
The *only* thing &String can meaningfully do over &str is examine the
capacity of the underlying string, and you aren't doing that. This
signature is more general, and more idiomatic.
This code works via a feature called "Deref coercion." Basically:

If T implements Deref<Target = U>, and x is a value of type T, then &T
is coerced into &U.

This means we can pass a &String to a function expecting a &str, and it
will Just Work, because String implements Deref<Target=str>.

This is also why the previous commit works and is more general; maybe I
should have said this there!
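A tiny self-contained illustration of that coercion (the function here is made up for the example):

```rust
// A function taking the more general &str.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned: String = String::from("hello world");
    // &String coerces to &str because String implements Deref<Target = str>.
    assert_eq!(first_word(&owned), "hello");
    // A string literal is already a &str, so it also works directly.
    assert_eq!(first_word("foo bar"), "foo");
}
```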
Captured identifiers in format strings are a relatively new addition
to Rust (stabilized in 1.58), so that's why you probably didn't find
them.

Also, when creating a String from a string literal, standard style is to
use .to_string, and there's a pretty significant minority style (which I
personally prefer) to use String::from, but given that you're doing
concatenation, this would be what I would personally consider idiomatic
Rust post-1.58.
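For example, a quick sketch of the pre- and post-1.58 styles side by side (the names are made up):

```rust
fn main() {
    let name = "tarball";
    // Pre-1.58 style: positional arguments.
    let old_style = format!("{}.tgz", name);
    // Post-1.58 style: identifiers captured directly in the format string.
    let new_style = format!("{name}.tgz");
    assert_eq!(old_style, new_style);
    assert_eq!(new_style, "tarball.tgz");
}
```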
Especially when working with filesystems, code that vaguely looks like
this:

```
if !a_file.exists() {
	do_something_to_create_file()
}
```

has a bug, of varying degrees of severity. This is called "Time-of-check
to time-of-use," and the problem is roughly:

```
if !a_file.exists() {
	// what happens if we get here, but before
	// do_something_to_create_file is called, we
	// swap to another thread, and another process creates
	// a_file, possibly by doing something fancy with symlinks
	do_something_to_create_file()
}
```

I don't think there is an actual problem here, probably, but it's better
just to "confidently" call these functions, rather than try and get
fancy. That way the creation is "atomic," if you will.

For more: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

Now, the commit line says "partially" because

```
if !extract_destination.exists() {
    let _ = download_and_extract(
```

This is doing the same thing, but modifying all of that code to also be
okay with being called this way is a larger lift, and you don't have any
tests, so I am nervous about breaking something :)
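A std-only sketch of the "confidently call it" pattern, using create_dir_all as the creating function (the directory name is made up):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Instead of checking exists() first, call the creating function
// unconditionally and let it handle "already there" itself.
fn ensure_dir(path: &Path) -> io::Result<()> {
    // create_dir_all succeeds when the directory already exists,
    // so there is no check-then-act window for another process to race.
    fs::create_dir_all(path)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("toctou-demo");
    ensure_dir(&dir)?;
    // Calling it again is fine: no TOCTOU race, no error.
    ensure_dir(&dir)?;
    assert!(dir.exists());
    Ok(())
}
```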
You're already returning Result, best to not panic.

I decided to try and keep this in the same style you currently have,
with a String living in the error type, but you could also just stash
the io::Error in there too.

Also, I didn't make this change in this diff because it is up to you,
but I would not make a new module for the error, I would put it in
lib.rs. You're not really buying anything by moving it in there, just
making yourself import it when you don't have to. Up to you though!
you were even mapping these to a Result, only to call unwrap on them!
as your application grows here, I think you'll come to appreciate
anyhow. A common suggestion is "thiserror for libraries, anyhow for
applications," and so we're following that. You're effectively putting
main in the library, which isn't bad, but that's why this is not
*strictly* following the letter of that advice, but instead, the spirit.

.context will give you output that looks like this:

```
Error: current directory should exist

Caused by:
    No such file or directory (os error 2)
```

I also tweaked the error messages, since the old ones implied that not
existing is the only possible issue. For example, maybe the problem is
permissions, in which case, the error would look like this:

```
Error: current directory should exist

Caused by:
    Permission denied (os error 13)
```

which is uh, wrong. So as of this commit, this will instead say

```
Error: problem fetching current directory

Caused by:
    Permission denied (os error 13)
```

or

```
Error: problem fetching current directory

Caused by:
    No such file or directory (os error 2)
```

The other two messages are already pretty good, I'd say.
I think this looks nicer. 100% subjective, of course :)
Okay so this is, in my mind, slightly better error handling. Instead of
erasing everything to a String, you're actually holding onto the various
error types. This would allow your users to handle errors in a more
granular way if they wished.

I struggled a bit with naming here. Network IO is still IO, so these
feel a bit redundant. However, I suspect separating the two does make
sense, as what you'd do to fix them is different. And merging them
into one grand IO error is actually a huge hassle and not helpful
for users. So I kept it this way. It's up to taste, really.
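A std-only sketch of what "holding onto the various error types" can look like, written by hand without thiserror so it is self-contained (the variant split is illustrative):

```rust
use std::fmt;

#[derive(Debug)]
pub enum TarballError {
    // Each variant keeps the real underlying error instead of a String.
    Network(std::io::Error),
    Io(std::io::Error),
}

impl fmt::Display for TarballError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TarballError::Network(e) => write!(f, "network error: {e}"),
            TarballError::Io(e) => write!(f, "io error: {e}"),
        }
    }
}

impl std::error::Error for TarballError {
    // Callers can walk .source() to reach the original io::Error.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            TarballError::Network(e) | TarballError::Io(e) => Some(e),
        }
    }
}

fn main() {
    let e = TarballError::Io(std::io::Error::new(
        std::io::ErrorKind::NotFound,
        "gone",
    ));
    // The cause is still there for granular handling.
    assert!(std::error::Error::source(&e).is_some());
}
```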
as written, this would look stuff up twice; instead, let's do that
once.
We don't need an owned version of this to call this method.
Given that you're exporting this enum, you have a choice to make:

1. never add another variant unless you want to increase your semver
   major version
2. make it a non-exhaustive enum

The latter is a good idea for errors, generally.
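A small illustration of option 2 (the enum and variant names are made up):

```rust
// #[non_exhaustive] lets you add variants later without a semver-major
// bump, because matches outside the defining crate must include a
// wildcard arm.
#[non_exhaustive]
#[derive(Debug)]
pub enum RegistryError {
    MissingLatestTag,
    Network,
}

fn describe(e: &RegistryError) -> &'static str {
    match e {
        RegistryError::MissingLatestTag => "no latest tag",
        RegistryError::Network => "network failure",
        // Downstream crates are forced to write this arm, so any
        // future variant lands here instead of breaking their build.
        _ => "unknown error",
    }
}

fn main() {
    assert_eq!(describe(&RegistryError::Network), "network failure");
}
```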
So, because we do some work in the conversion of these error messages,
*technically* this way is slightly more efficient. You see, with or, we
unconditionally also run the "Generate the error" code. But with
or_else, we only invoke that code if we actually hit the error here. So
in effect, doing this moves the allocation to the error path only, which
is nice.
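The same eager-vs-lazy distinction shows up across the method pairs in std (or/or_else, ok_or/ok_or_else, unwrap_or/unwrap_or_else); a small illustration using ok_or vs ok_or_else:

```rust
fn main() {
    // Eager: the error String is allocated unconditionally,
    // even though we are on the success path here.
    let eager: Result<i32, String> = Some(1).ok_or("missing value".to_string());
    // Lazy: the closure only runs if we actually hit None,
    // so the allocation moves to the error path.
    let lazy: Result<i32, String> = Some(1).ok_or_else(|| "missing value".to_string());
    assert_eq!(eager, lazy);

    // On the error path, the closure does run and we get the message.
    let err: Result<i32, String> = None.ok_or_else(|| "missing value".to_string());
    assert!(err.is_err());
}
```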
Turns out that now that we're using or_else, clippy recognizes that
there's a better method on Result, map_err, for this task.
I really prefer using at least unwrap here. If this never happens,
they're equivalent, but if it does happen, with the old code, you
silently continue, and with the new code, you blow up.
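A minimal contrast of the two behaviors (the directory name is made up):

```rust
fn main() {
    let dir = std::env::temp_dir().join("expect-demo");
    // Old style: the Result is discarded, so a failure is silently
    // ignored and the program limps on with a missing directory.
    let _ = std::fs::create_dir_all(&dir);
    // Preferred: if this "can't happen" case does happen, blow up
    // loudly with a message stating what we expected to hold.
    std::fs::create_dir_all(&dir).expect("temp dir should be writable");
    assert!(dir.exists());
}
```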
Now that we've implemented From, this is no longer needed. Nice catch
@anonrig!
Comment thread crates/registry/src/error.rs Outdated
@steveklabnik
Collaborator Author

I have rebased, made sure each commit builds and passes cargo fmt. I am not gonna roll suggestions for #1 into this PR because I'd prefer to land it and start a new one rather than deal with all of these commits :)

One of my last questions: why does .context only exist in cli and not in the other crates? Should they share the same behavior?

They should not. The answer is kind of long and philosophical. Sorry about that.

context is in anyhow. "anyhow for binaries, thiserror for libraries". But why is that the advice?

binaries tend to catch errors, and libraries tend to produce errors. everything else derives from this.

as a binary, your job is to:

  1. catch any errors that happen
  2. either do something to fix the error (like "retry")
  3. or show the error to the user

as a library, your job is to:

  1. return an error to some code above you
  2. at the public API of your library, give users who are using your library the information they need to handle the error, aka steps 2 and 3 above

So as a library, it's our job to sort out that API. and then present exactly this. so you figure out the broader structure of your errors, and then internally, convert things into those errors where appropriate, and return that error. we're passing these things around so we care about efficiency. this is what thiserror does, it helps you define these sorts of errors, and gives you nice tooling to do so.

as a binary, we take those errors, but generally either do something right away, or throw the error upstream until we print out some information about it. so we tend to care about if we can do something other than blow up at an error, and if we do decide to blow up, a way to figure out where the error came from within our applications' code. also because we are blowing up we don't really care about efficiency; this is happening once and is a cold path. this is what anyhow gives you: a Result type that is super easy to use, and methods for it that help you describe why you're throwing this error upstream.

this is what context is; it produces a one-off error type, in a sense. and that's what we want in our program: a list of contexts of where the error came from. but at some point it comes from some library, where we don't bother reporting more deeply to the user, because that doesn't make sense. the CLI doesn't care specially about the internals of the registry management code, it cares only that something bad happened in there. if the library used .context everywhere, it would not provide us with such useful errors, because all we'd have is "some sort of error description here" and not a rich set of types we can handle. it would also over-allocate and be slower.

... does all of that make sense? it's a vibe thing on some level for sure.

thiserror provides a way to generate From implementations. This commit
demonstrates using this feature. Nice catch, @weihanglo!
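What that generated code amounts to, written by hand with std only so the sketch is self-contained (the enum and function are illustrative; thiserror's #[from] attribute produces the equivalent From impl for you):

```rust
#[derive(Debug)]
enum TarballError {
    Io(std::io::Error),
}

// This is the impl that thiserror's #[from] would generate.
impl From<std::io::Error> for TarballError {
    fn from(e: std::io::Error) -> Self {
        TarballError::Io(e)
    }
}

// With From in place, `?` converts the io::Error automatically.
fn read(path: &str) -> Result<String, TarballError> {
    Ok(std::fs::read_to_string(path)?)
}

fn main() {
    assert!(read("/definitely/not/a/real/path").is_err());
}
```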
@steveklabnik
Collaborator Author

cargo deny is failing simply because this is the first time that it's actually being run, because this is the first PR that touches Cargo.toml. If you actually run the command on any of the commits, it always fails.

I do not know how you want to configure the tool, so if you could either add it or let me know, that would get this green and ready to go. :)

@anonrig
Member

anonrig commented Jul 13, 2023

I think this is ready to go! Thanks!

@anonrig anonrig merged commit 3f6e0ae into pnpm:main Jul 13, 2023
@nicoddemus

From a Rust newbie, this was enlightening, great work @steveklabnik! 😁

Also congrats to @anonrig on the project. 👍

@steveklabnik steveklabnik deleted the suggestions branch July 13, 2023 22:14
@coreyja

coreyja commented Jul 16, 2023

Echoing what others said! This PR and explanations were great @steveklabnik and thanks @anonrig for sharing the repo so it could happen here!

I did a stream today where I walked through the PR and discussed the changes and each commit. Basically just a video walkthrough of this PR if anyone is interested:
https://www.twitch.tv/videos/1873539226

zkochan added a commit that referenced this pull request Apr 24, 2026
Two perf fixes from investigations/pacquet-macos-perf.md that both
collapse redundant passes over the tarball bytes.

1. Stream the HTTP body through the gzip decoder (doc fix #4).
   Replace `response.bytes().await` + `decompress_gzip` (sync
   zune-inflate) + separate `Integrity::check(&buffer)` with a
   streaming pipeline: `bytes_stream()` teed through
   `ssri::IntegrityChecker` into `async_compression::tokio::write::
   GzipDecoder`. This removes one full pass over the compressed body
   (no separate `.check()`), never holds compressed + decompressed
   buffers at the same time, and lets decompression overlap with
   network awaits. Integrity is still verified end-to-end before any
   CAFS blob is written, so a mismatch aborts the install with the
   same `TarballError::Checksum` behavior as before.

2. Interleave SHA-512 with the per-entry read (doc fix #2). Add
   `StoreDir::write_cas_file_prehashed`; the tar extraction loop
   reads + hashes 64 KiB chunks in one pass instead of
   `read_to_end` followed by a separate `Sha512::digest(buffer)`
   inside `write_cas_file`. The buffer still has to exist (CAFS
   filename is hash-derived), but two passes over the decompressed
   entry bytes collapse into one. Mirrors pnpm's `parseTarball` +
   `addBufferToCafs` in `store/cafs/src/addFilesFromTarball.ts`.

Why we diverge from pnpm on (1): upstream
`fetching/tarball-fetcher/src/remoteTarballFetcher.ts` +
`worker/src/start.ts` + `store/cafs/src/addFilesFromTarball.ts`
buffers the whole tarball into a Node `SharedArrayBuffer` and runs
`crypto.hash` + `zlib.gunzipSync` sequentially. Observable behavior
here is identical — same integrity errors on mismatch, same
decompressed bytes, same CAFS layout, same `index.db` rows — so per
the "Internal performance divergence is allowed" clause added in the
preceding commit, the streaming topology is a pacquet-specific
implementation-detail optimization rather than a shape port.

Dep changes: swap `zune-inflate` for `async-compression` (gzip +
tokio features) + `futures-util`; enable reqwest's `stream` feature;
add `sha2` to the tarball crate.

Tests: existing `packages_under_orgs_should_work` (real-network
end-to-end) and `should_throw_error_on_checksum_mismatch` both pass
unchanged, proving the streaming pipeline produces the same
decompressed bytes and fails integrity checks the same way.
zkochan added a commit that referenced this pull request Apr 24, 2026
Fix #2 from `investigations/pacquet-macos-perf.md` turned out to
regress wall-time at 0ms loopback by ~2% across three successive
CI runs on #278, even after controlling for hyperfine-ordering
bias (the previous commit swapped `HEAD main` → `main HEAD`; main
stayed ahead by 2%, so the bias hypothesis explains only ~half the
gap). Fix #2 is the most likely culprit — it hits every tar entry
(tens of thousands across the 1352-snapshot fixture) and replaces
an optimized `Vec::read_to_end` + single `Sha512::digest` with a
chunked read loop that does per-chunk `try_reserve`,
`hasher.update`, and `extend_from_slice`. The streaming
per-chunk branches add overhead that the saved second pass doesn't
recover at the scale of small npm entries (typical entry body
is a few KiB — way below where a second scan of the already-hot
cache starts mattering).

Revert to the pre-PR shape on the entry loop:

    let mut buffer = Vec::new();
    buffer.try_reserve(prealloc_hint).map_err(...)?;
    entry.read_to_end(&mut buffer).map_err(...)?;
    let (file_path, file_hash) = store_dir
        .write_cas_file(&buffer, file_is_executable)
        .map_err(...)?;

Remove `write_cas_file_prehashed` and the private
`write_cas_file_unchecked` helper from `StoreDir` — nothing calls
them now. Also drop the two prehashed-specific tests
(`…_parity_with_write_cas_file` and
`…_debug_asserts_hash_matches_buffer`) and the tarball crate's
`sha2` dep, since the one-shot `write_cas_file` on
`pacquet-store-dir` is the only consumer of SHA-512 once more.

What's still left in the PR after this revert:

* compressed-side tee-hash: `bytes_stream()` + per-chunk
  `IntegrityChecker::input` eliminates the separate
  `Integrity::check(&buffer)` pass after buffering. Chunk count
  here is tiny (a handful per tarball) so per-chunk overhead is
  negligible, and the payoff (skipping one full SHA-512 over the
  compressed body) scales with tarball size.
* integrity check ordering before decompression, matching pnpm's
  `crypto.hash` → `gunzipSync` sequence.
* AGENTS.md's "internal perf divergence is allowed" clause.

Next CI bench will tell us whether the remaining tee-hash alone
is clean or whether the regression persists and the PR should
close.
zkochan added a commit that referenced this pull request Apr 25, 2026
- Extract `snapshot_cache_key` helper for the `(integrity, pkg_id)`
  → cache-key derivation. Both call sites (the upfront
  `cache_keys` build and the warm/cold partition loop) now go
  through the same helper, so a future change to the resolution-type
  handling or key shape stays in one place. A drift between the two
  loops would silently misclassify warm entries as cold and quietly
  halve install speed.
- Document the intentional `.max(4)` floor in `configure_rayon_pool`.
  Even on a quota-limited 1-2-CPU runner, dropping below 4 puts
  rayon back into the regime where one thread blocks on `clonefile`
  while the next ready snapshot can't start; the metadata-journal
  bottleneck doesn't shrink with quota, so a small intentional
  oversubscription is the better trade. Both the function-level doc
  and an inline comment by the `.max(4)` call explain the choice.
@KSXGitHub KSXGitHub mentioned this pull request Apr 30, 2026
KSXGitHub pushed a commit that referenced this pull request Apr 30, 2026
#1 (parse_shebang args leading whitespace): `splitn(2, [' ', '\t'])`
discarded the whitespace separator, so an upstream-format `#!/bin/sh -e`
parsed to `args="-e"` and the rendered shim text emitted one space
between prog and the first flag where upstream emits two. Replace
`splitn` with a manual `find([' ', '\t'])` + `split_at` that keeps the
separator in the args slice, matching upstream's regex group `(.*)$`.
Update `parses_direct_shebang` and `parses_env_dash_s_shebang` to
assert the leading-space form so the regression direction is pinned.

#2 (idempotent skip checked only `.sh` but skipped writes for all
three): if a previous install wrote `.sh` correctly but `.cmd`/`.ps1`
were missing — older pacquet, partial-write crash — `already_correct`
short-circuited and the upgrade path never repaired the siblings.
Gate `already_correct` on the `.sh` marker AND the existence of both
`.cmd` and `.ps1`. Add `link_bins_rewrites_when_only_sh_flavor_exists`
which deletes the two siblings and verifies the second `link_bins`
pass re-creates them. Verified by re-breaking the gate temporarily
and confirming the new test fails, then reverting.

Findings 3, 4, 5, 9 are pending review acceptance; they'll land in a
follow-up commit.
@KSXGitHub KSXGitHub mentioned this pull request May 7, 2026