Skip to content

add version-pin#4

Merged
anonrig merged 10 commits intomainfrom
version-pin
Jul 13, 2023
Merged

add version-pin#4
anonrig merged 10 commits intomainfrom
version-pin

Conversation

@anonrig
Copy link
Copy Markdown
Member

@anonrig anonrig commented Jul 13, 2023

No description provided.

@anonrig anonrig requested a review from steveklabnik July 13, 2023 21:11
@anonrig anonrig merged commit a7d4512 into main Jul 13, 2023
@anonrig anonrig deleted the version-pin branch July 13, 2023 21:45
zkochan added a commit that referenced this pull request Apr 24, 2026
Two perf fixes from investigations/pacquet-macos-perf.md that both
collapse redundant passes over the tarball bytes.

1. Stream the HTTP body through the gzip decoder (doc fix #4).
   Replace `response.bytes().await` + `decompress_gzip` (sync
   zune-inflate) + separate `Integrity::check(&buffer)` with a
   streaming pipeline: `bytes_stream()` teed through
   `ssri::IntegrityChecker` into `async_compression::tokio::write::
   GzipDecoder`. This removes one full pass over the compressed body
   (no separate `.check()`), never holds compressed + decompressed
   buffers at the same time, and lets decompression overlap with
   network awaits. Integrity is still verified end-to-end before any
   CAFS blob is written, so a mismatch aborts the install with the
   same `TarballError::Checksum` behavior as before.

2. Interleave SHA-512 with the per-entry read (doc fix #2). Add
   `StoreDir::write_cas_file_prehashed`; the tar extraction loop
   reads + hashes 64 KiB chunks in one pass instead of
   `read_to_end` followed by a separate `Sha512::digest(buffer)`
   inside `write_cas_file`. The buffer still has to exist (CAFS
   filename is hash-derived), but two passes over the decompressed
   entry bytes collapse into one. Mirrors pnpm's `parseTarball` +
   `addBufferToCafs` in `store/cafs/src/addFilesFromTarball.ts`.

Why we diverge from pnpm on (1): upstream
`fetching/tarball-fetcher/src/remoteTarballFetcher.ts` +
`worker/src/start.ts` + `store/cafs/src/addFilesFromTarball.ts`
buffers the whole tarball into a Node `SharedArrayBuffer` and runs
`crypto.hash` + `zlib.gunzipSync` sequentially. Observable behavior
here is identical — same integrity errors on mismatch, same
decompressed bytes, same CAFS layout, same `index.db` rows — so per
the "Internal performance divergence is allowed" clause added in the
preceding commit, the streaming topology is a pacquet-specific
implementation-detail optimization rather than a shape port.

Dep changes: swap `zune-inflate` for `async-compression` (gzip +
tokio features) + `futures-util`; enable reqwest's `stream` feature;
add `sha2` to the tarball crate.

Tests: existing `packages_under_orgs_should_work` (real-network
end-to-end) and `should_throw_error_on_checksum_mismatch` both pass
unchanged, proving the streaming pipeline produces the same
decompressed bytes and fails integrity checks the same way.
zkochan added a commit that referenced this pull request Apr 24, 2026
Two perf fixes from investigations/pacquet-macos-perf.md that both
collapse redundant passes over the tarball bytes.

1. Stream the HTTP body through the gzip decoder (doc fix #4).
   Replace `response.bytes().await` + `decompress_gzip` (sync
   zune-inflate) + separate `Integrity::check(&buffer)` with a
   streaming pipeline: `bytes_stream()` teed through
   `ssri::IntegrityChecker` into `async_compression::tokio::write::
   GzipDecoder`. This removes one full pass over the compressed body
   (no separate `.check()`), never holds compressed + decompressed
   buffers at the same time, and lets decompression overlap with
   network awaits. Integrity is still verified end-to-end before any
   CAFS blob is written, so a mismatch aborts the install with the
   same `TarballError::Checksum` behavior as before.

2. Interleave SHA-512 with the per-entry read (doc fix #2). Add
   `StoreDir::write_cas_file_prehashed`; the tar extraction loop
   reads + hashes 64 KiB chunks in one pass instead of
   `read_to_end` followed by a separate `Sha512::digest(buffer)`
   inside `write_cas_file`. The buffer still has to exist (CAFS
   filename is hash-derived), but two passes over the decompressed
   entry bytes collapse into one. Mirrors pnpm's `parseTarball` +
   `addBufferToCafs` in `store/cafs/src/addFilesFromTarball.ts`.

Why we diverge from pnpm on (1): upstream
`fetching/tarball-fetcher/src/remoteTarballFetcher.ts` +
`worker/src/start.ts` + `store/cafs/src/addFilesFromTarball.ts`
buffers the whole tarball into a Node `SharedArrayBuffer` and runs
`crypto.hash` + `zlib.gunzipSync` sequentially. Observable behavior
here is identical — same integrity errors on mismatch, same
decompressed bytes, same CAFS layout, same `index.db` rows — so per
the "Internal performance divergence is allowed" clause added in the
preceding commit, the streaming topology is a pacquet-specific
implementation-detail optimization rather than a shape port.

Dep changes: swap `zune-inflate` for `async-compression` (gzip +
tokio features) + `futures-util`; enable reqwest's `stream` feature;
add `sha2` to the tarball crate.

Tests: existing `packages_under_orgs_should_work` (real-network
end-to-end) and `should_throw_error_on_checksum_mismatch` both pass
unchanged, proving the streaming pipeline produces the same
decompressed bytes and fails integrity checks the same way.
zkochan added a commit that referenced this pull request Apr 25, 2026


- Move `configure_rayon_pool` to after `CliArgs::parse()`. clap's
  `--help` / `--version` paths exit inside `parse`, so they never
  reach the rayon pool init now — `pacquet --help` no longer
  spawns worker threads it'll throw away. Other subcommand paths
  still pay the (small) init cost; we can move it into the install
  path specifically as a follow-up if non-install commands are
  ever added that don't need rayon.
- Drop the local `CasPathsArc` / `WarmEntry` aliases in
  `create_virtual_store::run`. They duplicated the value type of
  `pacquet_tarball::PrefetchedCasPaths`; if that map's value type
  changes, the local aliases would silently drift. Letting `warm`'s
  element type be inferred from its `push` calls keeps the binding
  to the upstream type.
- Add three unit tests for `prefetch_cas_paths`:
    * `prefetch_cas_paths_returns_hits_for_live_index_rows` — happy
      path: index row written, CAFS blob present, verify on; result
      contains the expected file map.
    * `prefetch_cas_paths_omits_failed_integrity_entries` — index
      row whose digest matches no on-disk file; verify on; row is
      dropped from the result rather than surfacing a half-populated
      map.
    * `prefetch_cas_paths_skips_filesystem_checks_when_verify_disabled`
      — same shape as the failing-integrity test but with
      `verify_store_integrity = false`; the row surfaces (no fs
      checks), matching pnpm's `verify-store-integrity: false`
      semantics.
@KSXGitHub KSXGitHub mentioned this pull request Apr 30, 2026
KSXGitHub pushed a commit that referenced this pull request Apr 30, 2026
#4 (cmd-shim — chmod target swallowed all errors): the post-write
\`ensure_executable_bits\` call used \`let _ = ...\` which dropped
\`PermissionDenied\`, \`EROFS\`, AppArmor deny and other real failures
alongside the intentionally-ignored \`NotFound\`. Match \`NotFound\`
explicitly and propagate everything else as \`LinkBinsError::Chmod\`,
mirroring pnpm's \`fixBin\` ENOENT guard. Adds two DI-driven
regression tests:

- \`link_bins_propagates_target_chmod_error_via_di\` — \`PermissionDenied\`
  must surface as \`Chmod\`.
- \`link_bins_swallows_target_chmod_not_found_via_di\` — \`NotFound\`
  must remain swallowed.

#5 (package-manager — peer-resolved slot name parsing): \`find_slot_
own_package_dir\` used \`slot_name.rfind('@')\` to strip the version
tail. That works for \`parent@1.0.0\` and \`@scope+parent@1.0.0\` but
breaks for peer-resolved slots: \`to_virtual_store_name\` produces
shapes like \`ts-node@10.9.1_@types+node@18.7.19_typescript@5.1.6\`
where the last \`@\` is inside a peer's version, not at the
package-name boundary. \`rfind\` then split inside the peer spec, the
slot's own package wasn't found, and the slot was silently skipped —
bins of children of every peer-resolved slot were never linked.

Parse from the left instead, skipping a single leading \`@\` that
belongs to a scoped package. The package-name half can never contain
\`@\` after \`to_virtual_store_name\`'s scope normalization (\`/\` →
\`+\`), so the first non-leading \`@\` is the right boundary. Adds
\`link_virtual_store_bins_handles_peer_resolved_slot_name\` using the
exact \`ts-node@10.9.1_@types+node@18.7.19_typescript@5.1.6\` shape
verified by \`pacquet_lockfile::pkg_name_ver_peer::tests::to_virtual_
store_name\`.

Both fixes verified by re-breaking the relevant code path and
confirming the regression tests fail, then reverting.

Drive-by from #9 (style nit on the chain split in
\`SymlinkDirectDependencies::run\`): the chain was broken in the
earlier \`feat(cmd-shim): port directories.bin discovery and Windows
shim formats\` commit (dfdd33a) because \`direct_deps\` is consumed
twice — once by \`symlink_direct_deps_into_node_modules\`, once by
\`link_direct_dep_bins\`'s \`dep_names\` derivation. That can't sit
inside a single chain. Documenting it here in lieu of amending the
earlier commit.

Test count: cmd-shim 70 → 72, package-manager (lib) 38 → 39.
@KSXGitHub KSXGitHub mentioned this pull request May 7, 2026
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant