Tritonadm nocloud import #27

Open
nshalman wants to merge 37 commits into main from tritonadm-nocloud-import

@nshalman commented May 5, 2026

[root@headnode (coal) ~]# tritonadm image fetch-nocloud --vendor alpine --release latest --target imgapi
Fetching Alpine releases.json ...
Downloading nocloud_alpine-3.23.4-x86_64-uefi-cloudinit-r0.qcow2
  URL: https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/cloud/nocloud_alpine-3.23.4-x86_64-uefi-cloudinit-r0.qcow2
Hashing source image ...
Fetching https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/cloud/nocloud_alpine-3.23.4-x86_64-uefi-cloudinit-r0.qcow2.sha512
Checksum OK (sha512): 849196e26640b33fb1f106d46cebdf341c715f7f49363564eacece8097464b6af7041618e94c4eb1a006394663252a86c0fb3158fee5d0ffb094899730679a71
Creating zvol: zones/tritonadm-nocloud-63a7870c-b7bd-4d3b-84ba-e8ca3420e703 (214 MiB virtual)
Writing image to zvol (224395264 bytes from qcow2) ...
Snapshotting zvol ...
Exporting ZFS stream → /var/tmp/tritonadm/nocloud/image/alpine-3.23/alpine-3.23-3.23.4.x86_64.zfs ...
Compressing image ...
Destroying zvol: zones/tritonadm-nocloud-63a7870c-b7bd-4d3b-84ba-e8ca3420e703

Build complete.
  Image:    /var/tmp/tritonadm/nocloud/image/alpine-3.23/alpine-3.23-3.23.4.x86_64.zfs.gz
  Manifest: /var/tmp/tritonadm/nocloud/image/alpine-3.23/alpine-3.23-3.23.4.json
  UUID:     2eb2ec14-55c6-5795-8ff6-710617f90e97

Importing image manifest 2eb2ec14-55c6-5795-8ff6-710617f90e97...
Imported: alpine-3.23-nocloud v3.23.4
Uploading image file...
Image file uploaded.
Activating image...
Image 2eb2ec14-55c6-5795-8ff6-710617f90e97 imported and activated.
[root@headnode (coal) ~]# imgadm avail | grep alpine
2eb2ec14-55c6-5795-8ff6-710617f90e97  alpine-3.23-nocloud             3.23.4                                        linux    zvol          2026-05-06

nshalman and others added 24 commits May 5, 2026 15:27
Adds a new `tritonadm image fetch-nocloud --vendor <name> --release <token>`
subcommand that fetches a CloudInit nocloud image from an upstream
vendor and converts it into a SmartOS/Triton zvol image + IMGAPI
manifest, in-process — no `qemu-img` dependency.

This is the Rust translation of the bash pipeline in
`target/triton-nocloud-images/build.sh`, with the goal of letting
SmartOS hosts ingest stock vendor images on demand instead of receiving
ones we have repackaged. POC ships Ubuntu (noble/jammy/focal/oracular,
plus `latest`); other vendors follow the same `VendorProfile` trait
shape.

Pipeline: download → TLS-fetched SHA256SUMS verify → open qcow2 in
memory via the `qcow` crate → create zvol of the qcow's actual virtual
disk size → stream decoded clusters into `/dev/zvol/rdsk/<ds>` → snap →
send → gzip → typed IMGAPI manifest. Two `indicatif` progress bars
(download + zvol write).

Zone-aware: GZ defaults to the `zones` parent dataset; NGZ requires a
delegated dataset (`zones/<zone>/data` with `zoned=on`) and bails
otherwise. `--dataset` overrides for either.

Design rationale and follow-ups in
`docs/design/tritonadm-nocloud-import.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CWD-relative defaults are brittle when running in the GZ from a
read-only or unexpected directory. Move both --workdir and
--output-dir defaults to a stable absolute location under
/var/tmp/tritonadm/nocloud/{cache,image}/<vendor>-<series>/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Ubuntu vendor profile now consults
https://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:download.json
(the same machine-readable feed cloud-init / MAAS / OpenStack
consume) to resolve `latest` and named codenames. This replaces the
hardcoded series table — though the table is kept as an air-gapped
fallback if streams is unreachable.

Three wins over the hardcoded path:

- `latest` is self-updating: when a new LTS ships, `--release latest`
  picks it up with no tool update.
- The manifest `version` is the canonical upstream build serial
  (e.g. `20260321`) instead of today's date, so two runs against the
  same upstream produce identical manifest versions.
- The streams JSON includes the sha256, so the verifier is now
  `Sha256Pinned` from one TLS roundtrip rather than a second roundtrip
  to fetch `SHA256SUMS`.

6 new unit tests against a small fixture cover latest-LTS selection,
codename and version-token resolution, non-LTS skipping, and URL
construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Manifest UUIDs are now `v5(NAMESPACE, source_image_sha256_hex)`, where
NAMESPACE is itself a stable v5 UUID derived from the URL
`https://tritondatacenter.com/tritonadm/nocloud`. Two runs against
the same upstream image produce the same manifest UUID, regardless of
when or where they run, which lets IMGAPI dedupe correctly.

The `Verifier` trait signature changes accordingly: it now takes a
precomputed sha256 hex string instead of a path, since the pipeline
needs the hash for both verification and UUID derivation and we don't
want to hash a 600 MB file twice.

Adds the `v5` feature to the workspace `uuid` dependency. Three new
unit tests cover stability, distinctness, and version-tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four robustness improvements for the nocloud build pipeline:

1. **Recognizable dataset prefix.** Build zvols are now named
   `<parent>/tritonadm-nocloud-<uuid>` instead of `<parent>/<uuid>`,
   so `zfs list | grep tritonadm-nocloud` is unambiguous and the next
   improvement can scope itself safely.

2. **Startup sweep.** Before creating a new build zvol, list the
   parent dataset for any leftover `tritonadm-nocloud-*` children
   from a previous interrupted run (SIGKILL, crash, host reboot)
   and destroy them.

3. **SIGINT handler.** A spawned task watches for Ctrl-C and sets a
   shared cancel flag. The download loop, the qcow→zvol copy loop,
   and the verifier all check this flag and bail cleanly, which lets
   the normal cleanup path (zfs destroy of the in-flight dataset)
   run before exit. Child shellouts (zfs send, gzip) inherit our
   process group so they receive SIGINT directly from the TTY.

4. **Cache mismatch retry.** If a cached file fails verification —
   common when the upstream serial moves between runs but the URL
   path-derived filename collides — log a warning, delete the cache,
   redownload once. Previous behavior bailed on first mismatch.

Verified manually: a stale dataset named `tritonadm-nocloud-DEADBEEF-stale-test`
was correctly detected and swept on the next run.
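
The cancel-flag pattern from item 3 can be sketched as follows. This is a minimal illustration using only the standard library, not the code in this PR: the function name `copy_cancellable` and the buffer size are assumptions, and the task that sets the flag on Ctrl-C is left out.

```rust
use std::io::{self, Read, Write};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical helper: copies `src` to `dst`, checking a shared cancel
/// flag before each chunk so the caller can run its cleanup path.
fn copy_cancellable<R: Read, W: Write>(
    src: &mut R,
    dst: &mut W,
    cancel: &AtomicBool,
) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total = 0u64;
    loop {
        // The signal-watching task (not shown) would set this flag.
        if cancel.load(Ordering::Relaxed) {
            return Err(io::Error::new(
                io::ErrorKind::Interrupted,
                "cancelled by SIGINT",
            ));
        }
        let n = src.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
}
```

Returning an error instead of exiting in place is what lets the normal cleanup (destroying the in-flight dataset) run before the process ends.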

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`--dry-run` resolves vendor metadata and prints the full build plan
without downloading, hashing, or writing anything. For vendors whose
metadata feed includes the upstream sha256 (e.g. Ubuntu's Simple
Streams), the plan also shows the future manifest UUID — derivable
from the sha256 alone, since UUIDs are now v5(NS, sha256). For
vendors that fetch the hash at verification time, the plan notes
that the UUID becomes available after download.

Adds `expected_sha256: Option<String>` to ResolvedImage so the
streams path can surface the value while the SHA256SUMS-fallback
path leaves it None.

Also files several known limitations of the current implementation
in the design doc — parallel-build collisions, SIGKILL cleanup
gaps, and the deferred vendor/format/target list — so the next
iteration has a clean starting point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related robustness improvements that make concurrent builds safer.

The startup sweep now skips datasets younger than one hour, since
those are most likely owned by an actively-running concurrent build
rather than crash leftovers. Older datasets that turn out to be busy
are still detected (the destroy fails) and logged as
"busy/refused; leaving in place" rather than reported as cleaned up.

Same-(vendor, release) builds are now serialized via a
`std::fs::File::try_lock` (flock LOCK_EX | LOCK_NB) on
`<workdir>/.lock`. Since `File::try_lock` was stabilized in Rust
1.89, no third-party crate or `unsafe` block is needed. The lock
lives on the FD; the kernel releases it on any process exit, so a
SIGKILL'd run never leaves a stuck lock behind. Different
(vendor, release) pairs use different workdirs and run in parallel
without contention.
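
The age-gated startup sweep described at the top of this commit can be sketched as a pure predicate. This is an illustrative sketch, not the PR's code: `is_sweep_candidate` is a hypothetical name, and it assumes the caller has already computed each dataset's age from its ZFS creation time.

```rust
use std::time::Duration;

const PREFIX: &str = "tritonadm-nocloud-";
const MIN_AGE: Duration = Duration::from_secs(60 * 60);

/// Hypothetical predicate: true when `dataset` is a direct child of
/// `parent`, carries the build prefix, and is at least one hour old.
fn is_sweep_candidate(parent: &str, dataset: &str, age: Duration) -> bool {
    let Some(child) = dataset
        .strip_prefix(parent)
        .and_then(|rest| rest.strip_prefix('/'))
    else {
        return false;
    };
    // Younger datasets likely belong to a concurrent build, so skip them.
    !child.contains('/') && child.starts_with(PREFIX) && age >= MIN_AGE
}
```

A dataset that passes this check can still be busy, in which case the destroy fails and the dataset is left in place, as described above.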

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the free-form `vendor: String` CLI argument with a `Vendor`
enum that derives `clap::ValueEnum`. The --help output now lists
known vendors automatically (`[possible values: ubuntu]`), bad
values are rejected before any I/O, and shell completion picks
them up. Adding a vendor is a single new variant — no parallel
match in lookup() needed because lookup is now infallible.

The variant→string mapping is derived from `serde::Serialize`
with `rename_all = "kebab-case"`, and `Display` delegates to the
existing `enum_to_display` helper, matching the pattern used
elsewhere in tritonadm. No string-matching boilerplate per variant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Debian publishes generic cloud images at
`https://cloud.debian.org/images/cloud/<codename>/latest/`, with a
sibling `SHA512SUMS` file (SHA-512, not SHA-256). The new vendor
profile picks the `genericcloud` qcow2 — its cloud-init auto-detects
SmartOS's NoCloud datasource on bhyve.

Verifier work to support this:

- Generalize the SHA-256 sums-file parser into `parse_sums_file`
  (hash-agnostic; whatever hex appears in the first column wins).
- Add `Sha512SumsTls` alongside `Sha256SumsTls`. They share
  `fetch_and_parse_sums` for the URL fetch + filename match.
- Generalize `sha256_file` over a `Digest` type parameter and add a
  `sha512_file` companion.
- The `Verifier` trait now takes both `&Path` and the precomputed
  sha256 hex. Verifiers in the SHA-256 family ignore the path;
  `Sha512SumsTls` ignores the precomputed sha256 and hashes the
  file with SHA-512. The pipeline still pre-computes sha256 for
  manifest-UUID derivation regardless.
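
A hash-agnostic lookup in a Linux-style sums listing might look like the sketch below. It is not the `parse_sums_file` from this PR; `find_hash_in_sums` is a hypothetical stand-in, and it assumes filenames contain no whitespace.

```rust
/// Hypothetical lookup of `filename` in a `<hex>  <filename>` listing.
/// Accepts any hex digest length, so SHA-256 and SHA-512 files both work.
fn find_hash_in_sums(sums: &str, filename: &str) -> Option<String> {
    for line in sums.lines() {
        let mut parts = line.split_whitespace();
        let (Some(hex), Some(name)) = (parts.next(), parts.next()) else {
            continue;
        };
        // sha256sum and friends mark binary-mode entries with a leading '*'.
        let name = name.strip_prefix('*').unwrap_or(name);
        if name == filename && hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Some(hex.to_ascii_lowercase());
        }
    }
    None
}
```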

Releases supported in the table: trixie (current stable, default
for `--release latest`), bookworm (oldstable), bullseye (older
oldstable / LTS). Resolution accepts codename, major version
("13"), or `latest`.

Verified end-to-end on this builder zone: a fresh
`--vendor debian --release trixie` build runs in ~2m18s, derives
a stable v5 UUID from the file's sha256, and produces a valid
IMGAPI manifest pair.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the hardcoded `RELEASES`/`LATEST_STABLE` table with a fetch
of `https://deb.debian.org/debian/dists/<suite>/Release` — the same
plain-RFC822 file apt itself uses to know what `stable` means today.

The user can now pass any of:

- `latest` — alias for `stable`
- symbolic suites — `stable`, `oldstable`, `oldoldstable`, `testing`,
  `unstable`
- codenames — `trixie`, `bookworm`, `bullseye`, `forky`, `sid`, ...

Each resolves at upstream by fetching `dists/<suite>/Release` and
parsing the `Codename` and `Version` header fields. The major number
for the cloud-images filename is parsed from the version string
(e.g. `"13.4"` → `13`). Bogus tokens get a clear 404 from the
Release-file fetch.
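
Reading the two header fields could be sketched as below. This is an assumption-laden illustration rather than the resolver in this PR: `parse_release` is a hypothetical name, and it only looks at `Codename` and `Version`.

```rust
/// Hypothetical parser: returns (codename, version, major) from the
/// header of a `dists/<suite>/Release` file, or None if a field is missing.
fn parse_release(text: &str) -> Option<(String, String, u32)> {
    let mut codename = None;
    let mut version = None;
    for line in text.lines() {
        // Continuation lines start with a space and never match a key below.
        if let Some((key, value)) = line.split_once(':') {
            match key {
                "Codename" => codename = Some(value.trim().to_string()),
                "Version" => version = Some(value.trim().to_string()),
                _ => {}
            }
        }
    }
    let version = version?;
    // "13.4" -> 13, used for the cloud-images filename.
    let major = version.split('.').next()?.parse().ok()?;
    Some((codename?, version, major))
}
```

A Release file without a `Version` field yields `None` here, so such suites cannot be resolved by this sketch.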

The manifest `version` field is now the Debian point-release
(e.g. `"13.4"`) instead of today's date, so two builds against the
same point release produce identical manifest versions. Output
files are named `debian-<codename>-<version>.x86_64.zfs.gz`
accordingly.

No offline-fallback path: we're downloading several hundred MB of
image data over the same network the Release file lives on, so the
extra ~3 KB fetch isn't a meaningful new failure mode.

Verified live for `latest`, `stable`, `oldstable`, `trixie`, and
`bookworm`; bogus tokens fail with the expected error chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The design doc now captures three concrete reasons our
Release-file-based resolver doesn't handle development suites today
(no Version field, different URL prefix, different filename pattern),
so we can come back to them without re-deriving the failure mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves via Alpine's release feed at
`https://alpinelinux.org/releases.json` — same file Alpine's web
site renders the "current stable" badge from. Accepts:

- `latest` — newest release in the `latest_stable` branch
- branch (`3.23` or `v3.23`) — newest release in that branch
- full version (`3.23.4`) — exact match

The image lives at
`https://dl-cdn.alpinelinux.org/alpine/v<branch>/releases/cloud/nocloud_alpine-<version>-x86_64-uefi-cloudinit-r0.qcow2`.
The verifier is a new `Sha512SidecarTls` — Alpine ships a per-image
`<file>.sha512` containing only the bare hex hash on a single line,
which is a different shape from a `<HASH>SUMS`-style listing.
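
The URL pattern above can be expressed as a small helper. This is a sketch only: the function names are hypothetical, and it assumes a full `x.y.z` version has already been resolved from the release feed.

```rust
/// Hypothetical helper: builds the nocloud image URL for a full Alpine
/// version such as "3.23.4". The branch is the first two components.
fn alpine_image_url(version: &str) -> Option<String> {
    let parts: Vec<&str> = version.split('.').collect();
    let numeric = |p: &&str| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit());
    if parts.len() != 3 || !parts.iter().all(numeric) {
        return None;
    }
    let branch = format!("{}.{}", parts[0], parts[1]);
    Some(format!(
        "https://dl-cdn.alpinelinux.org/alpine/v{branch}/releases/cloud/nocloud_alpine-{version}-x86_64-uefi-cloudinit-r0.qcow2"
    ))
}

/// The per-image sidecar sits next to the image with a `.sha512` suffix.
fn alpine_sha512_url(image_url: &str) -> String {
    format!("{image_url}.sha512")
}
```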

Verified end-to-end on this builder zone: a fresh
`--vendor alpine --release latest` run finishes in ~42s (Alpine's
qcow2 is small) and produces a stable v5 UUID derived from the
file's sha256.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`make format` (= `cargo fmt`) across the nocloud module tree.
No behavior changes; tests and clippy still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the `Xz` source format end-to-end. Streams the decompressed
bytes straight into the zvol via `lzma-rs` (pure Rust, no liblzma
C dep), so there's no intermediate `.raw` file in the cache and
disk pressure stays bounded by zvol space alone.

To size the zvol correctly without decompressing first, the
pipeline parses the xz Stream Footer + Index up-front:

1. `seek(End - 12)`, `read_exact(12)` for the Footer.
2. Decode the `Backward Size` field to find the Index location.
3. `seek(End - (12 + index_size))`, read the Index, sum the
   per-record `Uncompressed Size` VLIs.

Two seeks and a few hundred bytes total — no full-file scan.
Single-stream xz is supported (the case for every cloud image
we've seen); multi-stream concatenated xz would need to walk
backward across stream boundaries.
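
The three steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: `xz_uncompressed_size` and its helpers are hypothetical names, only single-stream files without stream padding are handled, and the footer and index CRC32 fields are not checked. Per the xz file format, the footer is CRC32 (4 bytes), Backward Size (4 bytes, little-endian), Stream Flags (2 bytes), and the magic `YZ`; the real index size is `(stored + 1) * 4`.

```rust
use std::io::{self, Read, Seek, SeekFrom};

fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Decodes one xz variable-length integer (7 bits per byte, up to 9 bytes).
fn read_vli(buf: &[u8], pos: &mut usize) -> io::Result<u64> {
    let mut value = 0u64;
    for i in 0..9u32 {
        let byte = *buf.get(*pos).ok_or_else(|| invalid("truncated VLI"))?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Ok(value);
        }
    }
    Err(invalid("VLI longer than 9 bytes"))
}

/// Hypothetical helper: sums the per-record Uncompressed Size fields of
/// the Index. CRC32 fields are NOT verified in this sketch.
fn xz_uncompressed_size<R: Read + Seek>(r: &mut R) -> io::Result<u64> {
    // Step 1: read the 12-byte Stream Footer.
    let mut footer = [0u8; 12];
    r.seek(SeekFrom::End(-12))?;
    r.read_exact(&mut footer)?;
    if &footer[10..12] != b"YZ" {
        return Err(invalid("bad footer magic"));
    }
    // Step 2: decode Backward Size to get the Index size in bytes.
    let stored = u32::from_le_bytes([footer[4], footer[5], footer[6], footer[7]]);
    let index_size = (u64::from(stored) + 1) * 4;
    // Step 3: read the Index, which sits directly before the footer.
    r.seek(SeekFrom::End(-(12 + index_size as i64)))?;
    let mut index = vec![0u8; index_size as usize];
    r.read_exact(&mut index)?;
    if index[0] != 0x00 {
        return Err(invalid("bad index indicator"));
    }
    let mut pos = 1;
    let records = read_vli(&index, &mut pos)?;
    let mut total = 0u64;
    for _ in 0..records {
        let _unpadded_size = read_vli(&index, &mut pos)?;
        let uncompressed = read_vli(&index, &mut pos)?;
        total = total
            .checked_add(uncompressed)
            .ok_or_else(|| invalid("size overflow"))?;
    }
    Ok(total)
}
```

A production version would also want to verify both CRC32 values and cap the index allocation before trusting the result to size a zvol.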

A `ProgressWriter` wrapper around the zvol file reports
per-byte progress to indicatif and short-circuits on the SIGINT
cancel flag, so the streaming write integrates with the existing
signal handler.

Adds the FreeBSD vendor profile that uses this:

- Resolves `latest` by GETting
  `https://download.freebsd.org/releases/VM-IMAGES/` and picking
  the numerically highest `X.Y-RELEASE/` entry. Explicit version
  tokens (`15.0`, `15.0-RELEASE`) are also accepted.
- URL: `releases/VM-IMAGES/<ver>-RELEASE/amd64/Latest/FreeBSD-<ver>-RELEASE-amd64-BASIC-CLOUDINIT-zfs.raw.xz`.
- Verifier: new `Sha256BsdSumsTls`, which parses the BSD-traditional
  `SHA256 (filename) = hex` format used in FreeBSD's
  `CHECKSUM.SHA256` files (different shape from Linux SUMS files).
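
For contrast with the Linux-style listing, a lookup in the BSD-traditional format might be sketched like this. `find_hash_in_bsd_sums` is a hypothetical name, not the parser in this PR, and it only accepts 64-character SHA-256 digests.

```rust
/// Hypothetical lookup of `filename` in `SHA256 (filename) = hex` lines.
fn find_hash_in_bsd_sums(sums: &str, filename: &str) -> Option<String> {
    for line in sums.lines() {
        let Some(rest) = line.trim().strip_prefix("SHA256 (") else {
            continue;
        };
        // Split on the last ") = " so parentheses inside names survive.
        let Some((name, hex)) = rest.rsplit_once(") = ") else {
            continue;
        };
        if name == filename
            && hex.len() == 64
            && hex.bytes().all(|b| b.is_ascii_hexdigit())
        {
            return Some(hex.to_ascii_lowercase());
        }
    }
    None
}
```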

Verified end-to-end: `--vendor freebsd --release latest`
streamed a 6178 MiB zvol in ~10 min on this builder zone (lzma-rs
is pure-Rust slow vs. liblzma; acceptable for the POC).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Talos's nocloud images come from the dynamic Image Factory at
`https://factory.talos.dev/image/<schematic>/v<version>/nocloud-amd64.raw.xz`,
not from the upstream GitHub release. The factory does not publish
per-image sha256 or sha512 sidecars (the obvious URL paths return
HTTP 402), and the `sha256sum.txt` in the GitHub release covers only
metal/ISO assets, not factory-built images.

So the Talos vendor uses a new `TlsTrustOnly` verifier that explicitly
notes the trust model rather than silently skipping. For operators
who want a real hash check, the new general-purpose
`--expected-sha256 <hex>` CLI flag overrides whatever verifier the
vendor chose with `Sha256Pinned`. This works for any vendor — handy
for one-off pinning, audit-trail builds, or test fixtures.

The default (empty) Talos schematic is baked in:
`376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba`.
A future `--schematic <id>` flag could let users build customized
Talos images, but the POC ships the vanilla case.

Release resolution: `latest` consults
`https://api.github.com/repos/siderolabs/talos/releases/latest` and
strips the `v` prefix from `tag_name`. Explicit `1.12.7` /
`v1.12.7` are also accepted. Talos does not support cloud-init ssh-key
injection (it has no SSH; access is through the Talos API via
`talosctl`), so `ssh_key` is `false` in the manifest requirements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Travis pointed out that Talos's Image Factory API documents
.sha256 / .sha512 checksum endpoints — but Sidero gates them
behind enterprise licensing on the public `factory.talos.dev`
(free-tier requests return `HTTP 402 enterprise not enabled`).
Self-hosted and enterprise factories return the checksum
normally.

So the Talos vendor now probes `<image>.sha256` at resolve time:

- 200 with body → use `Sha256SumsTls` (the response is the same
  Linux-style `<hex>  <filename>` format we already parse).
- 402 / any other status / network error → fall back to
  `TlsTrustOnly` with a note that explains the enterprise gating
  and recommends `--expected-sha256` for hash pinning.
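
The probe-then-fall-back decision can be sketched as a pure function over the probe result. The enum and function names below are hypothetical (only the verifier names `Sha256SumsTls` and `TlsTrustOnly` come from this PR), and the HTTP request itself is assumed to happen elsewhere.

```rust
#[derive(Debug, PartialEq)]
enum VerifierChoice {
    Sha256SumsTls { sums_body: String },
    TlsTrustOnly { note: String },
}

/// Hypothetical decision step: `probe` is Ok((status, body)) for a
/// completed request to `<image>.sha256`, or Err(reason) on network error.
fn choose_verifier(probe: Result<(u16, String), String>) -> VerifierChoice {
    let reason = match probe {
        Ok((200, body)) if !body.trim().is_empty() => {
            return VerifierChoice::Sha256SumsTls { sums_body: body };
        }
        Ok((402, _)) => "checksum endpoint is enterprise-gated (HTTP 402)".to_string(),
        Ok((status, _)) => format!("checksum probe returned HTTP {status}"),
        Err(e) => format!("checksum probe failed: {e}"),
    };
    VerifierChoice::TlsTrustOnly {
        note: format!("{reason}; pass --expected-sha256 to pin a hash"),
    }
}
```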

Public-factory users see exactly the same behavior as before;
enterprise users get free hash verification with no extra flags.

The factory's normal image-download response also doesn't leak a
checksum via headers (no `Digest`, `ETag`, or `X-Content-SHA256`),
so the .sha256 endpoint is the only mechanism we have.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a defense-in-depth check after the download loop: if the server
sent a Content-Length and the bytes we wrote don't match it, bail
with a clear "download truncated" error rather than feed a short
file into the verifier (which would catch it as a checksum
mismatch — correct outcome, less actionable message).

reqwest/hyper should already error on body truncation when
Content-Length is set, but the explicit comparison costs nothing
and catches HTTP/2 stream resets, mid-stream proxy interruption,
and servers that lie about the size. No-op when Content-Length
isn't sent (chunked transfer or streaming response).
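
The comparison itself is small. A sketch, with a hypothetical function name and error type, assuming the caller tracks the number of bytes written:

```rust
/// Hypothetical post-download check: a no-op when the server sent no
/// Content-Length, an error when the byte counts disagree.
fn check_download_length(content_length: Option<u64>, written: u64) -> Result<(), String> {
    match content_length {
        Some(expected) if expected != written => Err(format!(
            "download truncated: expected {expected} bytes, got {written}"
        )),
        _ => Ok(()),
    }
}
```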

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`make format` (= `cargo fmt`) across the nocloud module tree.
Whitespace and line-breaking tweaks; no behavior changes. 46 unit
tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two arch-lint no-error-swallowing warnings and three doc-lint
implementation-detail leaks were standing between the tree and a
clean `make quick-check`.

- `set_cookie_header` in triton-api-server now returns
  `Result<(), HttpError>` and propagates instead of logging-and-
  swallowing; all four call sites updated.
- `triton-cli logout` collapses its `if let Err` log-only block
  into a `match` whose `Err` arm both logs and recovers into
  `revoked = false`, which drives the user-facing message.
- The schemars/Progenitor-rationale paragraphs on `Datacenters`,
  `Services`, `MetadataObject`, and `Tags` move from `///` doc
  comments to `//` regular comments so they no longer leak into
  the generated OpenAPI specs as user-facing schema descriptions.

OpenAPI specs regenerated to reflect the trimmed descriptions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream `qcow` 1.2.0 hardcodes flate2's `zlib-ng-compat` feature on
its dependency line and exposes no `[features]` knob to opt out. That
forces a CMake build of zlib-ng (libz-sys), which breaks any host
without `cmake` installed. Cargo's `[patch]` swaps a crate's source
but cannot rewrite a transitive dependency's feature flags, so a fork
is the cleanest path.

This vendors panda-re/qcow-rs 1.2.0 (MIT) into `libs/qcow` and rewires
the workspace dependency to the path. Three deviations from upstream:

- `Cargo.toml`: drop the `zlib-ng-compat` feature so flate2 falls back
  to its pure-Rust `miniz_oxide` backend. No more libz-sys / cmake.
- `Cargo.toml`: `[lints.rust] unused_parens = "allow"` and
  `[lints.clippy] all = "allow"` so the parent workspace's strict lint
  regime does not policy-check vendored upstream code. The bitfield
  macro expansion trips `unused_parens` on doc-commented fields.
- `tests/parse.rs`: `#[ignore]` the upstream integration test, which
  hardcodes a fixture path under `/home/jamcleod/.panda/...` that does
  not exist outside the original author's machine.

`arch-lint.toml` excludes `libs/qcow` (vendored, not ours).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the delivery-mode flag promised in the design doc's command
surface table:

  - `file` (default): leave artifacts in --output-dir and print the
    suggested `imgadm install` invocation.
  - `smartos`: shell `imgadm install -m <manifest> -f <gz>` against
    the local SmartOS image store. GZ-only; rejected in NGZs with a
    pointer to --target file + manual install.
  - `imgapi`: push to IMGAPI via the existing `tritonadm image
    import` code path. The Import body is factored into a shared
    `import_manifest_and_file` helper that both subcommands call,
    so origin-chain handling, manifest preservation, compression
    detection, and activation are kept in one place.

Also refreshes the design doc to match the current implementation:
xz is no longer "deferred" (lzma-rs streaming + Stream-Footer
size read), the duplicate ubuntu/freebsd/talos rows in the vendor
table are gone, the POC status is replaced with a snapshot of the
five vendors / three formats / 46 tests we ship today, and the
two follow-up lists are consolidated into one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nshalman marked this pull request as ready for review May 6, 2026 01:21
nshalman and others added 5 commits May 5, 2026 22:11
The code-layout block still listed `mod.rs`, `privileged.rs`, and only
ubuntu.rs; the actual layout has `nocloud.rs` + a per-vendor tree
(alpine/debian/freebsd/talos/ubuntu) and a `zfs.rs` shellout module.
The "Two implementations: PfexecPrivileged / FakePrivileged" sentence
was leftover from before the Privileged trait was removed. The
"Trust chain" section was scoped to Ubuntu only; rewrite it as a
per-vendor table so readers can see at a glance which sums file
each vendor uses.

Drop the stale `Xz/Raw not yet emitted` comment (and the now-false
`#[allow(dead_code)]` on Xz) in `SourceFormat` — Xz is emitted by
freebsd and talos. Raw still has no emitter, so its allow stays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vendor resolution is just HTTP, so it runs anywhere — moving it
above the SmartOS-specific preflights lets `--dry-run` exercise
release resolution and verifier wiring on a dev box.

Going one step further: when `uname -v` doesn't start with
`joyent_`, force `--dry-run` and print a stderr notice. The build
itself still requires zfs(8) + a delegated dataset, but the
common dev-box use case (`tritonadm image fetch-nocloud --vendor
ubuntu --release latest`) now does something useful instead of
erroring out.

The dry-run dataset display falls back to a placeholder string
when `default_dataset()` can't run (no `zonename` on macOS).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves the Fedora Cloud_Base x86_64 qcow2 from the canonical
`releases.json` feed at https://fedoraproject.org/releases.json,
which carries the upstream sha256 inline — same shape as Ubuntu
Simple Streams, so the verifier is a plain `Sha256Pinned` and
`--dry-run` can show the manifest UUID without downloading.

Accepts `latest` (highest numeric major), bare integers (`42`),
and the conventional `f42` / `Fedora-42` prefixes. The manifest
version uses the build-bearing portion of the filename (e.g.
`44-1.7`) so distinct rebuilds of the same major dedupe sensibly.
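
Normalizing the accepted release tokens might look like the sketch below. The type and function names are hypothetical, and matching is case-sensitive here, which is an assumption rather than something this commit states.

```rust
#[derive(Debug, PartialEq)]
enum FedoraRelease {
    Latest,
    Major(u32),
}

/// Hypothetical token parser for `latest`, `42`, `f42`, and `Fedora-42`.
fn parse_fedora_release(token: &str) -> Option<FedoraRelease> {
    if token == "latest" {
        return Some(FedoraRelease::Latest);
    }
    let digits = token
        .strip_prefix("Fedora-")
        .or_else(|| token.strip_prefix('f'))
        .unwrap_or(token);
    // Reject signs and other non-digit characters before parsing.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok().map(FedoraRelease::Major)
}
```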

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AlmaLinux publishes GenericCloud qcow2 images per major at
`https://repo.almalinux.org/almalinux/<n>/cloud/x86_64/images/`,
with a `-latest.x86_64.qcow2` rolling pointer in each major's
images directory. The sibling `CHECKSUM` is Linux-style
(`<sha256>  <filename>`) and lists both the latest pointer and its
dated alias under the same hash, so we resolve the dated build
once at metadata time and verify with a plain `Sha256Pinned`.

`--release latest` walks the auto-generated index at `/almalinux/`
to pick the highest major; `--release 8` / `9` / `10` pins to a
specific major. The manifest version uses the build identifier
(e.g. `9.7-20260501`) so distinct rebuilds dedupe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rocky publishes GenericCloud-Base qcow2 images per major at
`https://download.rockylinux.org/pub/rocky/<n>/images/x86_64/`,
with a per-file BSD-style `<filename>.CHECKSUM` sidecar — same
shape as FreeBSD's CHECKSUM.SHA256, so we reuse the existing
`Sha256BsdSumsTls` verifier. We pick the highest-versioned
dated `-Base-` build by parsing the directory listing,
ignoring rolling pointers and the LVM flavor.

`--release latest` walks `/pub/rocky/` to pick the highest
major (currently 10); `--release 8`/`9`/`10` pins to a major.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nshalman and others added 8 commits May 5, 2026 22:41
Promote `verify::parse_sums_file` and `verify::parse_bsd_sums_file`
to `pub(super)` so vendor profiles can call them directly. Drop
alma's inline copy of the Linux-style parser, which existed only
because the module-private originals weren't reachable from
sibling modules. The next commit needs the BSD parser from rocky's
release-resolution path; sharing it now keeps the two parsers
co-located with their tests in `verify.rs`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rocky's per-file `.CHECKSUM` sidecar is small and TLS-fetched, so
pulling it during release resolution costs one extra round-trip
in exchange for showing the upstream sha256 (and the derived
stable manifest UUID) in `--dry-run` output. The verifier swaps
from `Sha256BsdSumsTls` to `Sha256Pinned` since the hash is now
known at metadata time — same shape as Fedora and AlmaLinux.
The BSD-style parser comes from the now-shared
`verify::parse_bsd_sums_file`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Arch publishes per-build cloud-image directories at
`https://geo.mirror.pkgbuild.com/images/`, named
`v<YYYYMMDD>.<arch-build-id>/`, plus a `latest/` symlink. Each dir
has an `Arch-Linux-x86_64-cloudimg-<date>.<build>.qcow2` and a
Linux-style `<file>.SHA256` sidecar — same shape as the other
sidecar-bearing vendors, so we list the directory, pick the
highest version (lex sort works for the date-prefixed names),
fetch the sidecar at resolve time, and pin the hash with
`Sha256Pinned`.

`--release latest` picks the newest build; explicit
`v20260501.523211` / `20260501.523211` tokens pin to a specific
build. The series is the literal `rolling` so manifest names
read `arch-rolling-nocloud` rather than `arch-arch-nocloud`.

Detached GPG signatures (`<file>.sig`, `<file>.SHA256.sig`) are
published alongside but not yet consumed; the existing GPG-
verifier follow-up will pick those up across all sidecar vendors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Oracle's cloud-init-enabled KVM templates live at
`https://yum.oracle.com/templates/OracleLinux/OL<n>/u<u>/x86_64/`
but there is no machine-readable index; the only listing is the
human-targeted landing page at `oracle-linux-templates.html`.
Per-image checksums are
embedded in the table HTML, paired with image links inside the
same `<tr>`. We split the HTML on `</tr>`, regex out the kvm-image
href and its sibling `kvm-sha256` `<tt>`, and use the result.

For x86_64 there is exactly one `kvm-b<build>.qcow2` per release
and Oracle's convention is that this image is cloud-init enabled
(the aarch64 builds split into separate `kvm` and `kvm-cloud`
variants, but x86_64 doesn't). Manifest version is
`<major>.<update>-b<build>` (e.g. `9.7-b269`).

`--release latest` walks the page rows to pick the highest major;
`--release 8` / `9` / `10` (with optional `OL` prefix) pins to a
major. Trust is rooted in TLS to `yum.oracle.com`; the `Sha256Pinned`
verifier means `--dry-run` shows the manifest UUID. HTML parsing is
fragile by nature — the parser bails clearly if the page layout
changes, leaving `--expected-sha256` as the manual escape hatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CentOS Stream publishes GenericCloud qcow2 images per stream at
`https://cloud.centos.org/centos/<n>-stream/x86_64/images/`,
with per-file BSD-style `<filename>.SHA256SUM` sidecars. The
release-resolution path lists `/centos/` for `<n>-stream/` dirs,
picks the highest dated build, and pre-fetches the sidecar so the
upstream sha256 is available at metadata time and `--dry-run`
shows the manifest UUID.

`cloud.centos.org` is fronted by CloudFront, which 403s requests
with no User-Agent, so every GET sets the same
`tritonadm-fetch-nocloud` identifier the Talos profile already uses.

`--release latest` picks the highest active stream (currently 10);
`--release 9` / `9-stream` / `8` pin to a specific stream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Leap publishes per-version Minimal-VM Cloud qcow2 images at
`https://download.opensuse.org/distribution/leap/<X>.<Y>/appliances/`
with sibling Linux-style `.sha256` sidecars. `download.opensuse.org`
runs MirrorCache, which exposes `?json=1` directory listings — we
use that for both the version index and the per-version appliance
list rather than scraping HTML.

File naming changed between Leap 15.x and 16.x (the latter
dropped the `openSUSE-` prefix and the inner `<X.Y.Z>` block);
the resolver handles both. `--release latest` walks versions
descending and falls through empty appliance dirs (Leap 16.1 is
currently empty), so it correctly returns 16.0. The manifest
version stitches `<X.Y>` and the Build tag (e.g. `16.0-Build16.2`).

Tumbleweed is intentionally skipped — `/tumbleweed/appliances/`
ships only MicroOS-flavored immutable images today, which use
Combustion/Ignition rather than cloud-init nocloud.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Darwin caps `sun_path` at 104 bytes, and `t.TempDir()` on macOS
sits under `/var/folders/<hash>/T/<TestName><N>/001/`. Those paths
routinely exceed the limit, so `net.Listen("unix", …)` returns
EINVAL and the test fails with `bind: invalid argument`.

Skip with `t.Skip()` when `runtime.GOOS == "darwin"` rather than
fight the path-length limit; the same code is exercised by Linux
CI which is the canonical environment for this package.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The qcow2 reader returns zero-filled buffers for unallocated
clusters, so cloud images with sparse virtual disks (Ubuntu's
~3.6 GB cloudimg has ~600 MB of real data; Fedora's is similar)
were paying for gigabytes of redundant zero writes.

In `copy_with_progress` check whether each 1 MiB chunk is all
zeros — if it is, seek the zvol's char device forward instead of
writing. Fresh ZFS zvols are sparse, so unwritten regions stay
unallocated (no on-disk block) and the resulting
`zfs send` stream skips them entirely. The all-zero check is a
single linear scan per MiB; rustc autovectorises it.

Applies to qcow2 and raw paths, both of which use
`copy_with_progress`. The xz path goes through
`lzma_rs::xz_decompress` driving a `Write` impl, so this optimization
doesn't naturally fit there; it is left as a follow-up.
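
The zero-skipping copy for the qcow2 and raw paths can be sketched as follows. This is a simplified stand-in for `copy_with_progress`, not its actual body: the name `copy_skipping_zeros` is hypothetical, progress reporting and cancellation are omitted, and the chunk size is a parameter rather than the fixed 1 MiB.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Hypothetical sparse copy: all-zero chunks advance the destination
/// offset instead of being written. Returns (bytes written, bytes skipped).
fn copy_skipping_zeros<R: Read, W: Write + Seek>(
    src: &mut R,
    dst: &mut W,
    chunk: usize,
) -> io::Result<(u64, u64)> {
    let mut buf = vec![0u8; chunk];
    let (mut written, mut skipped) = (0u64, 0u64);
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            return Ok((written, skipped));
        }
        if buf[..n].iter().all(|&b| b == 0) {
            // Leave a hole: move forward without touching the device.
            dst.seek(SeekFrom::Current(n as i64))?;
            skipped += n as u64;
        } else {
            dst.write_all(&buf[..n])?;
            written += n as u64;
        }
    }
}
```

This relies on the destination already reading back as zeros where nothing was written, which holds for a freshly created zvol but not for a reused one.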

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>