[Feature Request] Portable snapshot provider — commit sandbox rootfs to an OCI image in a registry (cross-runtime / cross-cluster restore) #949

@ferponse

Summary

Building on #694 (Suspend API + pluggable SnapshotProvider), we'd like a SnapshotProvider whose artifact is a portable OCI image pushed to a configurable registry, so a sandbox's state can be restored as a new Sandbox — including on a different runtime or cluster — not only resumed in place.

Use case

We're building an internal developer-sandbox platform (~800 engineers). Each developer gets a cloud sandbox (k8s), and we also want a local runtime (Docker on the laptop) with one-click "move" between local and cloud in both directions:

  • local → cloud: already works — we docker commit locally, push to a shared registry, and create a Sandbox from that image. agent-sandbox just schedules a Pod from it.
  • cloud → local: blocked. To run the cloud sandbox locally we need a portable OCI image of its current rootfs in the registry, but there is no in-cluster mechanism to commit the running container and push it.
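
For reference, the local → cloud direction that already works can be sketched as two Docker CLI calls. This is a minimal sketch, not our actual tooling: the helper name, container name, and registry path are hypothetical, and it only builds the argument lists rather than running them.

```python
def commit_and_push_commands(container: str, image_ref: str) -> list[list[str]]:
    """Build the argv lists for committing a local container and pushing the image."""
    return [
        # Snapshot the container's current filesystem into a local image.
        ["docker", "commit", container, image_ref],
        # Publish that image to the shared registry.
        ["docker", "push", image_ref],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    cmds = commit_and_push_commands(
        "dev-sandbox", "registry.example.com/sandboxes/alice:snap-1"
    )
    for argv in cmds:
        print(" ".join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)`. The cloud → local direction needs the same two steps, but executed inside the cluster against the running sandbox container, which is the missing piece.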

Why existing options don't cover it

Proposed feature

A SnapshotProvider (plugging into #694) that:

  1. runs an in-cluster commit of the sandbox container's rootfs (e.g. a Job on the source node using the containerd socket + a committer such as nerdctl),
  2. pushes the resulting image to a configurable OCI registry,
  3. exposes the resulting image reference in the snapshot status,

so restore = create a Sandbox from that image (already supported), and the image can also be pulled by any other OCI runtime (Docker, another cluster).
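
To make steps 1 and 2 concrete, here is a rough sketch of the commit Job such a provider might create. Everything here is an assumption for discussion: the function name, the socket path (a common default that varies by distro), the `k8s.io` containerd namespace, and the use of `nerdctl commit` / `nerdctl push`. A real Job would likely also need registry credentials and possibly extra host mounts or privileges. Step 3 is left out because it depends on the snapshot status shape defined in #694.

```python
import shlex

# Common default containerd socket path; assumed, varies by node setup.
CONTAINERD_SOCK = "/run/containerd/containerd.sock"


def build_commit_job(name, node_name, container_id, target_ref, committer_image):
    """Return a batch/v1 Job manifest (as a dict) that commits and pushes a rootfs."""
    cid = shlex.quote(container_id)
    ref = shlex.quote(target_ref)
    # Kubernetes-managed containers normally live in containerd's "k8s.io" namespace.
    script = (
        f"nerdctl --namespace k8s.io commit {cid} {ref} && "
        f"nerdctl --namespace k8s.io push {ref}"
    )
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed commit automatically
            "template": {
                "spec": {
                    "nodeName": node_name,  # must run on the node hosting the sandbox
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "committer",
                        "image": committer_image,
                        "command": ["sh", "-c", script],
                        "volumeMounts": [
                            {"name": "containerd-sock", "mountPath": CONTAINERD_SOCK}
                        ],
                    }],
                    "volumes": [{
                        "name": "containerd-sock",
                        "hostPath": {"path": CONTAINERD_SOCK, "type": "Socket"},
                    }],
                }
            },
        },
    }
```

Pinning the Job with `nodeName` is one simple way to land on the source node; a node selector or affinity would work as well.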

Prior art / reference design

The OpenSandbox project implements this for its batchsandbox controller: a commit Job mounts the node containerd socket, commits the rootfs, and pushes to a configured registry. It is a useful reference for the mechanism and its security considerations: the commit Job has node-level runtime access, so the committer image should be pinned by digest and protected by a trusted registry and/or an admission policy.
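
As an illustration of the "pin by digest, trusted registry" guardrails, a provider could reject a committer image before creating the Job. This is a simplified sketch with hypothetical helper names, not how OpenSandbox does it; the regex only accepts `sha256` digests and the registry check just compares the first path component of the reference.

```python
import re

# name[:tag]@sha256:<64 lowercase hex chars>
_DIGEST_RE = re.compile(r"[^@\s]+@sha256:[0-9a-f]{64}")


def is_digest_pinned(image_ref: str) -> bool:
    """True if the reference ends in a full sha256 digest rather than only a tag."""
    return _DIGEST_RE.fullmatch(image_ref) is not None


def is_from_trusted_registry(image_ref: str, trusted: set[str]) -> bool:
    """True if the registry host (first path component) is in the allow-list."""
    host = image_ref.split("/", 1)[0]
    return "/" in image_ref and host in trusted


def committer_image_allowed(image_ref: str, trusted: set[str]) -> bool:
    """Both guardrails must hold before the commit Job is created."""
    return is_digest_pinned(image_ref) and is_from_trusted_registry(image_ref, trusted)
```

An admission policy in the cluster could enforce the same rule independently of the provider, so that a misconfigured provider still cannot launch an unpinned committer.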

Notes / open questions

  • Inherently privileged (node containerd access) and registry-credential-sensitive, so this would likely be an optional provider, off by default.
  • Could the snapshot artifact create a different/new Sandbox (true portability), not only suspend/resume of the originating one?
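
To illustrate the second question, restoring as a new Sandbox could be as simple as pointing a fresh Sandbox at the snapshot image. The sketch below is an assumption about the manifest shape: the `agents.x-k8s.io/v1alpha1` API group and the `spec.podTemplate` field are written from memory and should be checked against the current CRD, and the function and container names are hypothetical.

```python
def sandbox_from_snapshot(name: str, image_ref: str) -> dict:
    """Return a Sandbox manifest (as a dict) whose container runs the snapshot image."""
    return {
        "apiVersion": "agents.x-k8s.io/v1alpha1",  # assumed API group/version
        "kind": "Sandbox",
        "metadata": {"name": name},
        "spec": {
            "podTemplate": {  # assumed field name
                "spec": {
                    "containers": [
                        # The image reference would come from the snapshot status.
                        {"name": "sandbox", "image": image_ref}
                    ]
                }
            }
        },
    }
```

Because the new Sandbox only depends on the image reference, it does not have to share a name, namespace, or cluster with the originating one.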

Happy to discuss design and help with a contribution.

Refs: #694 (umbrella), #773, #538, #585, #36.

Project status: Backlog
