Windows: support native OpenSSH; no Git for Windows / MSYS2 required #4998
Draft
jandubois wants to merge 9 commits into
On Windows, Lima runs cygpath to translate key and socket paths before invoking ssh-keygen and ssh. The translation assumes a Cygwin-based ssh (Git for Windows, MSYS2). With only native Windows OpenSSH installed, cygpath is unavailable and limactl create fails immediately:

    failed to convert path to mingw, maybe not using Git ssh?
    exec: "cygpath": executable file not found in %PATH%

Pick the ssh.exe that has scp.exe and ssh-keygen.exe in the same directory. MinGit ships ssh.exe alone in usr\bin\ without scp or ssh-keygen, so picking it would break limactl copy and limactl create; pickCompleteSSHOnWindows walks $PATH for an install with all three, and falls back to %SystemRoot%\System32\OpenSSH\ssh.exe (default on Windows 10 build 1803 and later) when nothing on PATH is complete.

Detect the toolchain by checking whether cygpath.exe lives alongside the resolved ssh.exe (the layout Git for Windows and MSYS2 use), after filepath.EvalSymlinks so a chocolatey or scoop shim does not throw the sibling probe off. Cache the result per resolved path so detection and the Debug log entry happen once per ssh binary, not once per call site.

IsSSHCygwin reports the toolchain boolean for callers that just need the branch. cygpathForSSH returns the sibling cygpath.exe so callers can drive a conversion through the toolchain's own cygpath even when $SSH points at an ssh outside PATH. SftpServerForSSH returns the matching sftp-server binary (/usr/lib/ssh/sftp-server via cygpath for Cygwin, sftp-server.exe next to ssh.exe for native), so reverse-sshfs can hand sshocker a binary from the same install as ssh, preventing mismatched path forms between the ssh process and the spawned sftp-server. Callers land in the next commits.

For native Windows OpenSSH, the existing pathForSSH passes paths with forward slashes (C:/Users/...), which native ssh-keygen, ssh, and sshd accept. Cygwin-based ssh keeps the existing cygpath-based behaviour, so users with Git for Windows see no change.
This unblocks limactl create on plain Windows. End-to-end use of native Windows OpenSSH still requires a non-ControlMaster path for dynamic port forwarding (hostagent uses ssh -O forward/cancel), since Win32-OpenSSH does not implement SSH multiplexing (PowerShell/Win32-OpenSSH#1328, still open as of Feb 2026). That work is a separate change. Related: lima-vm#4819 Signed-off-by: Jan Dubois <jan.dubois@suse.com>
Two related changes that callers talking to the ssh family of binaries on Windows use together.

ParseOpenSSHVersion: match the version banner native Windows OpenSSH emits, both "OpenSSH_for_Windows_X.YpZ" (current releases) and "OpenSSH_for_Windows X.YpZ" (older releases that put a space between "Windows" and the version). The previous regex required a digit immediately after "OpenSSH_", so it misdetected Win32-OpenSSH as version 0.0.0 and Lima then treated it as pre-8.0 legacy ssh in code paths that branch on the version (e.g. scp URL form). When the regex still fails to match, log the unparsed banner at Debug so the silent 0.0.0 downgrade is traceable instead of mysterious.

PathForSSH: rename from the previously unexported pathForSSH and export it. copytool.parseCopyPaths needs the same path-translation logic, and duplicating the cygpath-vs-native decision in two packages would invite drift.

Add tests for both Win32-OpenSSH banner variants.

Signed-off-by: Jan Dubois <jan.dubois@suse.com>
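A banner pattern that accepts both Win32-OpenSSH forms might look like the sketch below. It is an assumption-laden illustration and not the regex from the PR: `bannerRE`, `parseBanner`, and the `version` struct are hypothetical names, and the sample banners in `main` are made up to match the shapes named in the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// version is a hypothetical container for the parsed numbers.
type version struct{ Major, Minor, Patch int }

// bannerRE accepts "OpenSSH_X.YpZ", "OpenSSH_for_Windows_X.YpZ", and the
// older "OpenSSH_for_Windows X.YpZ" (space before the version).
var bannerRE = regexp.MustCompile(`OpenSSH_(?:for_Windows[_ ])?(\d+)\.(\d+)(?:p(\d+))?`)

// parseBanner extracts the version; ok is false when the banner does not
// match, so the caller can log the raw banner instead of assuming 0.0.0.
func parseBanner(banner string) (v version, ok bool) {
	m := bannerRE.FindStringSubmatch(banner)
	if m == nil {
		return version{}, false
	}
	// The groups are guaranteed to be digit runs, so Atoi cannot fail here
	// except on overflow, which this sketch ignores.
	v.Major, _ = strconv.Atoi(m[1])
	v.Minor, _ = strconv.Atoi(m[2])
	if m[3] != "" {
		v.Patch, _ = strconv.Atoi(m[3])
	}
	return v, true
}

func main() {
	for _, banner := range []string{
		"OpenSSH_for_Windows_9.5p1, LibreSSL 3.8.2", // illustrative sample
		"OpenSSH_for_Windows 8.1p1",                 // illustrative sample
		"OpenSSH_9.6p1",
		"unrelated output",
	} {
		v, ok := parseBanner(banner)
		fmt.Printf("%q -> %+v matched=%v\n", banner, v, ok)
	}
}
```

Returning an explicit `ok` flag keeps the "no match" case distinguishable from a genuine version, which is the property the Debug log in the commit relies on.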
Three changes that together let limactl copy work on Windows when only native Windows OpenSSH is installed (no Git for Windows, no MSYS2).

parseCopyPaths: route Windows absolute paths (C:\Users\jan\file, C:/Users/jan/file, UNC) through sshutil.PathForSSH so native ssh sees the forward-slash form (C:/Users/jan/file) instead of failing on a missing cygpath. Use filepath.IsAbs (not VolumeName) to classify paths before splitting on ":", because VolumeName accepts the drive-relative form "C:foo", which must remain interpretable as instance "C" path "foo" so single-letter instance names keep working. Resolve sshExe lazily and reuse the result across the call so repeated absolute paths do not re-run NewSSHExe. A table test pins the classification of C:\foo, C:/foo, C:foo, and explicit instance:path.

scp.go, rsync.go: strip ControlMaster, ControlPath, and ControlPersist from the ssh options on Windows. Native Windows OpenSSH does not implement SSH multiplexing, so leaving these options in caused scp to fail with "getsockname failed: Not a socket" before transferring any bytes. Cygwin-based ssh has known reliability issues with sftp over a mux socket, so unconditional stripping on Windows matches how hostagent and limactl shell already handle this. Mirror the same mux-strip in checkRsyncOnGuest so the rsync-availability probe does not reject a working install on native Windows OpenSSH before "command -v rsync" runs on the guest. Log Debug when the mux-strip fires so a copy-failure trace shows the decision.

rsyncTool.IsAvailableOnGuest: warn (not Debug) when parseCopyPaths fails the probe, because the next path falls through to scp's "scp not found on host" error and the user needs the actual root cause (commonly: no ssh.exe on Windows) to diagnose.

Signed-off-by: Jan Dubois <jan.dubois@suse.com>
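The path classification described for parseCopyPaths can be illustrated with the sketch below. It is not the PR's code: the commit uses filepath.IsAbs, which only recognises drive letters when compiled for Windows, so this sketch substitutes a hand-written, OS-independent check. `isWindowsAbs`, `classify`, and `copyPath` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// copyPath is a hypothetical result type: Instance is empty for host paths.
type copyPath struct{ Instance, Path string }

func isDriveLetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// isWindowsAbs accepts `C:\foo`, `C:/foo`, and UNC-style prefixes, but not
// the drive-relative form `C:foo`.
func isWindowsAbs(p string) bool {
	if strings.HasPrefix(p, `\\`) || strings.HasPrefix(p, "//") {
		return true
	}
	return len(p) >= 3 && isDriveLetter(p[0]) && p[1] == ':' &&
		(p[2] == '\\' || p[2] == '/')
}

// classify decides whether arg is a host path or an instance:path pair.
// The absolute-path check runs first, so the drive colon is never taken
// as the instance separator.
func classify(arg string) copyPath {
	if isWindowsAbs(arg) {
		return copyPath{Path: arg}
	}
	if inst, rest, ok := strings.Cut(arg, ":"); ok {
		return copyPath{Instance: inst, Path: rest}
	}
	return copyPath{Path: arg}
}

func main() {
	for _, arg := range []string{`C:\foo`, "C:/foo", "C:foo", "default:/etc/hosts"} {
		fmt.Printf("%-20s %+v\n", arg, classify(arg))
	}
}
```

Because `C:foo` fails the absolute check, it falls through to the colon split and stays readable as instance "C" with path "foo", which is the behaviour the table test in the commit is said to pin.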
Use compress/gzip directly instead of shelling out to a gzip binary. On Windows, gzip is not part of the base system, so the existing code required Git for Windows or MSYS2 on PATH just to unpack a .tar.gz image during limactl create / start.

Hook ctx cancellation into the in-process path: a goroutine closes the input file when ctx fires, so io.Copy returns immediately instead of blocking on disk reads until decompression finishes. Translate the resulting os.ErrClosed back to ctx.Err() only when ctx is genuinely canceled, so a real I/O fault (corrupt stream, short read) is not masked.

Build the gzip test fixture in-process too, so TestDownloadCompressed exercises the pure-Go path on every platform instead of skipping on Windows. The bzip2 sub-test still skips on Windows: external bzip2 ships with MSYS2 / Git for Windows but not vanilla Windows, and compress/bzip2 is decode-only, so there is no in-process way to build the fixture.

Other formats (xz, zstd) still go through the exec path, since they are less common in Lima image URLs and would need extra dependencies for in-process decompression. Follow-ups can migrate them similarly if needed.

Signed-off-by: Jan Dubois <jan.dubois@suse.com>
ioutilx.WindowsSubsystemPath: keep cygpath as the preferred backend (it respects any custom fstab the user has configured for MSYS2 / Git for Windows), but add a native fallback for the common absolute drive-letter case (C:\Users\jan -> /c/Users/jan). Without the fallback, plain Windows installs that have neither Git for Windows nor MSYS2 hit a fatal error during fillDefault when computing the default mountPoint for a host mount. After this change, the default mountPoint resolves correctly without external tooling. Reject drive-relative inputs (C:foo) in the fallback rather than silently fabricating /cfoo, since the cygpath-style result would be an unrelated absolute path; surface the cygpath stderr in the "unavailable" log so a misconfigured cygpath surfaces its own error.

Split WindowsSubsystemPath into a PATH-resolving thin wrapper and a new exported WindowsSubsystemPathWithCygpath(ctx, cygpathExe, orig) that takes the cygpath binary explicitly. sshutil.PathForSSH switches to the latter via cygpathForSSH, so when $SSH points at a Git-for-Windows ssh outside PATH the conversion still runs through that toolchain's own cygpath (and its fstab) instead of whatever cygpath happens to be on PATH.

hostagent.setupMount: route the host-path translation through sshutil.PathForSSH (Cygwin form for Git-for-Windows / MSYS2, native forward-slash for Win32-OpenSSH), and pass sshutil.SftpServerForSSH as OpensshSftpServerBinary so the locally-spawned sftp-server agrees with the path form sshocker sends. When SftpServerForSSH returns "" (no sftp-server next to ssh, or the toolchain's cygpath cannot resolve /usr/lib/ssh/sftp-server), warn before falling through to sshocker's PATH-based auto-detect. On plain Windows this commonly indicates the OpenSSH.Server optional feature is not installed, and a silent mount failure is hard to diagnose.
Verified end-to-end on Windows 11 with QEMU 10.2.0 and only native Windows OpenSSH on PATH (no Git for Windows, no MSYS2): reverse-sshfs mounts a host directory into the guest, both sides see the same files, read and write both work, and SftpServerForSSH locates C:\Windows\System32\OpenSSH\sftp-server.exe next to ssh.exe so sshocker uses it directly. Signed-off-by: Jan Dubois <jan.dubois@suse.com>
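The native fallback for the drive-letter case can be sketched as below. This is a rough illustration, not the function added by the PR: `nativeSubsystemPath` is a hypothetical name, and the sketch only handles the absolute drive-letter form named in the commit message (no UNC, no fstab awareness).

```go
package main

import (
	"fmt"
	"strings"
)

func isDriveLetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// nativeSubsystemPath converts an absolute drive-letter path such as
// `C:\Users\jan` into the Cygwin-style form `/c/Users/jan`. Drive-relative
// inputs such as `C:foo` are rejected instead of being turned into `/cfoo`.
func nativeSubsystemPath(orig string) (string, error) {
	if len(orig) < 3 || !isDriveLetter(orig[0]) || orig[1] != ':' ||
		(orig[2] != '\\' && orig[2] != '/') {
		return "", fmt.Errorf("not an absolute drive-letter path: %q", orig)
	}
	drive := strings.ToLower(orig[:1])
	rest := strings.ReplaceAll(orig[3:], `\`, "/")
	return "/" + drive + "/" + rest, nil
}

func main() {
	for _, p := range []string{`C:\Users\jan`, "D:/work/lima", "C:foo"} {
		converted, err := nativeSubsystemPath(p)
		fmt.Printf("%-16s -> %q err=%v\n", p, converted, err)
	}
}
```

Returning an error for `C:foo` mirrors the commit's reasoning: a drive-relative path depends on the current directory of that drive, so any fixed absolute result would point somewhere unrelated.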
The experimental Windows-only env var _LIMA_WINDOWS_EXTRA_PATH prepended a user-supplied directory to PATH inside limactl. It existed to inject Git for Windows or MSYS2 binaries without altering the user shell's PATH, back when Lima required a Cygwin-style toolchain for ssh, scp, ssh-keygen, and cygpath.

After this branch's earlier commits, limactl works directly with native Windows OpenSSH and no longer needs anything from those toolchains. The variable no longer serves a purpose for the core flow, and its leading underscore signalled that no compatibility promise was made. Drop the implementation in cmd/limactl/main.go and the corresponding docs entry. An earlier commit on this branch already removed the CI invocations.

Signed-off-by: Jan Dubois <jan.dubois@suse.com>
Rewrite the "External tools" section to describe Lima's current behaviour on Windows hosts:

- The native Windows OpenSSH Client (ssh.exe, scp.exe, ssh-keygen.exe) ships by default on Windows 10 build 1803 and later, and covers the WSL2 driver. The previous "Windows doesn't ship with ssh.exe, gzip.exe, etc." bullet has been incorrect since 2018 for ssh, and is now incorrect for gzip too (Lima decompresses gzip in pure Go since the prior commits in this branch).
- sftp-server.exe is part of OpenSSH Server, an optional Feature on Demand. Only the QEMU driver's reverse-sshfs mounts need it; WSL2 does not.
- Git for Windows and MSYS2 remain supported. Lima detects when ssh is a Cygwin-based build and uses cygpath for path translation in that case, which respects any custom MSYS2 fstab. On a vanilla Windows install with neither, Lima falls back to a native conversion that handles the common drive-letter case (C:\Users\jan -> /c/Users/jan).

Document the toolchain-swap caveat (review S7): hostagent/mount.go recomputes the reverse-sshfs LocalPath via PathForSSH on every start, while defaults.go resolves the default MountPoint once at create time. A user who creates with Git for Windows on PATH and then starts without it (or vice versa) would see LocalPath change shape between restarts without warning. Sticking with one toolchain for an instance's lifetime avoids the mismatch. Persisting LocalPath at create time would fix the issue at the code layer but is out of scope.

Signed-off-by: Jan Dubois <jan.dubois@suse.com>
Move all Windows runner setup steps out of test.yml into reusable
composite actions under .github/actions/, then add two new jobs that
exercise Lima on a "plain Windows" host (no MSYS2 / Git for Windows /
Cygwin).
Composite actions (all under .github/actions/):
  windows_wsl2_setup       Enable WSL2, import dummy distro.
  windows_qemu_install     winget install QEMU.
  windows_msys2_prep       Prepend C:\msys64\usr\bin to $GITHUB_PATH and
                           install the MSYS2 packages used by
                           hack/test-templates.sh (openbsd-netcat,
                           diffutils, socat, w3m).
  windows_plain_host       Uninstall Git for Windows (Inno Setup
                           unins000.exe) and MSYS2 (Remove-Item
                           C:\msys64). Then verify the toolchain is
                           actually gone across four layers:
                             - filesystem: known install dirs absent
                             - PATH: no entry matches msys|mingw|cygwin|
                               \\Git\\(cmd|bin|usr) in process, machine,
                               or user scope
                             - smoking-gun binaries: cygpath, pacman,
                               mintty, git, git-bash, git-cmd, bash
                               (excluding System32\bash.exe which is the
                               legitimate WSL launcher), sh
                             - registry: no uninstall key with
                               DisplayName matching the same regex
                           The check is intentionally broader than just
                           "PATH scrub regex" so a future runner-image
                           change that reintroduces these via a new
                           install path fails the job loudly instead of
                           silently masking a regression.
  windows_plain_build      `go build` limactl.exe and the Linux/amd64
                           guest agent (which limactl looks up at
                           _output/share/lima/lima-guestagent.Linux-x86_64
                           when starting an instance). Avoids `make` so
                           the build does not require MSYS2 make / bash.
  windows_plain_templates  Resolve Lima's template symlinks via
                           `git ls-tree` + `git cat-file` so the
                           plain-qemu job has a working
                           _output/share/lima/templates/ tree without
                           relying on whether the checkout preserved
                           symlinks as NTFS links or 17-byte stubs.
                           Must run before windows_plain_host because
                           it needs the git CLI.
Existing jobs:
  windows -> windows-wsl2  Renamed for symmetry with the other three
                           Windows jobs. Update any branch-protection
                           required checks that reference "windows".
Both windows-wsl2 and windows-qemu now use windows_wsl2_setup /
windows_qemu_install / windows_msys2_prep instead of inlining the
setup. The test step keeps its env-var preamble (HOME_HOST,
HOME_GUEST, LIMACTL_CREATE_ARGS, MSYS2_ENV_CONV_EXCL) but no longer
references _LIMA_WINDOWS_EXTRA_PATH, which an earlier commit on this
branch removed.
New jobs:
  windows-plain-wsl2       Smoke test (create / start / shell / copy / stop
                           / delete) against templates/experimental/wsl2.yaml,
                           using only native OpenSSH from
                           C:\Windows\System32\OpenSSH, wsl.exe, and tar.
                           Dumps Lima logs on failure (ha.stdout.log,
                           ha.stderr.log, serial.log, lima.yaml, ssh.config).
  windows-plain-qemu       Same shape but against templates/default.yaml.
Signed-off-by: Jan Dubois <jan.dubois@suse.com>
Both windows-plain-wsl2 and windows-plain-qemu duplicated the smoke test (create / start / shell / copy / stop / delete) and the on-failure log dump, differing only by the template path, the LIMA_HOME subdirectory, and (potentially) the instance name. Move both steps into .github/actions/windows_plain_smoke_test/action.yml with template, instance-name, and lima-home-suffix as inputs, and call it from both jobs. Each plain job is now ~10 lines of `uses:` plus the smoke test invocation, instead of ~60 lines of inlined PowerShell. Signed-off-by: Jan Dubois <jan.dubois@suse.com>
Contributor
@jandubois Thank you for taking on this! This is actually the most correct way to have the implementation. I will try to help with review/testing.
This PR has been created with assistance by Claude Opus 4.7.
It is a refactored and cleaned up version of #4885.
This PR consists of 9 commits that I've kept separate to ease review. I believe they should not be squashed for merging, so we can keep the separate commit messages.
Summary
`limactl` now works on Windows hosts that have only the toolchain shipped in a default Windows 10/11 install (native OpenSSH, `wsl.exe`, `tar`). Lima previously required Git for Windows or MSYS2 on `PATH` for `cygpath`, `ssh`, `ssh-keygen`, `scp`, and `gzip`. After this branch, none of those external tools are required for the core flow on plain Windows.

Two coupled root causes drove the historical Git-for-Windows / MSYS2 requirement:

1. Native Windows OpenSSH does not implement SSH multiplexing (PowerShell/Win32-OpenSSH#1328). When Lima needed `ControlMaster` for the legacy ssh-based dynamic port forwarder, native ssh would not work, so Cygwin-built ssh was required.
2. Cygwin-built ssh expects Cygwin-style paths (`/c/Users/USER/...`), driving the `cygpath` dependency throughout the codebase.

Three things make the dependency droppable:

1. Lima's default port forwarder has been Go-native since v1.1.0 (`pkg/portfwd/forward.go`, gRPC-tunnelled via vsock). The legacy ssh-based forwarder is opt-in via `LIMA_SSH_PORT_FORWARDER=true`, so `ControlMaster` is not actually needed by the default flow.
2. Native Windows OpenSSH ships `sftp-server.exe` (when the OpenSSH Server optional feature is installed) and is auto-detected by `sshocker`. Reverse-sshfs works natively once the host-path translation is corrected.
3. Native Windows OpenSSH treats `-F /dev/null` as an empty config, the same way Cygwin ssh does. Lima's hardcoded `-F /dev/null` argument did not need to change.

Detection mechanism
`sshutil.IsSSHCygwin(sshExe)` checks whether `cygpath.exe` lives in the same directory as `ssh.exe` (the layout used by Git for Windows and MSYS2), after resolving symlinks so chocolatey/scoop shims do not throw the directory check off. Results are cached per resolved absolute path.

`sshutil.PathForSSH(ctx, sshExe, path)` dispatches:

- Cygwin-based ssh → sibling `cygpath` for path translation (preserves any custom MSYS2 fstab the user has configured)
- Native Windows OpenSSH → `filepath.ToSlash` (e.g. `C:/Users/USER/...`), which native `ssh`, `ssh-keygen`, `scp`, and `sftp-server` accept

`sshutil.SftpServerForSSH(ctx, sshExe)` resolves an `sftp-server` binary that matches `ssh.exe`'s toolchain, so reverse-sshfs's locally-spawned sftp-server consumes paths in the form `PathForSSH` produces. Falls back to sshocker's PATH auto-detection (with a Warn log) when no match is found.

`ioutilx.WindowsSubsystemPath` keeps `cygpath` as the preferred backend but falls back to a native drive-letter conversion (`C:\Users\USER` → `/c/Users/USER`) when `cygpath` is unavailable, so the on-disk Lima config remains identical regardless of toolchain.

CI changes
This PR introduces `.github/actions/windows_*` composite actions (host scrub, plain-host build, smoke test, WSL2 setup, QEMU install, MSYS2 prep, templates installer) and reorganizes the Windows jobs around them.

Jobs that already existed on master

`windows-wsl2` (formerly keyed `windows:`, renamed for symmetry with the new plain variant) and `windows-qemu` now build on the new composite actions instead of inlined steps. `_LIMA_WINDOWS_EXTRA_PATH` is gone from both; the toolchain they exercise is a strict subset of before.

New jobs

- `windows-plain-wsl2`: builds with `go build` (no `make`, no MSYS2 bash), uninstalls MSYS2 and Git for Windows from the Windows runner, verifies across filesystem / PATH / smoking-gun binaries / registry that the host is genuinely vanilla, then runs a PowerShell smoke test (`create` → `start` → `shell` → `copy` → `stop` → `delete`) against `templates/experimental/wsl2.yaml`.
- `windows-plain-qemu`: same shape, with QEMU installed via winget and a templates installer that resolves Lima's template symlinks from git's object store (so the action does not need Developer Mode / admin to materialize NTFS symlinks). Smoke test exercises the same lifecycle plus a guest→host copy round-trip against `templates/default.yaml`.

Both new jobs:

- Pass `--debug` to every `limactl` invocation so the workflow log captures the toolchain-detection result, ssh args, path-translation decisions, etc.
- `if: failure()` step dumps `ha.stderr.log`, `ha.stdout.log`, `serial.log`, `lima.yaml`, `ssh.config` from the instance directory.