external/server: close driver.stderr.log on Stop so Windows can delete it#4959
Open
mn-ram wants to merge 1 commit into
Open
external/server: close driver.stderr.log on Stop so Windows can delete it#4959mn-ram wants to merge 1 commit into
mn-ram wants to merge 1 commit into
Conversation
…e it
On Windows, `limactl rm --force` failed for instances using an external
driver (wsl2, qemu, vz, krunkit) with:
failed to remove "C:\\Users\\…\\.lima\\<inst>": remove
C:\\Users\\…\\.lima\\<inst>\\driver.stderr.log:
The process cannot access the file because it is being used by another process.
Root cause: server.Start opens driver.stderr.log via os.OpenFile and hands
it to the external-driver subprocess as cmd.Stderr, but the *os.File
handle in the parent (limactl) was never closed. POSIX is happy to delete
files that are still open in some process; Windows is not — any open
HANDLE blocks DeleteFile.
server.Stop sent SIGTERM but did not Wait() the subprocess or close the
inherited log file, so by the time limactl proceeded to remove the
instance directory the parent still held the handle open.
Fix:
* Store the *os.File on ExternalDriver as LogFile so Stop can find it.
* In Stop, after SIGTERM, Wait() the subprocess (reaping the child also
detaches its inherited handle on Windows) and then explicitly Close
the LogFile in the parent.
* In Start, register a defer that closes LogFile on any failure path
between OpenFile and the moment ownership is transferred to
extDriver; disarm it once extDriver.LogFile is set. Without this,
every failed start leaked the FD.
Closes: lima-vm#3736
Signed-off-by: mn-ram <235066282+mn-ram@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #3736.
Symptom (from the issue)
On Windows, after stopping a WSL2 instance,
limactl rm --force <inst>fails with:Root cause
server.Startinpkg/driver/external/server/server.goopensdriver.stderr.logviaos.OpenFileand hands the*os.Fileto the external-driver subprocess ascmd.Stderr:The handle is never closed in the parent (limactl). On POSIX,
unlinkis fine while an FD is held open; on Windows, any open HANDLE blocksDeleteFile.server.StopsentSIGTERMbut neverWait-ed the subprocess and never closed the inherited log file, so by the time the cleanup code inlimactl rm --forcetried to delete the instance directory, the parent process still held the file open and Windows refused to delete it.Affects every external driver on Windows (wsl2, qemu, vz, krunkit), but the user-visible damage is worst on wsl2 because that's the Windows default.
Fix
LogFile *os.Filetoregistry.ExternalDriversoStopcan find the handle.Stop, after sendingSIGTERM, callcmd.Wait()(so the kernel reaps the child and detaches its end of the inherited handle), then explicitlyClose()the parent's handle.Start, add adeferthat closeslogFileon every failure path betweenOpenFileand the success-path assignment toextDriver.LogFile; the defer is disarmed (logFile = nil) once ownership is transferred. Without this, every failedStartwas leaking the FD.+32 / 0, two files. No API change, no architecture change.
Test plan
gofmt -l pkg/driver/external/server/server.go pkg/registry/registry.go— cleango vet ./pkg/driver/external/... ./pkg/registry/...— cleango build ./pkg/driver/external/... ./pkg/registry/... ./cmd/limactl/...— cleanGOOS=windows GOARCH=amd64 go build ./pkg/driver/external/... ./pkg/registry/...— cleango test ./pkg/registry/...— pass