
Commit f3dd528

arska and claude committed
Fix truncated stdin backups by closing pipe before signaling done
Application-aware backups (stdin backups from pod exec) were being truncated because the pipe writer was closed after signaling done to the backup trigger. This created a race condition:

1. StreamWithContext finishes writing data to the pipe
2. done <- true signals the backup trigger to call cmd.Wait()
3. defer stdoutWriter.Close() hasn't executed yet
4. restic finishes before all data flows through the pipe

Fix: close the pipe writer before signaling done, ensuring restic's stdin receives all data and EOF before the backup command exits.

Also improved error handling: use CloseWithError on stream failure instead of os.Exit(1), propagating the error through the pipe to the reader.

Fixes #1109

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Aarno Aukia <aarno.aukia@vshn.ch>
1 parent d9f5a3b commit f3dd528

1 file changed

Lines changed: 13 additions & 7 deletions

restic/kubernetes/pod_exec.go

@@ -62,22 +62,28 @@ func PodExec(pod BackupPod, log logr.Logger) (*ExecData, error) {
 	var stdoutReader, stdoutWriter = io.Pipe()
 	done := make(chan bool, 1)
 	go func() {
-		err = exec.StreamWithContext(context.Background(), remotecommand.StreamOptions{
+		streamErr := exec.StreamWithContext(context.Background(), remotecommand.StreamOptions{
 			Stdin:  nil,
 			Stdout: stdoutWriter,
 			Stderr: logging.NewErrorWriter(log.WithName(pod.PodName)),
 			Tty:    false,
 		})
 
-		defer stdoutWriter.Close()
-		done <- true
-
-		if err != nil {
-			execLogger.Error(err, "streaming data failed", "namespace", pod.Namespace, "pod", pod.PodName)
-			// we just completely hard fail the whole backup pod
+		// Close the writer before signaling done so that the pipe reader
+		// (restic's stdin) receives all data and an EOF before the backup
+		// command is waited on. Previously, done was signaled before close,
+		// causing a race where restic could finish before all data was
+		// flushed through the pipe, resulting in truncated backups.
+		if streamErr != nil {
+			execLogger.Error(streamErr, "streaming data failed", "namespace", pod.Namespace, "pod", pod.PodName)
+			stdoutWriter.CloseWithError(streamErr)
+			done <- true
+			// Hard fail the backup pod so the Kubernetes Job is marked as failed
 			os.Exit(1)
 			return
 		}
+		stdoutWriter.Close()
+		done <- true
 	}()
 
 	data := &ExecData{
