Skip to content

Use websockets instead of SPDY for streaming#1183

Merged
Kidswiss merged 4 commits intomasterfrom
change/use_websocket_stdout
Mar 25, 2026
Merged

Use websockets instead of SPDY for streaming#1183
Kidswiss merged 4 commits intomasterfrom
change/use_websocket_stdout

Conversation

@Kidswiss
Copy link
Copy Markdown
Contributor

@Kidswiss Kidswiss commented Mar 24, 2026

Summary

This PR switches the streaming protocol for application aware backups from SPDY to WebSockets. This is how upstream kubectl exec does it and should fix truncated backups (#1109). Websocket exec is supported by default since Kubernetes 1.31.

For clusters older than 1.31 a fallback option exists --insecure-allow-podexec-spdy-fallback=true. The fallback is disabled by default because of the risk of silent data corruption.

Checklist

For Code changes

  • Categorize the PR by setting a good title and adding one of the labels:
    bug, enhancement, documentation, change, breaking, dependency
    as they show up in the changelog
  • PR contains the label area:operator
  • Commits are signed off
  • Link this PR to related issues
  • I have not made any changes in the charts/ directory.

@Kidswiss Kidswiss requested a review from a team as a code owner March 24, 2026 09:21
@Kidswiss Kidswiss added the bug Something isn't working label Mar 24, 2026
@Kidswiss Kidswiss requested review from arska and tobru and removed request for a team March 24, 2026 09:21
@Kidswiss Kidswiss requested a review from bastjan March 24, 2026 09:39
@Kidswiss Kidswiss force-pushed the change/use_websocket_stdout branch 2 times, most recently from 4cd56dd to 5cb3583 Compare March 24, 2026 09:48
Copy link
Copy Markdown
Member

@bastjan bastjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

@bastjan
Copy link
Copy Markdown
Member

bastjan commented Mar 24, 2026

I was able to trigger the issue by forcing the old SPDY streaming in kubectl:

❯ for i in 1 2 3 4 5; do KUBECTL_REMOTE_COMMAND_WEBSOCKETS=false kubectl exec "${POD:?}" -- sh -c 'seq 1 200000000' | tail -n1; done 
200000000
19
200000000
200000000
19%                                                                                                                                                                             

Which leads me to believe we should add an option to disable the fallback and not risk any silent fails.

Edit:

I propose something like an environment variable K8UP_INSECURE_ALLOW_PODEXEC_SPDY_FALLBACK with a description that it can lead to incomplete backups.

@Kidswiss
Copy link
Copy Markdown
Contributor Author

I like the idea. I'm going to add such an env var.

@Kidswiss
Copy link
Copy Markdown
Contributor Author

Kidswiss commented Mar 24, 2026

@bastjan good call with this! Because I now found a RBAC issue that would otherwise cause a fallback.

Websocket exec calls are a "GET" while SPY are a "PUT".

EDIT: I'm also going against the checkbox here and also change the chart, because without the change the e2e tests will fail.

@bastjan
Copy link
Copy Markdown
Member

bastjan commented Mar 24, 2026

@bastjan good call with this! Because I now found a RBAC issue that would otherwise cause a fallback.

RBAC my eternal "follow up fix for ..." enemy.

@Kidswiss Kidswiss force-pushed the change/use_websocket_stdout branch 2 times, most recently from e2f4952 to 446f296 Compare March 25, 2026 07:57
@Kidswiss Kidswiss requested a review from bastjan March 25, 2026 08:06
@Kidswiss
Copy link
Copy Markdown
Contributor Author

@bastjan can you have a final quick look? Had to kick the tests a bit due to the k8s version bump.

Copy link
Copy Markdown
Member

@bastjan bastjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably release this at least as a minor and add a doc page we can link to in the release.

@bastjan
Copy link
Copy Markdown
Member

bastjan commented Mar 25, 2026

Something like

diff --git a/docs/modules/ROOT/pages/explanations/system-requirements.adoc b/docs/modules/ROOT/pages/explanations/system-requirements.adoc
index f5022cfa..8097cf06 100644
--- a/docs/modules/ROOT/pages/explanations/system-requirements.adoc
+++ b/docs/modules/ROOT/pages/explanations/system-requirements.adoc
@@ -2,7 +2,9 @@
 
 == Supported Kubernetes Versions
 
-K8up (v2 or later) officially only supports recent stable Kubernetes versions.
+K8up (v2 or later) officially only supports recent stable Kubernetes versions with support for WebSocket connections in the API server (`1.31` or later).
+
+Older Kubernetes versions may work by setting `--insecure-allow-podexec-spdy-fallback=true` in the operator, but are not officially supported and support may be removed in future releases.
 
 K8up v1 (not maintained anymore) supports legacy Kubernetes clusters such as OpenShift `3.11` (Kubernetes 1.11).
 

@Kidswiss Kidswiss force-pushed the change/use_websocket_stdout branch from 4eafdf5 to 2599180 Compare March 25, 2026 11:42
This commit will try to do an upgrade to a websocket connection. If
it fails, a log will be emitted and it will fall back to a SPDY
connection.

Signed-off-by: Simon Beck <simon.beck@vshn.ch>
Websocket fallback is now disabled by default, since it's possible to
produce silent corruption.

Signed-off-by: Simon Beck <simon.beck@vshn.ch>
Signed-off-by: Simon Beck <simon.beck@vshn.ch>
Signed-off-by: Simon Beck <simon.beck@vshn.ch>
@Kidswiss Kidswiss force-pushed the change/use_websocket_stdout branch from a520428 to f4007fb Compare March 25, 2026 11:44
@Kidswiss
Copy link
Copy Markdown
Contributor Author

Thanks for the review!

I'll do a minor k8up release after the tests are done.

@Kidswiss Kidswiss enabled auto-merge March 25, 2026 11:46
@Kidswiss Kidswiss merged commit 0240a06 into master Mar 25, 2026
14 checks passed
@Kidswiss Kidswiss deleted the change/use_websocket_stdout branch March 25, 2026 11:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Application Aware Backups truncate database dumps 100M+

2 participants