
feat: wait for pods to be deleted to report version ready #839


Draft

avorima wants to merge 1 commit into master

Conversation

@avorima (Contributor) commented Jun 11, 2025

I was looking through the 1.33 release notes and saw that KEP 3973 had been included in alpha state.
I was trying to find an easy way to get this to work in #718, as noted in my comment describing the conditions, but there weren't many options. The KEP looks to be the solution I was looking for.
For context: we observed that apiserver requests sometimes failed immediately after the TCP reported "Ready" following an update. There could be several reasons for this in our own setup (I'm actually thinking about taking a closer look at our load balancers), but given that the duration of the probe failures correlates strongly with how long terminating pods are still present, I think just waiting a bit longer until things settle down is a reasonable approach.
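To sketch the idea (not necessarily how this PR ends up implementing it): the readiness check would also require that no terminating replicas remain before reporting the version as ready. The snippet below assumes Kubernetes 1.33 with the alpha `DeploymentReplicaSetTerminatingReplicas` feature gate, which per KEP-3973 should populate `.status.terminatingReplicas`, plus a controller-runtime client; the package and function names are made up for illustration.

```go
// Hypothetical package and function names, for illustration only.
package tcpreadiness

import (
	"context"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// deploymentSettled reports whether the kube-apiserver Deployment has finished
// rolling out AND no terminating pods from the previous ReplicaSet are left.
func deploymentSettled(ctx context.Context, c client.Client, key client.ObjectKey) (bool, error) {
	var deploy appsv1.Deployment
	if err := c.Get(ctx, key, &deploy); err != nil {
		return false, fmt.Errorf("getting deployment: %w", err)
	}

	// Usual rollout checks: every replica updated and available.
	desired := int32(1)
	if deploy.Spec.Replicas != nil {
		desired = *deploy.Spec.Replicas
	}
	if deploy.Status.UpdatedReplicas < desired || deploy.Status.AvailableReplicas < desired {
		return false, nil
	}

	// KEP-3973 (alpha in 1.33, behind DeploymentReplicaSetTerminatingReplicas):
	// count of pods that are still terminating. A nil pointer means the feature
	// gate is off, in which case we fall back to the checks above.
	if tr := deploy.Status.TerminatingReplicas; tr != nil && *tr > 0 {
		return false, nil
	}
	return true, nil
}
```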


netlify bot commented Jun 11, 2025

Deploy Preview for kamaji-documentation canceled.

🔨 Latest commit: e1aed40
🔍 Latest deploy log: https://app.netlify.com/projects/kamaji-documentation/deploys/6849f7a06e24c30008d12375

@prometherion (Member) commented

> we observed that apiserver requests sometimes failed immediately after the TCP reported "Ready" following an update

Could this be related to the underlying EndpointSlice not being updated yet, and stale local iptables rules still sending traffic to the old pods?

@avorima (Contributor, Author) commented Jun 12, 2025

> we observed that apiserver requests sometimes failed immediately after the TCP reported "Ready" following an update

> Could this be related to the underlying EndpointSlice not being updated yet, and stale local iptables rules still sending traffic to the old pods?

Yes, that could also be the case. It's not that common, but it happens often enough when testing a high volume of updates.

@prometherion (Member) commented

Back in the day, to avoid these minor edge cases, I added a preStop hook to the Pods, something like `sleep X`, where X covers the lag in the EndpointSlice update.

However, that solution wouldn't be feasible here, since the kube-apiserver is a single-binary container with no bash support.

As a workaround, if you're using an Ingress Controller such as the HAProxy one, the `option redispatch` setting in the backend could do the trick.
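For reference, the preStop trick looks roughly like the sketch below, expressed with the `corev1` types; the 10-second sleep is just a placeholder for the observed EndpointSlice propagation lag, and the container name and image are hypothetical.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// preStop "sleep" workaround: delay container shutdown so EndpointSlices
	// (and node-local iptables rules) have time to drop the pod before it
	// actually exits. This assumes a `sleep` binary exists in the image, which
	// the kube-apiserver image does not provide.
	container := corev1.Container{
		Name:  "example",           // hypothetical
		Image: "example:latest",    // hypothetical
		Lifecycle: &corev1.Lifecycle{
			PreStop: &corev1.LifecycleHandler{
				Exec: &corev1.ExecAction{
					Command: []string{"sleep", "10"},
				},
			},
		},
	}
	fmt.Printf("preStop command: %v\n", container.Lifecycle.PreStop.Exec.Command)
}
```

If memory serves, newer Kubernetes releases also added a native `sleep` lifecycle handler (`corev1.SleepAction`) that doesn't need any binary in the image, but I haven't checked whether it would fit here.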
