Already API-evicted Pods do not get evicted by the kubelet eviction manager (memory pressure, ephemeral storage pressure) #122297

@FullyScaled

What happened?

When a pod with a long terminationGracePeriod is evicted through the API (for example during a downscale), the eviction starts and respects that long terminationGracePeriod. This works as expected.

If the node then comes under memory pressure while that API eviction is still in progress, the kubelet eviction manager does not evict the pod again with the kubelet's configured evictionMaxPodGracePeriod. As a result, a second eviction triggered by e.g. memory pressure can never take effect when the pod has a large terminationGracePeriod and an unrelated API eviction was issued earlier.

What did you expect to happen?

An eviction triggered by the eviction manager due to memory pressure should still be issued with the configured evictionMaxPodGracePeriod, even if there was a prior API-based eviction with a large terminationGracePeriod.
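
To make the expectation concrete, here is a minimal sketch of the grace period I would expect the kubelet to apply. It is a hypothetical model, not kubelet code: the function name and parameters are made up for illustration, and taking the lesser of the pod's own grace period and the kubelet's cap is my assumption about the intended behavior.

```python
from typing import Optional


def expected_eviction_grace_seconds(
    termination_grace_period: int,
    eviction_max_pod_grace_period: int,
    api_eviction_remaining: Optional[int] = None,
) -> int:
    """Hypothetical model of the expected grace period (seconds); not kubelet code."""
    # Assumption: a kubelet-initiated eviction should never wait longer
    # than the kubelet's configured cap.
    grace = min(termination_grace_period, eviction_max_pod_grace_period)
    # If an API eviction is already counting down, the shorter deadline
    # should win rather than the long one staying in force.
    if api_eviction_remaining is not None:
        grace = min(grace, api_eviction_remaining)
    return grace
```

For a pod with terminationGracePeriod 3600 and a kubelet with evictionMaxPodGracePeriod 30, this model gives 30 seconds whether or not an API eviction is already running. What I observe instead is that the pod keeps its long API eviction deadline.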

How can we reproduce it (as minimally and precisely as possible)?

  • Create a workload pod that ignores SIGTERM
  • Specify a large terminationGracePeriod for that pod and do not set any memory limits on it
  • Specify a small evictionMaxPodGracePeriod for the kubelet on the node
  • Evict the pod via the API
  • Make sure the kubelet wants to evict the pod (e.g. by allocating memory inside the pod until the node is under memory pressure)
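
The workload from the first and last steps can be sketched roughly as follows. This is not the exact program I used; the function names, the `HOG_RUN` environment variable, the chunk size and the interval are all illustrative choices. It ignores SIGTERM and keeps allocating memory so the node eventually comes under memory pressure.

```python
import os
import signal
import time
from typing import List, Optional


def ignore_sigterm() -> None:
    # With SIG_IGN the process survives SIGTERM, so it only stops when it
    # receives SIGKILL at the end of the grace period.
    signal.signal(signal.SIGTERM, signal.SIG_IGN)


def allocate(hoard: List[bytearray], chunk_mb: int) -> int:
    # Keep a reference to every zero-filled buffer so nothing is freed.
    hoard.append(bytearray(chunk_mb * 1024 * 1024))
    return sum(len(chunk) for chunk in hoard)


def run(chunk_mb: int = 64, interval_s: float = 1.0,
        max_chunks: Optional[int] = None) -> int:
    """Ignore SIGTERM and allocate memory; returns the bytes held when done."""
    ignore_sigterm()
    hoard: List[bytearray] = []
    total = 0
    # With max_chunks=None this loops until the process is killed.
    while max_chunks is None or len(hoard) < max_chunks:
        total = allocate(hoard, chunk_mb)
        print(f"holding {total // (1024 * 1024)} MiB", flush=True)
        time.sleep(interval_s)
    return total


# Hypothetical opt-in switch so importing the file never starts the loop.
if __name__ == "__main__" and os.environ.get("HOG_RUN") == "1":
    run()
```

Running this as the container command with `HOG_RUN=1` set, and with no memory limit on the pod, should let it grow until the node's eviction threshold is crossed.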

Anything else we need to know?

The bug was not present in k8s 1.25.x

Kubernetes version

1.26.7

Cloud provider

Azure

OS version

GardenLinux 934.10.0

Install tools

Container runtime (CRI) and version (if applicable)

containerd/1.6.20

Related plugins (CNI, CSI, ...) and versions (if applicable)

Metadata

Labels

  • kind/bug: Categorizes issue or PR as related to a bug.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.

Projects

Status

Done
