add CPU request validation in NRI CreateContainer hook by AutuSnow · Pull Request #110 · kubernetes-sigs/dra-driver-cpu

AutuSnow · 2026-04-06T15:56:35Z

Validation rules:

If a container CPU request is specified, it must exactly match the claim allocation.
Pod-level resources (PLR) validation is a placeholder for future implementation.
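
For readers skimming the thread, the first rule can be sketched roughly as below. This is a minimal illustration assembled from the snippets quoted in the review (the shares-to-CPU division by 1024, the `minCPUShares` floor of 2, the 0.01 tolerance); the function name `validateCPURequest` and the exact error wording are assumptions, not the PR's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// Values taken from the hunks quoted in the review; the names are illustrative.
const (
	minCPUShares = uint64(2) // shares value seen when no CPU request is set
	sharesPerCPU = 1024.0    // cgroup CPU shares that correspond to one full CPU
	tolerance    = 0.01      // allowed gap between request and claim, in CPUs
)

// validateCPURequest (hypothetical name) returns an error when an explicit
// CPU request, expressed as cgroup shares, does not match the claimed CPUs.
func validateCPURequest(shares uint64, claimCPUs int) error {
	if shares <= minCPUShares {
		return nil // no explicit request: validation is skipped
	}
	request := float64(shares) / sharesPerCPU
	if math.Abs(request-float64(claimCPUs)) > tolerance {
		return fmt.Errorf("CPU request validation failed: request %.3f CPUs, claim allocates %d",
			request, claimCPUs)
	}
	return nil
}

func main() {
	fmt.Println(validateCPURequest(2048, 2)) // request matches the claim
	fmt.Println(validateCPURequest(1024, 2)) // request is smaller than the claim
	fmt.Println(validateCPURequest(2, 2))    // no request set, check skipped
}
```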

k8s-ci-robot · 2026-04-06T15:56:43Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: AutuSnow
Once this PR has been reviewed and has the lgtm label, please assign klueska for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

AutuSnow · 2026-04-12T08:31:01Z

/cc @pravk03 @ffromani

ffromani

Thanks! I see the point here and I'm supportive of this change. Mostly improvement suggestions inline.

ffromani · 2026-04-13T11:45:18Z

@@ -0,0 +1,75 @@
+/*
+Copyright The Kubernetes Authors.


Do we want to have a pkg/validate or pkg/driver/validate (sub)package, and make this API public?
Perhaps it is time to start our own internal hierarchy?

I'm actually asking. I don't have strong objections to keeping this code here, except that the validation function is (correctly) private and we should try not to test private functions directly.

I had actually been thinking about introducing a new package level in the project, but because version 0.1 was still in development I kept adding and modifying files at the existing level.

I think we should keep the current structure: this is the first validation function, and it is unclear how much validation logic will be added later. If more validations (memory, devices, etc.) are added, they can be refactored into a pkg/validation package (if shared among multiple packages).
Regarding testing private functions: I believe that for pure-logic validation functions, unit testing private functions is reasonable, because the validation logic is complex and needs detailed unit test coverage. The E2E tests already cover the integration scenarios, and testing only through the public API would make the test cases too complex.

We may end up accepting tests of private functions, but as a general rule I want to push hard against testing private functions directly (yes, this means there's some tech debt to clear over time). Each case should be a documented exception, not a habit we gradually develop.
So let's think a bit harder indeed. Perhaps we can turn the return value of parseDRAEnvToClaimAllocations into a proper type and add a Validate method to it containing the current logic?
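
One possible shape for that suggestion is sketched below. The type name `ClaimAllocations`, its underlying map layout, and the `Validate` signature are all assumptions made for illustration; only `parseDRAEnvToClaimAllocations` and `TotalCPUs` appear in the PR hunks, and their real definitions may differ.

```go
package main

import (
	"fmt"
	"math"
)

// ClaimAllocations is a hypothetical named type for the value returned by
// parseDRAEnvToClaimAllocations: claim identifier -> allocated CPU IDs.
type ClaimAllocations map[string][]int

// TotalCPUs counts the CPUs allocated across all claims.
func (ca ClaimAllocations) TotalCPUs() int {
	total := 0
	for _, cpus := range ca {
		total += len(cpus)
	}
	return total
}

// Validate checks an explicit container CPU request (as cgroup shares)
// against the claimed CPUs. Being exported, it can be exercised from an
// external test package without reaching into private helpers.
func (ca ClaimAllocations) Validate(containerShares uint64) error {
	const (
		minCPUShares = 2    // no explicit CPU request
		tolerance    = 0.01 // value used in the PR hunk
	)
	if containerShares <= minCPUShares {
		return nil
	}
	request := float64(containerShares) / 1024.0
	if math.Abs(request-float64(ca.TotalCPUs())) > tolerance {
		return fmt.Errorf("CPU request %.3f does not match %d claimed CPUs",
			request, ca.TotalCPUs())
	}
	return nil
}

func main() {
	ca := ClaimAllocations{"claim-a": {0, 1}, "claim-b": {2, 3}}
	fmt.Println(ca.TotalCPUs(), ca.Validate(4096), ca.Validate(1024))
}
```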

ffromani · 2026-04-13T11:46:08Z

 		klog.Infof("No guaranteed CPUs found in DRA env for pod %s/%s container %s. Using shared CPUs %s", pod.Namespace, pod.Name, ctr.Name, sharedCPUs.String())
 		adjust.SetLinuxCPUSetCPUs(sharedCPUs.String())
 	} else {
+		// Validate CPU requests match claim allocations


The comment restates in English what the already explicit code does on the line below, so I'd remove it or repurpose it to explain the "why", if it deserves an explanation at all.

ffromani · 2026-04-13T11:46:50Z

+		// minCPUShares is the Kubernetes minimum for best-effort containers (no CPU request).
+		// Shares == 2 means no explicit CPU request was set; skip validation in that case.
+		const minCPUShares = uint64(2)


this should be a file-level constant, it is important enough.

ffromani · 2026-04-13T11:47:30Z

+			containerCPUShares := ctr.Linux.Resources.Cpu.Shares.Value
+			containerCPURequest := float64(containerCPUShares) / 1024.0
+
+			const tolerance = 0.01


Likewise. And how was this value computed? If it's just our first educated guess, fine, but let's document that explicitly.
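
As background for the tolerance question, the sketch below estimates how much error the milli-CPU to shares conversion alone can introduce. `milliCPUToShares` is a simplified reimplementation of the kubelet formula (shares = milliCPU * 1024 / 1000, with a floor of 2); it omits the upper cap and is not the kubelet's code. It does not show how 0.01 was chosen in the PR, only that the integer truncation contributes less than 0.001 CPU, so 0.01 leaves a wide margin.

```go
package main

import (
	"fmt"
	"math"
)

const (
	sharesPerCPU  = 1024
	milliCPUToCPU = 1000
	minShares     = 2
)

// milliCPUToShares is a simplified version of the kubelet conversion from a
// milli-CPU request to cgroup CPU shares (the upper cap is left out).
func milliCPUToShares(milliCPU int64) uint64 {
	if milliCPU == 0 {
		return minShares
	}
	shares := (milliCPU * sharesPerCPU) / milliCPUToCPU
	if shares < minShares {
		return minShares
	}
	return uint64(shares)
}

// worstRoundTripError converts every request from 1m to maxMilliCPU into
// shares and back, and returns the largest difference in CPUs.
func worstRoundTripError(maxMilliCPU int64) float64 {
	worst := 0.0
	for m := int64(1); m <= maxMilliCPU; m++ {
		requested := float64(m) / milliCPUToCPU
		recovered := float64(milliCPUToShares(m)) / sharesPerCPU
		if d := math.Abs(requested - recovered); d > worst {
			worst = d
		}
	}
	return worst
}

func main() {
	fmt.Printf("worst round-trip error up to 64 CPUs: %.6f CPUs\n", worstRoundTripError(64000))
}
```

Whole-CPU requests convert exactly (1000m becomes 1024 shares), so the tolerance only matters for fractional requests close to an integer.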

ffromani · 2026-04-13T11:49:03Z

+	if pod.Linux != nil && pod.Linux.PodResources != nil && pod.Linux.PodResources.Cpu != nil {
+		if pod.Linux.PodResources.Cpu.Shares != nil && pod.Linux.PodResources.Cpu.Shares.Value > 0 {
+			podLevelCPUShares := pod.Linux.PodResources.Cpu.Shares.Value
+			podLevelCPURequest := float64(podLevelCPUShares) / 1024.0
+
+			klog.V(4).InfoS("pod has pod-level CPU request",
+				"namespace", pod.Namespace,
+				"pod", pod.Name,
+				"podLevelCPURequest", podLevelCPURequest,
+				"container", ctr.Name,
+				"claimCPUs", totalClaimCPUs,
+			)
+		}
+	}


Is this just logging? If so, why are we doing it inside the validation? Should it be its own little function?

Good question! The purpose of this code is as follows.

Current status: this is a placeholder that logs the presence of Pod Level Resources (PLR), in preparation for future PLR validation.

Why it is inside the validation function: the function already accesses the pod and container resource information, so logging here avoids a second traversal.

Why it is just a log: per the PR description, PLR validation is a "placeholder for future implementation".

ffromani · 2026-04-13T11:50:23Z

+		},
+	}
+
+	for _, tc := range tests {


The code has tests which exercise significant differences. It would be nice to add tests which exercise the smallest difference and any possible edge cases (I can't really think of anything atm, but I haven't tried hard enough yet).
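
A rough idea of what such boundary cases could look like, written as a table in the style of the existing tests. `requestMatchesClaim` is a stand-in for the PR's private validation function, and the sketch assumes a mismatch is reported only when the difference is strictly greater than the 0.01 tolerance; the real comparison may differ.

```go
package main

import (
	"fmt"
	"math"
)

const (
	minCPUShares = uint64(2)
	tolerance    = 0.01
)

// requestMatchesClaim is a stand-in for the validation logic under test.
func requestMatchesClaim(shares uint64, claimCPUs int) bool {
	if shares <= minCPUShares {
		return true // no explicit request, validation skipped
	}
	return math.Abs(float64(shares)/1024.0-float64(claimCPUs)) <= tolerance
}

type boundaryCase struct {
	name      string
	shares    uint64
	claimCPUs int
	want      bool
}

// boundaryCases lists the share values right at the edges of the tolerance
// window around a 2-CPU claim, plus the edges of the "no request" rule.
func boundaryCases() []boundaryCase {
	return []boundaryCase{
		{"exact match", 2048, 2, true},
		{"largest accepted shortfall (1.9902 CPUs)", 2038, 2, true},
		{"smallest rejected shortfall (1.9893 CPUs)", 2037, 2, false},
		{"largest accepted excess (2.0098 CPUs)", 2058, 2, true},
		{"smallest rejected excess (2.0107 CPUs)", 2059, 2, false},
		{"smallest explicit request", 3, 1, false},
		{"shares at the minimum", 2, 2, true},
		{"shares reported as zero", 0, 2, true},
	}
}

// failedCases returns the names of the cases whose result differs from want.
func failedCases() []string {
	var failed []string
	for _, tc := range boundaryCases() {
		if got := requestMatchesClaim(tc.shares, tc.claimCPUs); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", failedCases())
}
```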

ffromani

added missing notes about the e2e tests

ffromani · 2026-04-13T12:46:59Z

+			ginkgo.By("waiting for pod to fail with CreateContainerError")
+			gomega.Eventually(ctx, func(ctx context.Context) (*v1.Pod, error) {
+				return fxt.K8SClientset.CoreV1().Pods(fxt.Namespace.Name).Get(ctx, pod.Name, metav1.GetOptions{})
+			}).
+				WithTimeout(2*time.Minute).
+				WithPolling(5*time.Second).
+				Should(BeFailedToCreate(fxt), "pod should fail to create container")
+		})


the test LGTM, but I wonder if we have a documented API (not the reason/error message text format) to catch this specific rejection

Thank you for the reminder. I understand your concern. CreateContainerError is not a formally documented constant in the Kubernetes API; it is a string value that the kubelet sets dynamically at runtime. In the Kubernetes ecosystem, however, it is a de facto standard: it is the reason value the kubelet uses when container creation fails. If a more robust solution is needed, the test can additionally check for specific error text in the Message field:

if strings.Contains(cntSt.State.Waiting.Message, "CPU request validation failed") { xxxxx }
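
A slightly fuller sketch of that combined check follows. To stay self-contained it uses small local structs that mirror only the relevant fields of the Kubernetes container status (State.Waiting.Reason and State.Waiting.Message); the helper name `rejectedByCPUValidation` is hypothetical, and the message substring is the one quoted above, which is not a stable API.

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins for the few container status fields the check needs.
type containerStateWaiting struct {
	Reason  string
	Message string
}

type containerState struct {
	Waiting *containerStateWaiting
}

type containerStatus struct {
	Name  string
	State containerState
}

// rejectedByCPUValidation (hypothetical helper) reports whether any container
// is waiting with CreateContainerError and a message that mentions the
// driver's validation failure.
func rejectedByCPUValidation(statuses []containerStatus) bool {
	for _, st := range statuses {
		w := st.State.Waiting
		if w == nil {
			continue // container is running or terminated
		}
		if w.Reason == "CreateContainerError" &&
			strings.Contains(w.Message, "CPU request validation failed") {
			return true
		}
	}
	return false
}

func main() {
	statuses := []containerStatus{{
		Name: "app",
		State: containerState{Waiting: &containerStateWaiting{
			Reason:  "CreateContainerError",
			Message: "NRI plugin rejected container: CPU request validation failed",
		}},
	}}
	fmt.Println(rejectedByCPUValidation(statuses))
}
```

Matching on both the reason and the message narrows the assertion to this specific rejection, at the cost of coupling the test to error text that may change.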

ffromani · 2026-04-13T12:48:17Z

+	ginkgo.Context("without CPU requests specified", func() {
+		ginkgo.It("should successfully create container", func(ctx context.Context) {
+			fxt := rootFxt.WithPrefix("no-request")
+			gomega.Expect(fxt.Setup(ctx)).To(gomega.Succeed())
+			ginkgo.DeferCleanup(fxt.Teardown)
+
+			claimCPUs := int64(1)
+
+			ginkgo.By("creating a ResourceClaim")
+			claim := makeResourceClaim(fxt.Namespace.Name, "test-claim", claimCPUs, cpuDeviceMode)
+			claim, err := fxt.K8SClientset.ResourceV1().ResourceClaims(fxt.Namespace.Name).Create(ctx, claim, metav1.CreateOptions{})
+			gomega.Expect(err).ToNot(gomega.HaveOccurred())
+
+			ginkgo.By("creating a Pod without CPU request")
+			pod := makePodWithClaim(fxt.Namespace.Name, "test-pod", claim.Name, nil, nil)
+			pod, err = fxt.K8SClientset.CoreV1().Pods(fxt.Namespace.Name).Create(ctx, pod, metav1.CreateOptions{})
+			gomega.Expect(err).ToNot(gomega.HaveOccurred())
+
+			ginkgo.By("waiting for pod to be running")
+			err = e2epod.WaitToBeRunning(ctx, fxt.K8SClientset, pod.Namespace, pod.Name)
+			gomega.Expect(err).ToNot(gomega.HaveOccurred(), "pod should be running")


I see why you added this, but I'm wondering whether this should actually fail, which would change the current driver behavior.

Thanks for the question about the "without CPU requests specified" test case.
After discussion with @pravk03 , we decided to keep the validation lightweight in the CreateContainer hook:

Validate when CPU request is set: Enforce that container.resources.requests.cpu matches the claim allocation

Skip validation when no CPU request: Allow best-effort containers with claims (shares=2) to pass through

Rationale: Avoid making CreateContainer too heavy. Full PLR validation will be handled by the scheduler (KEP-5517). The driver serves as a "final line of defense" for the most common misconfiguration (explicit request mismatch).
This approach balances validation coverage with performance. WDYT?

pravk03 · 2026-04-13T17:04:36Z

+	totalClaimCPUs := ca.TotalCPUs()
+
+	if ctr.Linux != nil && ctr.Linux.Resources != nil && ctr.Linux.Resources.Cpu != nil {
+		if ctr.Linux.Resources.Cpu.Shares != nil && ctr.Linux.Resources.Cpu.Shares.Value > minCPUShares {


Let's also document that if shares are not set at the container level, validation will pass. This is a valid scenario when pod-level resources are set.

pravk03 · 2026-04-13T17:27:52Z

+		}
+	}
+
+	if pod.Linux != nil && pod.Linux.PodResources != nil && pod.Linux.PodResources.Cpu != nil {


I think this is incorrect. From reviewing the kubelet code, this check doesn't strictly confirm that Pod Level Resources (PLR) are enabled. If PLR isn't specified, the kubelet defaults this value to the sum of container requests.

I wonder if there is a way to distinguish explicit PLR set in the pod. If we can, we could add an additional validation step - fail the validation if neither a container-level request nor a pod-level resource is specified.

From the NRI hook, pod.Linux.PodResources is always populated by the kubelet, either from explicit PLR or as the sum of container requests (when the PLR feature gate is disabled). There is no reliable way to distinguish the two at this layer. The current approach skips validation when container-level shares are not set (shares <= 2), which correctly handles both best-effort containers and PLR scenarios. Full PLR validation is deferred to the scheduler per KEP-5517. I removed the incorrect PLR detection block accordingly.

AutuSnow · 2026-04-14T13:43:58Z

/retest

Signed-off-by: qiuxue <liuyutao36@gmail.com>

AutuSnow · 2026-04-15T15:08:48Z

/test pull-dra-driver-cpu-e2e-device-mode-grouped-arm64
/test pull-dra-driver-cpu-e2e-device-mode-individual-arm64

pravk03 · 2026-04-17T00:29:13Z

I've been thinking a bit more about this. Since KEP-5517 is currently in alpha, I'm concerned that the new validation introduced in this PR might prevent experimentation when the alpha feature gate is enabled.

Initially, I was hoping we could include some validation for Pod Level Resources—specifically checking that the pod-level budget is at least equal to the sum of container-level standard requests + DRA claims. However, during the PR's implementation and review, it has become clear that we can't reliably perform these pod-level validations.

Given this, I'm wondering if we should hold off on adding this validation in code for now, and instead focus on improving our documentation around workload requirements (both with and without KEP-5517)?

@AutuSnow, I know we previously discussed having this validation and I was initially on board with it, but I hadn't fully thought through the KEP-5517 implications at the time. Sorry for the back and forth here!

I would love to hear your thoughts on this @AutuSnow and @ffromani.

/hold

ffromani · 2026-04-22T16:25:24Z

My very initial thought is that the conflict between validation and KEP-5517 largely depends on the version skew we allow and support. IOW, which versions of the driver are compatible with which Kubernetes versions?
If this validation had been merged in time for 0.1.0, the merge process would have been much more straightforward, I reckon, because the (implicit) pairing would have been kube 1.35 with driver 0.1.0.

I think it is worth clarifying the interactions we expect. We probably just need a simple table (e.g. kube <= 1.35 -> driver 0.1.0) and/or a few version ranges.

That said, I'll deep dive in the PR and provide more informed comments.

AutuSnow · 2026-04-24T01:43:23Z

This question has been debated for several days. Can we add a flag that enables CPU request validation (default true) and can be turned off during KEP-5517 experiments, so that we preserve the safety net for regular scenarios?
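
A minimal sketch of such a switch using the Go standard library `flag` package. The flag name `enable-cpu-request-validation`, the `options` struct, and the `parseOptions` helper are illustrative assumptions; the driver's real flag handling is not shown in this thread and may use a different mechanism.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the hypothetical validation switch.
type options struct {
	enableCPURequestValidation bool
}

// parseOptions parses command-line style arguments into options.
// Validation defaults to enabled so regular deployments keep the check.
func parseOptions(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("dra-driver-cpu", flag.ContinueOnError)
	fs.BoolVar(&opts.enableCPURequestValidation, "enable-cpu-request-validation", true,
		"validate container CPU requests against claim allocations in CreateContainer")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return opts, nil
}

func main() {
	// Default: validation stays on.
	opts, err := parseOptions(nil)
	fmt.Println(opts.enableCPURequestValidation, err)

	// Opt out, for example while experimenting with the KEP-5517 feature gate.
	opts, err = parseOptions([]string{"--enable-cpu-request-validation=false"})
	fmt.Println(opts.enableCPURequestValidation, err)
}
```

The hook would then consult the option before running the check and skip validation entirely when it is false.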

k8s-ci-robot · 2026-05-05T05:48:11Z

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 6, 2026

k8s-ci-robot requested review from johnbelamaric and pravk03 April 6, 2026 15:56

k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 6, 2026

AutuSnow changed the title ~~add CPU request validation in NRI CreateContainer hook~~ [WIP] add CPU request validation in NRI CreateContainer hook Apr 6, 2026

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 6, 2026

AutuSnow force-pushed the feat/add_req_validation branch 6 times, most recently from 3f4ac1e to 6564d4f Compare April 12, 2026 08:19

AutuSnow changed the title ~~[WIP] add CPU request validation in NRI CreateContainer hook~~ add CPU request validation in NRI CreateContainer hook Apr 12, 2026

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 12, 2026

ffromani reviewed Apr 13, 2026

AutuSnow force-pushed the feat/add_req_validation branch 2 times, most recently from b4cdc44 to a59ffbd Compare April 13, 2026 15:12

pravk03 reviewed Apr 13, 2026

AutuSnow force-pushed the feat/add_req_validation branch 2 times, most recently from e0b4db8 to 96f2585 Compare April 14, 2026 12:59

add CPU request validation in NRI CreateContainer hook

c9ff84b

Signed-off-by: qiuxue <liuyutao36@gmail.com>

AutuSnow force-pushed the feat/add_req_validation branch from 96f2585 to c9ff84b Compare April 14, 2026 14:17

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 17, 2026

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 5, 2026
