
eval: Add fix-rbac-wrong-resource eval #337

Merged
droot merged 1 commit into GoogleCloudPlatform:main from noahlwest:fix-rbac-resource on Jun 13, 2025

Conversation

@noahlwest
Collaborator

Adds the fix-rbac-wrong-resource eval. The goal of this eval is to cover the case where the agent must fix a misconfigured RBAC resource. In this case, the permissions are granted on the wrong resource type.
- verify.sh checks for both the expected and the denied permissions.

@mikebz mikebz requested a review from Copilot June 13, 2025 00:58

Copilot AI left a comment

Pull Request Overview

This PR introduces an eval that verifies a misconfigured RBAC resource is fixed correctly: the service account must be able to list pods in its designated namespace, while global pod listing must remain denied.

  • A new verification script (verify.sh) was created to enforce the permission checks.
  • The setup script (setup.sh) simulates the misconfigured RBAC by assigning incorrect permissions.
  • Accompanying task metadata (task.yaml) and cleanup script (cleanup.sh) support the eval flow.
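
The permission checks described above could be sketched roughly as follows. This is not the verify.sh from the PR, whose contents are not shown on this page. The namespace and service account names are hypothetical placeholders, and "global pod listing" is assumed here to mean listing pods across all namespaces. The sketch relies on `kubectl auth can-i` exiting with status 0 when the action is allowed and non-zero when it is denied.

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration; the PR page does not show the real ones.
NAMESPACE="${NAMESPACE:-rbac-eval}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-pod-reader}"
SUBJECT="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"

# can_list_pods <scope flags...>: succeeds when SUBJECT may list pods in the
# given scope. `kubectl auth can-i` exits 0 for "yes" and non-zero for "no".
can_list_pods() {
  kubectl auth can-i list pods --as="${SUBJECT}" "$@" >/dev/null 2>&1
}

# verify_rbac: returns 0 only when the namespaced check is allowed AND the
# all-namespaces check is denied (assumed meaning of "global" access).
verify_rbac() {
  if ! can_list_pods --namespace "${NAMESPACE}"; then
    echo "FAIL: ${SUBJECT} cannot list pods in ${NAMESPACE}"
    return 1
  fi
  if can_list_pods --all-namespaces; then
    echo "FAIL: ${SUBJECT} can list pods in all namespaces"
    return 1
  fi
  echo "PASS: pod listing is limited to ${NAMESPACE}"
}
```

Calling `verify_rbac` as the last command of a script would make the script's exit status reflect the result, which is one plausible way for an eval harness to consume it.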

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| k8s-bench/tasks/fix-rbac-wrong-resource/verify.sh | Verifies that the fixed RBAC permissions allow listing pods in the proper namespace and reject global access. |
| k8s-bench/tasks/fix-rbac-wrong-resource/task.yaml | Defines the eval task details including prompt and script references. |
| k8s-bench/tasks/fix-rbac-wrong-resource/setup.sh | Sets up a namespace and a misconfigured RBAC role that lists deployments instead of pods. |
| k8s-bench/tasks/fix-rbac-wrong-resource/cleanup.sh | Provides cleanup by deleting the created namespace. |
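
To illustrate the kind of misconfiguration setup.sh is described as creating, here is a minimal sketch of a Role that grants `list` on deployments, alongside the corrected form that grants it on pods. It is an assumption-laden illustration, not the PR's actual script: the role name and namespace are hypothetical, and `render_role` is a helper invented for this example. Note that deployments belong to the `apps` API group while pods belong to the core group (`""`), so a fix along these lines would likely need to change both `apiGroups` and `resources`.

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration; the PR page does not show the real ones.
NAMESPACE="${NAMESPACE:-rbac-eval}"
ROLE_NAME="${ROLE_NAME:-pod-reader-role}"

# render_role <api-group> <resource>: print a namespaced Role manifest that
# grants only the "list" verb on the given resource.
render_role() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ${ROLE_NAME}
  namespace: ${NAMESPACE}
rules:
- apiGroups: ["$1"]
  resources: ["$2"]
  verbs: ["list"]
EOF
}

# Misconfigured variant (wrong resource type), as a setup step might apply it:
#   render_role apps deployments | kubectl apply -f -
# Corrected variant an agent would be expected to converge on:
#   render_role "" pods | kubectl apply -f -
```

The sketch assumes a RoleBinding from the service account to this Role already exists; only the Role's rule changes between the broken and fixed states.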

@mikebz
Collaborator

mikebz commented Jun 13, 2025

Suggestion - maybe output what this benchmark looks like when you run it?

@droot
Member

droot commented Jun 13, 2025

> Suggestion - maybe output what this benchmark looks like when you run it?

Good idea! @noahlwest, we should include this as a step in the eval contribution guide you are working on.

@droot droot merged commit 3685f54 into GoogleCloudPlatform:main Jun 13, 2025
6 checks passed