fix(code-review): strengthen step 1 gating agent reliability #31698

Open
kpatel513 wants to merge 1 commit into anthropics:main from kpatel513:fix/kp-gating
@kpatel513
Summary

Step 1 used a Haiku agent with no defined criteria to decide whether to skip a PR as "trivial". A wrong skip silently drops the entire review with no output. Three issues fixed:

  1. Wrong model: Haiku is the least capable model tier, yet this binary decision gates the entire workflow. Upgraded to Sonnet.
  2. Vague skip criteria: "trivial change that is obviously correct" is undefined, so Haiku would interpret it inconsistently. Replaced with explicit criteria.
  3. Fragile "already reviewed" check: "comments left by claude" could match any comment that merely mentions the word "claude". Changed to look for the ## Code review header that the plugin actually posts.
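
To illustrate issue 3, here is a minimal Python sketch contrasting the two checks. The gating step in this plugin is a prompt to an agent, not code, so this only shows why matching on a header is stricter than matching on a name. The function names and sample comments are hypothetical, and the sketch assumes the review comment puts the `## Code review` heading at the start of a line.

```python
import re

# Assumption: the plugin's review comment contains this Markdown heading
# at the start of a line.
REVIEW_HEADER = re.compile(r"^## Code review\b", re.MULTILINE)


def mentions_claude(comments):
    """Old-style check: true if any comment mentions 'claude' anywhere."""
    return any("claude" in body.lower() for body in comments)


def has_review_header(comments):
    """New-style check: true only if a comment contains the heading line."""
    return any(REVIEW_HEADER.search(body) for body in comments)


# Hypothetical comment threads for illustration.
casual_thread = ["Should we ask Claude to look at this later?"]
reviewed_thread = ["## Code review\n\nFound 2 issues."]

print(mentions_claude(casual_thread))      # name match fires on a casual mention
print(has_review_header(casual_thread))    # header match does not
print(has_review_header(reviewed_thread))  # header match fires on a real review
```

A casual mention trips the name-based check, while the header-based check only fires on a comment shaped like an actual review.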

Changes

plugins/code-review/commands/code-review.md

  • Step 1: Haiku → Sonnet
  • Step 1: Replace vague "trivial" description with explicit skip criteria (lock files, changelogs, generated code, whitespace-only, automated dependency bumps)
  • Step 1: Replace "comments left by claude" with check for ## Code review header

Before / After

Before:

Launch a haiku agent to check if...

  • The pull request does not need code review (e.g. automated PR, trivial change that is obviously correct)
  • Claude has already commented on this PR (check for comments left by claude)

After:

Launch a sonnet agent to check if...

  • The pull request is trivial... A PR is trivial only if it exclusively contains: auto-generated file changes, whitespace/formatting with no logic changes, or automated dependency bumps. If the PR contains any non-trivial changes alongside these, it is not trivial.
  • Claude has already reviewed this PR (check for a comment containing the ## Code review header)
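
The "exclusively contains" rule in the new wording can be sketched as an all-or-nothing check over the changed files. This is an illustration of the logic only, not how the plugin decides (that is left to the Sonnet agent). The file patterns and function names below are assumptions, and whitespace-only changes are omitted because they cannot be judged from file paths alone.

```python
from fnmatch import fnmatch

# Hypothetical patterns for files this sketch treats as trivial
# (lock files, changelogs, generated code).
TRIVIAL_PATTERNS = ["*.lock", "package-lock.json", "CHANGELOG.md", "*.generated.*"]


def is_trivial_file(path):
    """True if the file name matches one of the trivial patterns."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pattern) for pattern in TRIVIAL_PATTERNS)


def is_trivial_pr(changed_paths):
    """A PR is trivial only if every changed file is trivial.

    An empty change list is not treated as trivial, so the gate
    errs on the side of running the review.
    """
    return bool(changed_paths) and all(is_trivial_file(p) for p in changed_paths)


# A pure dependency bump is skipped; the same bump plus one logic change is not.
print(is_trivial_pr(["package-lock.json"]))
print(is_trivial_pr(["package-lock.json", "src/auth.py"]))
```

Using `all` rather than `any` is the point of the change: one non-trivial file is enough to send the whole PR to review.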

Test plan

  • Run on a Dependabot PR — should be skipped
  • Run on a PR with a Dependabot bump + one logic change — should NOT be skipped
  • Run on a PR already reviewed — should be skipped (check via ## Code review header, not author name)
  • Run on a substantive PR — should proceed to review
