Skip to content

Conversation

@hbisheng
Copy link
Member

@hbisheng hbisheng commented Feb 2, 2026

What problem does this PR solve?

Issue Number: Close #10214

What is changed and how does it work?

In the CMEK / encryption key rotation workflow, PD keyspace config is planned to be used to store the current and target encryption config version. Although the workflow is designed to have a single active actor, unexpected concurrent executions (for example, retries or overlapping operations) can lead to config updates overwriting each other and breaking the intended state transition.

This PR adds CAS-style guards to keyspace config updates so that encryption key rotation can advance safely and only when the system is in the expected state. With this, callers can, for example, set next_encryption_file_id only if it is absent, or advance current_encryption_file_id only if it matches next_encryption_file_id.

PD keyspace config updates are extended with optional preconditions that
must be satisfied for the update to succeed. These preconditions act as
lightweight CAS checks.

Changes:
- Extend /pd/api/v2/keyspaces/{name}/config API to accept preconditions
  for CAS-like config updates (equal / absent).
- Add pd-ctl support: keyspace update-config --expect and --expect-absent.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

Release note

None.

Summary by CodeRabbit

  • New Features

    • Conditional keyspace configuration updates via preconditions and a new API parameter to submit them.
    • CLI flags --expect and --expect-absent to specify preconditions when updating configs.
    • New distinct error for keyspace config precondition failure.
  • Bug Fixes

    • Update flow returns HTTP 409 for precondition violations and transactional conflicts.
  • Tests

    • Added extensive tests (including concurrent scenarios) and a helper to inspect non-OK responses.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 2, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has signed the dco. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 2, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 2, 2026

📝 Walkthrough

Walkthrough

Adds CAS-style preconditions to keyspace config updates: new error, new Manager API UpdateKeyspaceConfigWithPreconditions with transactional precondition checks and mutations, API/CLI plumbing to pass preconditions, and tests (including concurrent scenarios).

Changes

Cohort / File(s) Summary
Error definitions
errors.toml, pkg/errs/errno.go
Add PD:keyspace:ErrKeyspaceConfigPreconditionFailed and exported ErrKeyspaceConfigPreconditionFailed constant for precondition failures.
Core keyspace logic
pkg/keyspace/keyspace.go, pkg/keyspace/keyspace_test.go
Add UpdateKeyspaceConfigWithPreconditions and helpers (checkKeyspaceConfigPreconditions, applyKeyspaceConfigMutations, updateKeyspaceConfigTxn); refactor update flow to use transactional application. Tests validate precondition semantics.
API handler layer
server/apiv2/handlers/keyspace.go, tests/server/apiv2/handlers/keyspace_test.go
Extend UpdateConfigParams with Preconditions field; handler calls new manager API and maps precondition/txn conflicts to HTTP 409. Tests cover correctness and concurrent update races.
Test utilities
tests/server/apiv2/handlers/testutil.go
Add tryUpdateKeyspaceConfig helper to return status/body/meta for PATCH requests so tests can inspect non-OK responses.
CLI tooling & tests
tools/pd-ctl/pdctl/command/keyspace_command.go, tools/pd-ctl/tests/keyspace/keyspace_test.go
Add expect / expect-absent flags parsed into Preconditions, validate duplicates/malformed pairs, and tests exercising precondition scenarios (diff contains a duplicated test insertion).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Handler as API Handler
    participant Manager as Keyspace Manager
    participant Etcd as Etcd Store

    Client->>Handler: PATCH /keyspaces/{name}/config (Config + Preconditions)
    Handler->>Manager: UpdateKeyspaceConfigWithPreconditions(name, mutations, preconditions)
    Manager->>Manager: checkKeyspaceConfigPreconditions(currentConfig, preconditions)
    alt Preconditions match
        Manager->>Manager: build txn with mutations
        Manager->>Etcd: Commit transaction
        Etcd-->>Manager: Success
        Manager-->>Handler: Updated KeyspaceMeta
        Handler-->>Client: 200 OK + new config
    else Preconditions fail
        Manager-->>Handler: ErrKeyspaceConfigPreconditionFailed
        Handler-->>Client: 409 Conflict
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

size/XXL

Suggested reviewers

  • ystaticy
  • rleungx

Poem

🐰 I hop where configs twist and weave,
I check each key before I cleave,
If preconditions fail, I pause — not steal,
Only rightful changes shall I seal,
Hooray — no races on my field!

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding preconditions for keyspace config updates, which aligns perfectly with the PR's primary objective.
Description check ✅ Passed The PR description follows the template with Issue Number (Close #10214), explains the problem, detailed changes, includes commit message block, and covers the checklist with appropriate selections.
Linked Issues check ✅ Passed All coding requirements from issue #10214 are met: CAS-style preconditions (equal/absent checks) for keyspace config updates [#10214], pd-ctl support with --expect flags [#10214], and comprehensive unit tests [#10214].
Out of Scope Changes check ✅ Passed All changes are within scope: error definitions, core precondition logic, transactional updates, API handler modifications, and tooling/tests all directly support the CAS-style preconditions feature objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
pkg/keyspace/keyspace.go (2)

670-687: Consider whether config values in error messages could leak sensitive data.

Line 683 includes both expected and actual config values in the error message. If keyspace config ever carries sensitive values (e.g., encryption-related fields from the CMEK workflow that motivates this PR), these would be surfaced in error responses and logs.

If the config values are guaranteed non-sensitive (e.g., only key IDs, not key material), this is fine. Otherwise, consider omitting actual from the error message or redacting it.


689-701: applyKeyspaceConfigMutations relies on caller to provide a non-nil map.

This is currently safe because updateKeyspaceConfigTxn initializes the config map at line 731 before invoking the closure. However, as a standalone package-level function, a nil config would panic on OpPut (line 693). A brief doc comment noting the non-nil precondition would help future callers.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Feb 2, 2026

Codecov Report

❌ Patch coverage is 75.94937% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.65%. Comparing base (b668f43) to head (a4f2a67).
⚠️ Report is 6 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10199      +/-   ##
==========================================
+ Coverage   78.59%   78.65%   +0.05%     
==========================================
  Files         520      520              
  Lines       70014    70096      +82     
==========================================
+ Hits        55028    55132     +104     
+ Misses      11008    10993      -15     
+ Partials     3978     3971       -7     
Flag Coverage Δ
unittests 78.65% <75.94%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Changes:
- Extend /pd/api/v2/keyspaces/{name}/config PATCH with "preconditions" to support CAS-like config updates (equal / absent).
- Add pd-ctl support: keyspace update-config --expect and --expect-absent.

Rationale:
We need PD to provide a simple CAS-style guard for keyspace config updates, so concurrent key-rotation workflows can coordinate safely and only the expected state transition is allowed.

Signed-off-by: Bisheng Huang <hbisheng@gmail.com>
@hbisheng hbisheng marked this pull request as ready for review February 4, 2026 13:34
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2026
@hbisheng
Copy link
Member Author

hbisheng commented Feb 4, 2026

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Feb 4, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/apiv2/handlers/keyspace.go (1)

322-368: ⚠️ Potential issue | 🟡 Minor

Return conflicts via errorResp and document HTTP 409.
The new conflict paths should follow the handler convention (errcode + errorResp) and the Swagger annotations should include the 409 response so clients can rely on it.

✏️ Swagger annotation update
 //	`@Failure`	400	{string}	string	"The input is invalid."
+//	`@Failure`	409	{string}	string	"Precondition failed or transaction conflicted."
 //	`@Failure`	500	{string}	string	"PD server failed to proceed the request."

Based on learnings: HTTP handlers in Go must use errcode and errorResp; avoid http.Error.
As per coding guidelines: Swagger specification: run make swagger-spec (with SWAGGER=1) to regenerate; keep annotations current.

@hbisheng
Copy link
Member Author

hbisheng commented Feb 4, 2026

/retest

@rleungx
Copy link
Member

rleungx commented Feb 9, 2026

@ystaticy PTAL

if err := checkKeyspaceConfigPreconditions(meta.GetConfig(), preconditions); err != nil {
return err
}
for _, mutation := range mutations {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using a function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an applyKeyspaceConfigMutations function

@rleungx rleungx requested a review from lhy1024 February 9, 2026 10:28
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 9, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign siddontang for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Bisheng Huang <hbisheng@gmail.com>
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 9, 2026

@hbisheng: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-integration-realcluster-test 496b049 link true /test pull-integration-realcluster-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has signed the dco. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PD: Support CAS-style preconditions for keyspace config updates

2 participants