Conversation

@lhy1024
Contributor

@lhy1024 lhy1024 commented Dec 17, 2025

What problem does this PR solve?

Issue Number: Ref #9764

What is changed and how does it work?

Check List

Tests

  • Unit test

Release note

None.

Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added release-note-none, dco-signoff: yes, needs-cherry-pick-release-8.5, size/XL labels Dec 17, 2025
@codecov

codecov bot commented Dec 17, 2025

Codecov Report

❌ Patch coverage is 88.23529% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.49%. Comparing base (35b9458) to head (6e50d3d).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #10080   +/-   ##
=======================================
  Coverage   78.49%   78.49%           
=======================================
  Files         515      515           
  Lines       69236    69253   +17     
=======================================
+ Hits        54344    54358   +14     
- Misses      10948    10957    +9     
+ Partials     3944     3938    -6     
Flag Coverage Δ
unittests 78.49% <88.23%> (+<0.01%) ⬆️

@lhy1024
Contributor Author

lhy1024 commented Dec 17, 2025

/retest

@lhy1024
Contributor Author

lhy1024 commented Dec 17, 2025

/retest

// Check if region is in an affinity group that doesn't allow regular scheduling
if !r.affinityFilter.Select(region).IsOK() {
	scatterSkipAffinityCounter.Inc()
	return nil, errors.Errorf("region %d is in affinity group", region.GetID())
}
Member

Will it trigger the retry?

Contributor Author

It should behave the same as the hot-region and no-leader cases, shouldn't it?

Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added size/L and removed size/XL labels Dec 17, 2025
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added size/XL and removed size/L labels Dec 17, 2025
@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

@bufferflies @rleungx PTAL

@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

/retest

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

/retest

@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

/retest

@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

/retest


// GetRegionAffinityGroupState returns the affinity group state and isAffinity for a region.
func (m *Manager) GetRegionAffinityGroupState(region *core.RegionInfo) (group *GroupState, isAffinity bool) {
// If skipSaveCache is not set to true, InvalidCache must be called at the appropriate time to prevent stale cache entries.
Member

How about adding comments about when we need to update the cache?

Signed-off-by: lhy1024 <admin@liudos.us>
Contributor

@bufferflies bufferflies left a comment

rest lgtm

}

- ops, err := operator.CreateMergeRegionOperator("admin-merge-region", c, region, target, operator.OpAdmin|operator.OpMerge)
+ ops, err := operator.CreateMergeRegionOperator("admin-merge-region", c, region, target, operator.OpAdmin)
Contributor

Why remove the OpMerge type?

Contributor Author


// A Region may no longer exist in the RegionTree due to a merge.
// In this case, clear the cache in affinity manager for that Region and skip processing it.
if c.cluster.GetRegion(region.GetID()) == nil {
Contributor

Can it be put in line 87?

Contributor Author

The check cannot be moved before GetAndCacheRegionAffinityGroupState.

Problem with moving it earlier:

  • The check passes, then the region is deleted and HandleOverlaps calls InvalidCache. The cache entry
    does not exist yet, so the call is a no-op.
  • We then save the cache entry, which leaves a stale entry with no cleanup path.

The current approach guarantees:

  • The cache entry is saved first, then the check runs.
  • Either HandleOverlaps (when the region is deleted) or AffinityChecker (when the check fails) cleans up
    the entry.
  • At least one cleanup point always executes.
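
The ordering argument above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions, not PD's actual code: `affinityCache`, `regionTree`, `handleOverlaps`, `checkThenSave`, `saveThenCheck`, and `leaked` are hypothetical stand-ins, and the concurrent merge is modeled as a callback fired between the two steps. Only the names `InvalidCache`, `HandleOverlaps`, and `GetAndCacheRegionAffinityGroupState` come from the discussion.

```go
package main

import "fmt"

// affinityCache is a hypothetical stand-in for the affinity manager's
// per-region cache; the real structure is not shown in this thread.
type affinityCache struct {
	entries map[uint64]string
}

// save models the caching side effect of GetAndCacheRegionAffinityGroupState.
func (c *affinityCache) save(regionID uint64) { c.entries[regionID] = "group" }

// InvalidCache drops the entry for a region; it is a no-op if nothing is cached.
func (c *affinityCache) InvalidCache(regionID uint64) { delete(c.entries, regionID) }

// regionTree is a hypothetical stand-in for the cluster's RegionTree.
type regionTree struct {
	regions map[uint64]bool
}

func (t *regionTree) exists(regionID uint64) bool { return t.regions[regionID] }

// handleOverlaps models a merge removing the region: it deletes the region
// and invalidates whatever cache entry exists at that moment.
func handleOverlaps(t *regionTree, c *affinityCache, regionID uint64) {
	delete(t.regions, regionID)
	c.InvalidCache(regionID)
}

// checkThenSave is the rejected ordering: existence check first, cache write second.
// concurrent runs between the two steps to model the race.
func checkThenSave(t *regionTree, c *affinityCache, id uint64, concurrent func()) {
	if !t.exists(id) {
		c.InvalidCache(id)
		return
	}
	concurrent() // the region is merged away here; InvalidCache finds nothing to remove
	c.save(id)   // the entry is written after the only cleanup has already run
}

// saveThenCheck is the ordering described as the current approach:
// cache write first, existence check second.
func saveThenCheck(t *regionTree, c *affinityCache, id uint64, concurrent func()) {
	c.save(id)
	concurrent() // if the merge lands here, handleOverlaps removes the entry
	if !t.exists(id) {
		c.InvalidCache(id) // the checker also cleans up when the check fails
	}
}

// leaked runs one ordering against fresh state and reports whether a cache
// entry survives for a region that no longer exists.
func leaked(order func(*regionTree, *affinityCache, uint64, func())) bool {
	t := &regionTree{regions: map[uint64]bool{1: true}}
	c := &affinityCache{entries: map[uint64]string{}}
	order(t, c, 1, func() { handleOverlaps(t, c, 1) })
	_, stale := c.entries[1]
	return stale && !t.exists(1)
}

func main() {
	fmt.Println("check-then-save leaks:", leaked(checkThenSave)) // true
	fmt.Println("save-then-check leaks:", leaked(saveThenCheck)) // false
}
```

In this model the check-first ordering ends with a cached entry for a region that is gone, while the save-first ordering always passes through at least one `InvalidCache` call after the entry is written.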

@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

@bufferflies @rleungx PTAL

Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm, approved labels Dec 18, 2025
@lhy1024
Contributor Author

lhy1024 commented Dec 18, 2025

/retest

@ti-chi-bot
Contributor

ti-chi-bot bot commented Dec 18, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bufferflies, rleungx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [bufferflies,rleungx]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm labels Dec 18, 2025
@ti-chi-bot
Contributor

ti-chi-bot bot commented Dec 18, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-12-18 08:02:08 UTC: ☑️ agreed by rleungx.
  • 2025-12-18 08:29:13 UTC: ☑️ agreed by bufferflies.

@ti-chi-bot ti-chi-bot bot merged commit b53de7a into tikv:master Dec 18, 2025
31 checks passed
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Dec 18, 2025
ref tikv#9764

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #10088.
But this PR has conflicts, please resolve them!

lhy1024 added a commit to lhy1024/pd that referenced this pull request Dec 18, 2025
HunDunDM pushed a commit to HunDunDM/pd that referenced this pull request Dec 18, 2025
…c miss (tikv#10080)

ref tikv#9764

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: HunDunDM <hundundm@gmail.com>
# Conflicts:
#	pkg/schedule/operator/operator_controller.go

Labels

approved, dco-signoff: yes, lgtm, needs-cherry-pick-release-8.5, release-note-none, size/XL

5 participants