Skip to content

Conversation

@Rakshith-R
Copy link
Contributor

@Rakshith-R Rakshith-R commented Aug 13, 2025

Describe what this PR does

This commit modifies code to consider image as not syncing when the lastSyncTime is not found.

Currently replication's Resync call fails if the lastSyncTime of an image is not found
when deciding if an image is in syncing state or not.
Instead, we should treat missing lastSyncTime as the image not being
in Syncing state.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer
    guide
    .
  • Reviewed the developer guide on Submitting a Pull
    Request
  • Pending release
    notes

    updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@mergify mergify bot added the component/rbd Issues related to RBD label Aug 13, 2025
@Rakshith-R Rakshith-R added ci/skip/e2e skip running e2e CI jobs ci/skip/multi-arch-build skip building on multiple architectures labels Aug 13, 2025
@Rakshith-R Rakshith-R requested review from a team, Madhu-1 and nixpanic August 13, 2025 14:09
Copy link
Collaborator

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit


if req.GetForce() && st.Equal(*creationTime) && !syncInfo.IsSyncing() {
// consider the image is not syncing if either
// - the last sync time was not found or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment that when last sync time is not found for example when in split brain state etc. it will help us in future why we used this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment that when last sync time is not found for example when in split brain state etc. it will help us in future why we used this check.

I've added example as when images on both peered clusters are in secondary state.

The actual check is to skip resync if the image is syncing.

The last sync time not found error can be considered equivalent to the image is not in syncing state.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added example as when images on both peered clusters are in secondary state.

I think this is wrong, because if the images are secondary state on both cluster, both will be in up+unknown state (we have check for this even before this state). This is more of an error state when not in healthy secondary state. when both are in secondary state its not possible to resync as there is no primary image.

Copy link
Contributor Author

@Rakshith-R Rakshith-R Aug 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added example as when images on both peered clusters are in secondary state.

I think this is wrong, because if the images are secondary state on both cluster, both will be in up+unknown state (we have check for this even before this state). This is more of an error state when not in healthy secondary state. when both are in secondary state its not possible to resync as there is no primary image.

resync with both images in secondary is a no-op AFAIK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added example as when images on both peered clusters are in secondary state.

I think this is wrong, because if the images are secondary state on both cluster, both will be in up+unknown state (we have check for this even before this state). This is more of an error state when not in healthy secondary state. when both are in secondary state its not possible to resync as there is no primary image.

The behavior of resync execution when both images are secondary is not changed by this pr.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you are missing my point, my request is to add a comment not to change any code flow. For example that the last sync might be missing when image is in up+error or split brain state

// consider the image is not syncing if either
// - the last sync time was not found or
// - the sync info indicates the image is not syncing
IsNotSyncing := (errors.Is(sErr, rbderrors.ErrLastSyncTimeNotFound) || !syncInfo.IsSyncing())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IsNotSyncing := (errors.Is(sErr, rbderrors.ErrLastSyncTimeNotFound) || !syncInfo.IsSyncing())
isNotSyncing := errors.Is(sErr, rbderrors.ErrLastSyncTimeNotFound) || !syncInfo.IsSyncing()

Madhu-1
Madhu-1 previously approved these changes Aug 13, 2025
Copy link
Collaborator

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its good to add a comment if not possible, this PR need to be revisited later when we get into problem to understand the reason

@nixpanic
Copy link
Member

please describe in the commit message and PR what kind of problem can be seen without this change

This commit modifies code to consider image as not syncing
when the lastSyncTime is not found.

Currently replication's Resync call fails if the
lastSyncTime of an image is not found.
Instead, we should treat missing lastSyncTime as
the image not being in Syncing state.

Signed-off-by: Rakshith R <[email protected]>
@Rakshith-R
Copy link
Contributor Author

Added a lot of comments at various steps
and describe the issue seen without this change.

PTAL

@mergify mergify bot dismissed Madhu-1’s stale review August 13, 2025 15:03

Pull request has been modified.

@Rakshith-R Rakshith-R requested review from a team and Madhu-1 August 13, 2025 15:03
@nixpanic nixpanic requested a review from a team August 13, 2025 15:15
@Madhu-1
Copy link
Collaborator

Madhu-1 commented Aug 13, 2025

@Mergifyio queue

@mergify
Copy link
Contributor

mergify bot commented Aug 13, 2025

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at fa9e254

@mergify mergify bot merged commit fa9e254 into ceph:devel Aug 13, 2025
15 checks passed
mergify bot added a commit that referenced this pull request Aug 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/skip/e2e skip running e2e CI jobs ci/skip/multi-arch-build skip building on multiple architectures component/rbd Issues related to RBD

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants