[VMOwnedVolumes][Draft] State based guarding and failure recovery handling #4059
Open
deepakkinni wants to merge 5 commits into
Signed-off-by: Deepak Kinni <deepak.kinni@broadcom.com>
Signed-off-by: Deepak Kinni <deepak.kinni@broadcom.com>
…pshot Deletion, Revert Re-adoption Signed-off-by: Deepak Kinni <deepak.kinni@broadcom.com>
…ndling Signed-off-by: Deepak Kinni <deepak.kinni@broadcom.com>
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: deepakkinni
Signed-off-by: Deepak Kinni <deepak.kinni@broadcom.com>
PR needs rebase.
What this PR does / why we need it:
Phase 3 — Workflow C: Detach, CSI Side (Section 9.3)
vCenter Snapshot Tree Query (C.6) — Implement a helper that fetches mo:VirtualMachine { snapshot }, batch-fetches config.hardware.device for all snapshot MoRefs, and builds a set of diskUUIDs still referenced by at least one remaining vCenter snapshot.
Detach Reconciliation: Mark Snapshot-Retained (C.6.a) — When snapshot tree query shows the disk is still referenced, transition CVI to VM_MANAGED + vmName="" (snapshot-retained), patch PVC label to retained-by-snapshot, and remove the volume from BA status.
Detach Reconciliation: Re-Register as FCD (C.7) — When no vCenter snapshot references the disk, reconstruct CNS metadata from PVC/PV/CVI, call CnsCreateVolume to re-register the FCD, transition CVI to CSI_MANAGED, remove cvi-protection finalizer, label PVC csi-owned, and clean up BA status.
Metadata Reconstruction for Re-Registration (Section 9.4) — Implement the metadata reconstruction utility that reads PVC labels/annotations, PV name, and CSI config (ClusterID, SupervisorID) to build the CnsCreateVolume metadata payload used by both C.7 and D.6.
Revert-Induced Drop Handling (C.5/C.6) — Detect volumes with BA condition VolumeDetached=True, reason=DroppedBySnapshotRevert, ensure CVI is set to TRANSFERRING_TO_CSI if not already, and proceed through the same C.6 snapshot-tree-query and branching logic.
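The C.6 / C.6.a / C.7 branching above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes the snapshot tree has already been fetched from vCenter and flattened into plain structs, and every Go type, function, and field name here (`SnapshotDisks`, `RetainedDiskUUIDs`, `DecideDetach`, `DetachOutcome`) is hypothetical. Only the state names and the PVC label values are taken from the description.

```go
package main

import "fmt"

// CVIState mirrors the ownership states named in the PR description.
// The Go type and constant names are hypothetical.
type CVIState string

const (
	CSIManaged        CVIState = "CSI_MANAGED"
	VMManaged         CVIState = "VM_MANAGED"
	TransferringToCSI CVIState = "TRANSFERRING_TO_CSI"
)

// SnapshotDisks is an assumed, already-fetched view of one vCenter snapshot:
// its MoRef value and the disk UUIDs found in config.hardware.device.
type SnapshotDisks struct {
	MoRef     string
	DiskUUIDs []string
}

// RetainedDiskUUIDs builds the set of disk UUIDs still referenced by at
// least one remaining snapshot (C.6).
func RetainedDiskUUIDs(snaps []SnapshotDisks) map[string]struct{} {
	retained := make(map[string]struct{})
	for _, s := range snaps {
		for _, uuid := range s.DiskUUIDs {
			retained[uuid] = struct{}{}
		}
	}
	return retained
}

// DetachOutcome describes what the reconciler would do next for one disk.
type DetachOutcome struct {
	NextState       CVIState
	VMName          string // stays "" in the snapshot-retained case
	PVCLabel        string
	ReRegisterFCD   bool // true when CnsCreateVolume would be called (C.7)
	RemoveFinalizer bool // true when cvi-protection would be removed (C.7)
}

// DecideDetach picks the C.6.a branch when the disk is still referenced by
// a snapshot, and the C.7 branch otherwise.
func DecideDetach(diskUUID string, retained map[string]struct{}) DetachOutcome {
	if _, ok := retained[diskUUID]; ok {
		return DetachOutcome{NextState: VMManaged, VMName: "", PVCLabel: "retained-by-snapshot"}
	}
	return DetachOutcome{
		NextState:       CSIManaged,
		PVCLabel:        "csi-owned",
		ReRegisterFCD:   true,
		RemoveFinalizer: true,
	}
}

func main() {
	snaps := []SnapshotDisks{
		{MoRef: "snapshot-101", DiskUUIDs: []string{"disk-a", "disk-b"}},
		{MoRef: "snapshot-102", DiskUUIDs: []string{"disk-b"}},
	}
	retained := RetainedDiskUUIDs(snaps)
	for _, uuid := range []string{"disk-a", "disk-c"} {
		fmt.Printf("%s -> %+v\n", uuid, DecideDetach(uuid, retained))
	}
}
```

Keeping the decision a pure function of the retained set would let both the normal detach path and the revert-induced drop path share it; the BA status cleanup and the actual vCenter and CNS calls are left out of the sketch.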
Phase 4 — Workflow D: Snapshot Deletion, CSI Phase (Section 10)
VMSnap Watch + CSI Finalizer Trigger (D.3) — Add a watch on VirtualMachineSnapshot CRs; when CSI observes conditions[SnapshotDeleted]=True with its csi.vsphere.vmware.com/snapshot finalizer still present, begin Phase 2 disk re-evaluation.
Per-Disk Retention Evaluation (D.4/D.5) — For each disk in VMSnap.status.disks, look up CVI by cns.vmware.com/disk-uuid label, check the remaining vCenter snapshot tree (reuse helper from task 13), and branch: still-retained (no-op), re-adopted by VM (no-op), or no snapshots remain + vmName="" (proceed to D.6).
Re-Register on Last Snapshot Deletion (D.6) — Same re-registration logic as C.7 (reuse task 15/16) but triggered from Workflow D; transition CVI from VM_MANAGED (snapshot-retained) to CSI_MANAGED, remove cvi-protection finalizer, label PVC csi-owned.
CSI Finalizer Removal from VMSnap (D.7) — After all disks in status.disks are processed, remove the CSI finalizer from the VMSnap CR to allow K8s garbage collection.
Deferred PVC Deletion After D.6 (Section 13.2.2) — After D.6 completes and the PVC has a deletionTimestamp (webhook was bypassed during retention), release the CSI volume-protection finalizer so the standard FCD delete path can proceed.
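The D.3 to D.7 flow above can be sketched in the same spirit. This is an illustrative outline under stated assumptions, not the PR's implementation: the types `DiskView` and `Action` and the functions `EvaluateDisk` and `ProcessSnapshot` are hypothetical, the retained set is assumed to come from the snapshot tree helper, and the sketch assumes every D.6 re-registration succeeds before the finalizer is dropped. Only the finalizer name and the three-way branch come from the description.

```go
package main

import "fmt"

// Finalizer name as given in the PR description.
const snapshotFinalizer = "csi.vsphere.vmware.com/snapshot"

// DiskView is an assumed per-disk view: the disk UUID from
// VMSnap.status.disks plus the vmName recorded on the matching CVI.
type DiskView struct {
	DiskUUID string
	VMName   string // "" means no VM currently owns the disk
}

// Action is the hypothetical result of the D.4/D.5 evaluation.
type Action string

const (
	NoOpStillRetained Action = "noop-still-retained"
	NoOpReAdopted     Action = "noop-re-adopted"
	ReRegister        Action = "re-register" // proceed to D.6
)

// EvaluateDisk applies the three-way branch from D.4/D.5.
func EvaluateDisk(d DiskView, retained map[string]bool) Action {
	switch {
	case retained[d.DiskUUID]:
		return NoOpStillRetained
	case d.VMName != "":
		return NoOpReAdopted
	default:
		return ReRegister
	}
}

func hasFinalizer(fs []string, name string) bool {
	for _, f := range fs {
		if f == name {
			return true
		}
	}
	return false
}

func withoutFinalizer(fs []string, name string) []string {
	out := make([]string, 0, len(fs))
	for _, f := range fs {
		if f != name {
			out = append(out, f)
		}
	}
	return out
}

// ProcessSnapshot starts only when SnapshotDeleted is true and the CSI
// finalizer is still present (D.3). It evaluates every disk, then returns
// the finalizer list with the CSI finalizer removed (D.7). A real controller
// would keep the finalizer and requeue if any D.6 step failed.
func ProcessSnapshot(finalizers []string, snapshotDeleted bool,
	disks []DiskView, retained map[string]bool) (map[string]Action, []string) {
	if !snapshotDeleted || !hasFinalizer(finalizers, snapshotFinalizer) {
		return nil, finalizers
	}
	actions := make(map[string]Action, len(disks))
	for _, d := range disks {
		actions[d.DiskUUID] = EvaluateDisk(d, retained)
	}
	return actions, withoutFinalizer(finalizers, snapshotFinalizer)
}

func main() {
	disks := []DiskView{
		{DiskUUID: "disk-a"},
		{DiskUUID: "disk-b", VMName: "vm-1"},
		{DiskUUID: "disk-c"},
	}
	retained := map[string]bool{"disk-a": true}
	actions, fins := ProcessSnapshot([]string{snapshotFinalizer}, true, disks, retained)
	fmt.Println(actions, fins)
}
```

Checking the retained set before `vmName` follows the order in which the description lists the branches; the deferred PVC deletion step (releasing the volume-protection finalizer when a deletionTimestamp is set) would hang off the `ReRegister` outcome and is not modeled here.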
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #
Testing done:
A PR must be marked "[WIP]", if no test result is provided. A WIP PR won't be reviewed, nor merged.
The requester can determine a sufficient test, e.g. build for a cosmetic change, E2E test in a predeployed setup, etc.
For new features, new tests should be done, in addition to regression tests.
If jtest is used to trigger precheckin tests, paste the result after jtest completes and remove [WIP] in the PR subject.
The review cycle will start, only after "[WIP]" is removed from the PR subject.
Special notes for your reviewer:
Release note: