@black-dragon74
Member

Describe what this PR does

This patch adds the functionality to map the k8s volumesnapshots to the cephfs/rbd snapshots.

This patch also adds a wrapper around oc/kubectl called kube_client which will help get rid of code duplication.

Closes: #3344

Future concerns

I will follow up with the refactors around kube_client in a separate PR.
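For readers unfamiliar with the idea, a wrapper of this kind could look roughly like the sketch below. It is not the code from this patch: the `build_kube_cmd` helper, the `binary`, `kubeconfig`, and `runner` parameters, and the choice to always request JSON output are assumptions made for illustration.

```python
import json
import subprocess


def build_kube_cmd(binary, kubeconfig, *commands):
    """Build the argv for a kubectl/oc call that returns JSON."""
    cmd = [binary]
    if kubeconfig:
        cmd += ["--kubeconfig", kubeconfig]
    # Pass the caller's arguments through unchanged, kubectl-style.
    cmd += list(commands)
    cmd += ["-o", "json"]
    return cmd


def kube_client(*commands, binary="kubectl", kubeconfig=None, runner=subprocess.run):
    """Run the command and return its parsed JSON output.

    `runner` is injectable (an assumption of this sketch) so the function
    can be exercised without a cluster.
    """
    cmd = build_kube_cmd(binary, kubeconfig, *commands)
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

With a helper like this, a call such as `kube_client("get", "pv", "some-pv")` reads much like the equivalent command line, which is the property the description is after.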

@black-dragon74
Member Author

black-dragon74 commented Jan 2, 2025

Testing

Script output

❯ python3 troubleshooting/tools/tracevol.py -k ~/.kube/config -c kubectl -rn rook-ceph -n rook-ceph -cm rook-ceph-csi-config -cmn rook-ceph -d True
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                            RBD                                                                             |
+----------+------------------------------------------+----------------------------------------------+-----------------+------------------+------------------+
| PVC Name |                 PV Name                  |                  Image Name                  | PV name in omap | Image ID in omap | Image in cluster |
+----------+------------------------------------------+----------------------------------------------+-----------------+------------------+------------------+
| rbd-pvc  | pvc-234b935e-b3e8-4aeb-8369-f14d8f7ef87e | csi-vol-8cf8a069-6c2f-4c5a-b4db-2c4f54457009 |       True      |       True       |       True       |
+----------+------------------------------------------+----------------------------------------------+-----------------+------------------+------------------+
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                    CephFS                                                                                    |
+--------------------+------------------------------------------+----------------------------------------------+-----------------+----------------------+----------------------+
|      PVC Name      |                 PV Name                  |                Subvolume Name                | PV name in omap | Subvolume ID in omap | Subvolume in cluster |
+--------------------+------------------------------------------+----------------------------------------------+-----------------+----------------------+----------------------+
|     cephfs-pvc     | pvc-d3f5902e-c8aa-4651-96c7-a6b8d00af97a | csi-vol-238df571-8de6-4a71-8795-3e2b9cab3b1a |       True      |         True         |         True         |
|  cephfs-pvc-clone  | pvc-28a47204-b880-43ef-aa1c-574a8ccd5cd5 | csi-vol-8325f40c-91a3-4a44-a3cd-84f0b1b8c664 |       True      |         True         |         True         |
| cephfs-pvc-restore | pvc-9f8acd0a-322c-4188-b02a-a37e26db0db8 | csi-vol-083f7338-00e7-4d1b-ac50-b01222130659 |       True      |         True         |         True         |
+--------------------+------------------------------------------+----------------------------------------------+-----------------+----------------------+----------------------+
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                           RBD Volume Snapshots                                                                          |
+--------------------+---------+----------------------------------------------+-----------------------------------------------+---------------------+---------------------+
|        Name        |   PVC   |                  Image Name                  |                  Snapshot ID                  | Snapshot ID in omap | Snapcontent in omap |
+--------------------+---------+----------------------------------------------+-----------------------------------------------+---------------------+---------------------+
|  rbd-pvc-snapshot  | rbd-pvc | csi-vol-8cf8a069-6c2f-4c5a-b4db-2c4f54457009 | csi-snap-657a613f-66c4-4c3c-8de4-28cf892c6aff |         True        |         True        |
| rbd-pvc-snapshot-1 | rbd-pvc | csi-vol-8cf8a069-6c2f-4c5a-b4db-2c4f54457009 | csi-snap-4114df25-3120-4544-8a55-a7fde19f67c9 |         True        |         True        |
+--------------------+---------+----------------------------------------------+-----------------------------------------------+---------------------+---------------------+
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                            CephFS Volume Snapshots                                                                            |
+-----------------------+------------+----------------------------------------------+-----------------------------------------------+---------------------+---------------------+
|          Name         |    PVC     |                Subvolume Name                |                  Snapshot ID                  | Snapshot ID in omap | Snapcontent in omap |
+-----------------------+------------+----------------------------------------------+-----------------------------------------------+---------------------+---------------------+
|  cephfs-pvc-snapshot  | cephfs-pvc | csi-vol-238df571-8de6-4a71-8795-3e2b9cab3b1a | csi-snap-9cbfd4c7-d448-4ea2-b2bf-087c7b3cb418 |         True        |         True        |
| cephfs-pvc-snapshot-1 | cephfs-pvc | csi-vol-238df571-8de6-4a71-8795-3e2b9cab3b1a | csi-snap-2470db35-8f8a-4100-826d-9d9ce46b1bf3 |         True        |         True        |
+-----------------------+------------+----------------------------------------------+-----------------------------------------------+---------------------+---------------------+

RBD snapshot inside ceph

❯ oc exec -it rook-ceph-tools-68bf47bc65-486jp -- rbd children replicapool/csi-vol-ec58aaf0-2f28-45fc-95f8-67775e0be615
replicapool/csi-snap-dab553a8-9184-45ab-ba43-dde720fa45ea

CephFS snapshot inside ceph

❯ oc exec -it rook-ceph-tools-68bf47bc65-486jp -- ceph fs subvolume snapshot ls myfs csi-vol-595c630d-6e17-4c00-a66e-91785fb01c6d csi
[
    {
        "name": "csi-snap-7a9fa01e-8577-4829-bb30-73b0ac94b534"
    }
]

Regards

@black-dragon74 requested a review from a team January 2, 2025 11:52
@Madhu-1
Collaborator

Madhu-1 commented Jan 3, 2025

+---------------------------------------------------------------------------+------------+----------+----------------------------------------------+-----------------------------------------------+
|                                   Name                                    |    PVC     | PVC Type |                Subvolume Name                |                  Snapshot ID                  |
+---------------------------------------------------------------------------+------------+----------+----------------------------------------------+-----------------------------------------------+
|                             rbd-pvc-snapshot                              |  rbd-pvc   |   RBD    | csi-vol-ec58aaf0-2f28-45fc-95f8-67775e0be615 | csi-snap-dab553a8-9184-45ab-ba43-dde720fa45ea |
| snapshot-f64ca5dd7f01b816f689a0c44423338452e82ae57498cfd449949c1dc399fb68 | cephfs-pvc |  CephFS  | csi-vol-595c630d-6e17-4c00-a66e-91785fb01c6d | csi-snap-7a9fa01e-8577-4829-bb30-73b0ac94b534 |
+---------------------------------------------------------------------------+------------+----------+----------------------------------------------+-----------------------------------------------+

Can we have separate tables for CephFS and RBD, just like we have for PVCs? Subvolume Name is CephFS-specific; if the type is RBD, it doesn't make sense to print RBD details under Subvolume Name.

@nixpanic added the ci/skip/e2e (skip running e2e CI jobs) and ci/skip/multi-arch-build (skip building on multiple architectures) labels Jan 6, 2025
@black-dragon74
Member Author

Can we have separate tables for CephFS and RBD, just like we have for PVCs? Subvolume Name is CephFS-specific; if the type is RBD, it doesn't make sense to print RBD details under Subvolume Name.

Done. I have updated the test results with the most recent output :)
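The split that was requested can be pictured with a small sketch like the following. It is illustrative only and uses plain string formatting rather than whatever table library the script relies on; the row keys (`type`, `name`, `pvc`, `backing`, `snap_id`) and both function names are hypothetical.

```python
from collections import defaultdict

# The label for the backing object differs per driver type.
BACKING_COLUMN = {"RBD": "Image Name", "CephFS": "Subvolume Name"}


def group_snapshots(rows):
    """Group snapshot rows (dicts with a 'type' key) by 'RBD' / 'CephFS'."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["type"]].append(row)
    return grouped


def render_tables(rows):
    """Return one plain-text table per type, each with its own header."""
    tables = {}
    for fs_type, items in group_snapshots(rows).items():
        header = ["Name", "PVC", BACKING_COLUMN[fs_type], "Snapshot ID"]
        lines = [" | ".join(header)]
        for item in items:
            lines.append(" | ".join(
                [item["name"], item["pvc"], item["backing"], item["snap_id"]]
            ))
        tables[fs_type] = "\n".join(lines)
    return tables
```

Keeping the header per type means an RBD row is never printed under a CephFS-only column name.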

@black-dragon74 requested a review from Madhu-1 January 6, 2025 10:29
pv_data = kube_client("get", "pv", pvc['spec']['volumeName'])

# Find the type of the PVC (RBD or CephFS)
fs_type = "CephFS" if "fsName" in pv_data['spec']['csi']['volumeAttributes'] else "RBD"
Collaborator

What happens if there are non-CSI PVCs, or PVCs that belong to another storage provider?

Member Author

Updated: if the encountered PVC is not a CSI volume, we skip it.

belongs to other storage provider?

The existing checks in the script only contain cases for either CephFS or RBD. I can refactor the detection logic to be more precise in the upcoming PR?
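One possible shape for the skip, combined with a stricter provider check, is sketched below. The `classify_pv` name and the `.csi.ceph.com` driver-suffix test are assumptions made for illustration (driver names are configurable per deployment); the `fsName` check mirrors the line under review.

```python
# Assumption: ceph-csi driver names usually end in ".csi.ceph.com",
# e.g. "rook-ceph.rbd.csi.ceph.com". Deployments may configure other names.
CEPH_DRIVER_SUFFIX = ".csi.ceph.com"


def classify_pv(pv_data):
    """Return 'CephFS', 'RBD', or None when the PV should be skipped."""
    csi = pv_data.get("spec", {}).get("csi")
    if csi is None:
        # Not a CSI volume (hostPath, NFS, ...): skip it.
        return None
    if not csi.get("driver", "").endswith(CEPH_DRIVER_SUFFIX):
        # A CSI volume from another storage provider: skip it as well.
        return None
    attrs = csi.get("volumeAttributes") or {}
    return "CephFS" if "fsName" in attrs else "RBD"
```

Callers would then ignore any PV for which the function returns `None` instead of raising a `KeyError` on the missing `csi` field.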

# Alternatively, we can use the toolbox pod to fetch these values from ceph directly
snap_content_name = snapshot['status']['boundVolumeSnapshotContentName']
snap_content = kube_client("get", "volumesnapshotcontent", snap_content_name)
snap_id = "csi-snap-" + get_image_uuid(snap_content['status']['snapshotHandle'])
Collaborator

csi-snap- is the default, but the snapshot prefix can be changed in the volumesnapshotclass.

Member Author

Fixed.
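A self-contained sketch of resolving the prefix from the snapshot class with a fallback to the default is shown below. The body of `get_image_uuid` here is an assumption (it takes the trailing 36 characters of the handle as the UUID) and may differ from the helper in the script; `snapshot_name` is a hypothetical name.

```python
DEFAULT_SNAP_PREFIX = "csi-snap-"


def get_image_uuid(handle):
    # Assumption: the CSI handle ends with a 36-character UUID.
    return handle[-36:]


def snapshot_name(snap_content, snap_class=None):
    """Build the backend snapshot name from the content handle and class prefix."""
    params = (snap_class or {}).get("parameters") or {}
    # An empty or missing snapshotNamePrefix falls back to the default.
    prefix = params.get("snapshotNamePrefix") or DEFAULT_SNAP_PREFIX
    handle = snap_content["status"]["snapshotHandle"]
    return prefix + get_image_uuid(handle)
```

The `or` fallback covers both a missing key and an empty string, which a plain `.get(key, default)` would not.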

@Madhu-1
Collaborator

Madhu-1 commented Jan 6, 2025

What about columns for the omap entries and for whether the snapshot is present on the cluster?

@black-dragon74
Member Author

What about columns for the omap entries and for whether the snapshot is present on the cluster?

Since the backing subvolume/image inside Ceph is the same as that of the PVC, is it necessary to duplicate that info?

# The snapshot prefix can be optionally defined in the volumesnapshotclass parameters
snap_handle = snap_content['status']['snapshotHandle']
snap_uuid = get_image_uuid(snap_handle)
snap_prefix = snap_class['parameters'].get('snapshotNamePrefix') or 'csi-snap-'
Collaborator

We need to make this more robust by fetching the name from omap rather than from the snapshot class, because the snapshot class can be deleted and then recreated with a different prefix after the snapshot has been created.

Member Author

Ack. On it.
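As a rough illustration of the omap route, the sketch below only builds a `rados getomapval` command line. The object name `csi.snap.<uuid>` and the key `csi.imagename` are assumptions about the omap layout and should be checked against the ceph-csi version in use; `omap_snapname_cmd` is a hypothetical helper, and parsing the command's output is deliberately left out.

```python
def omap_snapname_cmd(pool, snap_uuid, rados_namespace=None):
    """Build a rados command that reads the snapshot name stored in omap.

    Assumed layout: object 'csi.snap.<uuid>' with key 'csi.imagename'.
    """
    cmd = ["rados", "-p", pool]
    if rados_namespace:
        cmd += ["--namespace", rados_namespace]
    cmd += ["getomapval", f"csi.snap.{snap_uuid}", "csi.imagename"]
    return cmd
```

Because the name is read from what was recorded at creation time, the result would not change if the snapshot class were later recreated with another prefix.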

@black-dragon74 force-pushed the trace-add-vs branch 2 times, most recently from e6bb877 to c2adb42 January 7, 2025 15:28
@black-dragon74 requested a review from Madhu-1 January 7, 2025 15:30
Retrieve and parse the list of volume snapshots from the cluster.
"""
if ARGS.namespace:
    volumesnapshots = kube_client("get", "volumesnapshots", "-n", ARGS.namespace)
Contributor

Why is this required? Why not use ARGS.namespace directly in the kube_client method?

Member Author

kube_client is supposed to be a simple wrapper around kubectl or oc. Doing it like this allows us to fetch both namespaced and cluster-scoped resources while keeping the syntax similar to kubectl/oc.

Contributor

Then why is this code in kube_client? Is it a mistake?

    cmd += ["--namespace", arg.namespace]
    cmd += list(commands)

Member Author

Whoops, missed it. Thank you!

Member Author

Fixed.

Contributor

if ARGS.namespace:
   volumesnapshots = kube_client("get", "volumesnapshots", "-n", ARGS.namespace)

🤔 But this condition still will not work the way you need it to.
If ARGS.namespace is not provided, it takes the default value (the default namespace), so won't the if condition always be true?
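The concern can be reproduced in isolation with the sketch below. It is not the script's argument parser: the `make_parser` and `snapshot_query` helpers are hypothetical, and falling back to `--all-namespaces` when no namespace is given is an assumed behavior chosen only to make the two branches visibly different.

```python
import argparse


def make_parser(default_namespace):
    """Build a parser whose --namespace default is configurable."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-n", "--namespace", default=default_namespace)
    return parser


def snapshot_query(args):
    """Build kubectl-style arguments for listing volume snapshots."""
    cmd = ["get", "volumesnapshots"]
    if args.namespace:
        cmd += ["-n", args.namespace]
    else:
        # Assumed fallback for this sketch: list across all namespaces.
        cmd += ["--all-namespaces"]
    return cmd


# With a string default the condition is always true, even without -n.
always_true = snapshot_query(make_parser("default").parse_args([]))
# With a None default the else branch becomes reachable.
reachable = snapshot_query(make_parser(None).parse_args([]))
```

Using `None` as the default is one way to let the code distinguish "not provided" from an explicit `-n default`.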

@black-dragon74 force-pushed the trace-add-vs branch 2 times, most recently from 295bfdb to b9f9e8b January 9, 2025 10:06
@Madhu-1
Collaborator

Madhu-1 commented Jan 27, 2025

@Mergifyio queue

@mergify
Contributor

mergify bot commented Jan 27, 2025

queue

The pull request has been merged automatically at ec5fefc

This patch adds the functionality to map the k8s volumesnapshots
to the cephfs/rbd snapshots.

This patch also adds a wrapper around oc/kubectl called `kube_client`
which will help get rid of code duplication.

Closes: ceph#3344

Signed-off-by: Niraj Yadav <[email protected]>
@mergify bot added the ok-to-test (Label to trigger E2E tests) label Jan 27, 2025
@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.30

@ceph-csi-bot
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.30

@ceph-csi-bot
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.30

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.32

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.31

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.32

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.31

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.32

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.31

@ceph-csi-bot removed the ok-to-test (Label to trigger E2E tests) label Jan 27, 2025
@mergify bot merged commit ec5fefc into ceph:devel Jan 27, 2025
37 checks passed
Development

Successfully merging this pull request may close these issues.

Enhance trace.py to work with volumesnapshots

5 participants