Description
In the current implementation of the snapshot controller, when checkAndUpdateSnapshotClass() detects a missing volume snapshot class (VSC), it stamps an error status on the volume snapshot object.
The periodic sync does not clear this error status. As a side effect, even if the volume snapshot class is found in a subsequent resync, syncUnreadySnapshot() never triggers creation of the volume snapshot content (VS content) object, because the following condition never evaluates to true: (snapshot.Status == nil || snapshot.Status.Error == nil || isControllerUpdateFailError(snapshot.Status.Error)).
As a result, the volume snapshot workflow is stuck.
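To make the stuck state concrete, here is a minimal sketch of the gating condition quoted above. The types are simplified stand-ins for the real API types, and the body of isControllerUpdateFailError() along with its marker string is an assumption made for illustration, not the controller's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified stand-ins for the real snapshot API types (assumptions).
type VolumeSnapshotError struct{ Message string }
type SnapshotStatus struct{ Error *VolumeSnapshotError }
type Snapshot struct{ Status *SnapshotStatus }

// controllerUpdateFailMsg is a hypothetical marker string for this sketch.
const controllerUpdateFailMsg = "controller failed to update"

// isControllerUpdateFailError mimics the helper named in the issue.
// Matching on a message substring is an assumption.
func isControllerUpdateFailError(err *VolumeSnapshotError) bool {
	return err != nil && strings.Contains(err.Message, controllerUpdateFailMsg)
}

// canCreateContent reproduces the condition quoted in the description.
func canCreateContent(s *Snapshot) bool {
	return s.Status == nil || s.Status.Error == nil ||
		isControllerUpdateFailError(s.Status.Error)
}

func main() {
	snap := &Snapshot{}
	fmt.Println("before any error:", canCreateContent(snap))

	// The missing-class check stamps an error on the snapshot.
	snap.Status = &SnapshotStatus{
		Error: &VolumeSnapshotError{Message: "snapshot class not found"},
	}

	// Later resyncs never clear the error, so the gate stays closed
	// even if the class has been created in the meantime.
	for resync := 1; resync <= 3; resync++ {
		fmt.Printf("resync %d: can create content = %v\n",
			resync, canCreateContent(snap))
	}
}
```

Once any error other than the controller-update failure is stamped, the condition depends only on that stale status and not on whether the class exists now.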
Possible fixes:
- Do not set any error status on the volume snapshot object, i.e. skip calling updateSnapshotErrorStatusWithEvent() from checkAndUpdateSnapshotClass() and only log an error message. The state machine would then fail gracefully while creating the VS content object.
- Do not set the error status on the volume snapshot object, but generate an event. (This needs additional changes to ensure that only one event is generated, for example by stamping a "missing VSC" annotation on the volume snapshot object before generating the event.)
- Set the error status as is done today, but clear it when the VSC is detected in a subsequent resync. (This needs a check on the error message or reason so that only the "VSC missing" error status is cleared.)
- Set the error status for a missing VSC as is done today, but handle this "VSC missing" error in syncUnreadySnapshot() and proceed with VS content creation. VS content creation would fail gracefully if the VSC is still missing.
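The last two fixes listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: the types are simplified stand-ins, and missingClassMsg, isMissingClassError(), clearStaleClassError() and canCreateContentWithRetry() are hypothetical names that do not exist in the controller.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified stand-ins for the real snapshot API types (assumptions).
type VolumeSnapshotError struct{ Message string }
type SnapshotStatus struct{ Error *VolumeSnapshotError }
type Snapshot struct{ Status *SnapshotStatus }

// missingClassMsg is a hypothetical marker identifying the missing-VSC error.
const missingClassMsg = "volume snapshot class not found"

// isMissingClassError reports whether the stamped error is the missing-VSC one.
func isMissingClassError(s *Snapshot) bool {
	return s.Status != nil && s.Status.Error != nil &&
		strings.Contains(s.Status.Error.Message, missingClassMsg)
}

// clearStaleClassError sketches the third fix: clear the error only when the
// class now exists and the error is the missing-VSC error. It returns true
// if the status was changed, leaving unrelated errors untouched.
func clearStaleClassError(s *Snapshot, classExists bool) bool {
	if !classExists || !isMissingClassError(s) {
		return false
	}
	s.Status.Error = nil
	return true
}

// canCreateContentWithRetry sketches the fourth fix: treat the missing-VSC
// error as retryable. A real change would keep the existing
// isControllerUpdateFailError term as well; it is omitted here for brevity.
func canCreateContentWithRetry(s *Snapshot) bool {
	return s.Status == nil || s.Status.Error == nil || isMissingClassError(s)
}

func main() {
	snap := &Snapshot{Status: &SnapshotStatus{
		Error: &VolumeSnapshotError{Message: missingClassMsg},
	}}
	fmt.Println("retryable with fourth fix:", canCreateContentWithRetry(snap))
	fmt.Println("cleared while class missing:", clearStaleClassError(snap, false))
	fmt.Println("cleared once class exists:", clearStaleClassError(snap, true))
}
```

Matching on a message substring is fragile. If the third fix is chosen, a dedicated reason or annotation would probably be a more robust way to identify the error.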