Skip to content

Error handling for Idempotency. #24

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
chhsia0 opened this issue May 27, 2017 · 7 comments
Closed

Error handling for Idempotency. #24

chhsia0 opened this issue May 27, 2017 · 7 comments

Comments

@chhsia0
Copy link
Contributor

chhsia0 commented May 27, 2017

CreateVolume: When a volume with the specified name exists, do we return a success or a VOLUME_ALREADY_EXISTS error? If the latter, should we add a VolumeInfo in CreateVolumeError?

DeleteVolume: When the volume does not exist, do we return a success or a VOLUME_DOES_NOT_EXIST error?

ControllerPublishVolume: When a volume is published to the same node, do we return a success or an error? If the latter, what error code should we use, and should we add a PublishVolumeInfo into ControllerPublishVolumeError?

ControllerUnpublishVolume: When a volume is already unpublished from the same node, do we return a success or an error? If the latter, what error code should we use?

NodePublishVolume: When a volume is publish to the same path, do we return a success or an error? If the latter, what error code should we use?

NodeUnpublishVolume: When a volume is unpublished from the same path, do we return a success or an error? If the latter, what error code should we use?

@jdef
Copy link
Member

jdef commented Jun 1, 2017

CreateVolume: It's arguable that in the interest of idempotency this RPC should return success if the volume has already been created. However, in such cases the CO may be interested in whether the volume already existed, or has been created from scratch. A plugin MAY NOT implement support for the ListVolumes RPC, and so in such cases it would seem difficult for a CO to reconcile its copy of VolumeInfo with that of the plugin unless that VolumeInfo was coded as part of the error message. OTOH a plugin MAY NOT return this error code (in the presence of a pre-existing volume) and instead simply return "success", which includes a VolumeInfo in the reply - but that doesn't inform the CO about whether the volume was pre-existing. Ultimately, there's no way for a plugin to return "this volume already exists, and here's the VolumeInfo for it" without implementing ListVolumes. Furthermore, there's no reliable way for a CO to know which semantic a plugin has implemented. This smells like something that we should tighten up somehow.

DeleteVolume: the recovery behavior for VOLUME_DOES_NOT_EXIST states that a CO SHOULD assume "success" in the presence of this error. How can this be further clarified?

ControllerPublishVolume: it's idempotent; multiple publish calls w/ the same NodeID should not generate an error, with the caveat that the same nodes are members of the cluster for all such calls. for example, if a node goes missing in between calls to this RPC then even through an initial call succeeds, a later call may generate an error with the code NODE_DOES_NOT_EXIST.

ControllerUnpublishVolume: same as the comment for ControllerPublishVolume

NodePublishVolume: it's idempotent, so in general I'd expect redundant calls w/ identical parameters to succeed

NodeUnpublishVolume: same as the above comment for NodePublishVolume

@jdef
Copy link
Member

jdef commented Jun 1, 2017

/cc @jieyu @saad-ali re: CreateVolume RPC

@chhsia0
Copy link
Contributor Author

chhsia0 commented Jun 1, 2017

It seems to me that for idempotency, returning a success is simpler for CO, then I would suggest that we can remove CreateVolumeError::VOLUME_ALREADY_EXISTS and DeleteVolumeError::VOLUME_DOES_NOT_EXIST, and explicitly state that a plugin should always return success for redundant idempotent calls. However, if we want CO to distinguish the redundant calls, then I'd prefer that every call should return an error for consistency.

@julian-hj
Copy link
Contributor

IMO HTTP has a pretty nice pattern for this. Instead of treating redundant creation as an error, it treats it as a different kind of success. Create calls that actually create something return 201 CREATED whereas calls to create something that already exists return 200 OK.

In general it strikes me that most of the error code discussions we are having are heading down the path of reinventing the basic CRUD and networking status codes that are baked into HTTP, but maybe now that the pendulum has swung fully back to RPC interfaces, that's heresy?

@cpuguy83
Copy link
Contributor

cpuguy83 commented Jun 3, 2017

@julian-hj grpc has the same kind of mechanisms, including an AlreadyExists status code.
I like getting something like AlreadyExists back, as it lets the CO determine how to handle the case.

@jdef
Copy link
Member

jdef commented Dec 19, 2017

#160 has landed. can this be closed?

@jdef
Copy link
Member

jdef commented Dec 19, 2017

nevermind, I'm just closing this. If that's controversial then please re-open

@jdef jdef closed this as completed Dec 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants