Skip to content

already_exists and already_published for CreateVolume and xxxPublishVolume RPCs #158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

jdef
Copy link
Member

@jdef jdef commented Nov 29, 2017

Rationale for using google.protobuf.BoolValue:

  • default value for bool is false
  • if we used a bool type for this flag then plugins that don't set already_exists at all will end up sending a default false value, even if the volume already existed. this isn't a huge deal from a CO-behavior perspective but it kills the debugging story: what use is a debugging field if you can't rely on it to tell you something meaningful?
  • the default value for BoolValue is unset, because it's a wrapper type. this paves the way for a nice debugging story, wherein plugins that don't set already_exists in a response will not being sending a misleading false value; plugins that explicitly set already_exists in a response will be sending the value that they intend a CO/operator to observe.

James DeFelice added 2 commits November 29, 2017 21:48
CreateVolume, ControllerPublishVolume, and NodePublishVolume
are idempotent RPCs and it MAY be useful for a CO, operator,
or CSI/gRPC intermediary to know when a request was both
successful and did not ultimately change the state of the
system.
@jdef jdef requested review from saad-ali and jieyu November 29, 2017 21:58
@jdef jdef added this to the v0.1 milestone Nov 29, 2017
@jieyu
Copy link
Member

jieyu commented Nov 29, 2017

I think it should either be a required bool or an optional BoolValue. I am leaning toward making it a required field to force the Plugin to set this field properly, and don't give them the option to not set it.

This is already required by the current spec to return ALREADY_EXISTS if a volume already exists. I don't see a reason why we want to make that optional.

@jdef
Copy link
Member Author

jdef commented Dec 1, 2017

I wonder if there are cases where it'll be awkward for a plugin to understand whether a volume already exists (which will be needed if already_exists is a REQUIRED bool). Consider a plugin that interfaces with a cloudy backend that similarly presents an idempotent interface for creating a volume, only that backend interface doesn't return the equivalent of "already exists" -- in order to comply with a REQUIRED bool already_exists the plugin may need to exec additional queries to the backend prior to executing the create op (and even in this type of workaround there are potential races).

If one of our overarching goals is to simplify things for plugin writers, then I'd argue that an OPTIONAL .google.protobuf.BoolValue already_exists probably enables simpler plugin code, at the risk of some plugins being lazy and not implementing support for it when their backend API actually DOES provide a convenient mechanism with which to make a distinction.

Copy link
Member

@jieyu jieyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I am convinced that using BoolValue is better. Thanks @jdef !

@jdef
Copy link
Member Author

jdef commented Dec 2, 2017

ping @julian-hj @cpuguy83

@cpuguy83
Copy link
Contributor

cpuguy83 commented Dec 4, 2017

I'm confused, why are we using this instead of a status code?

@jdef
Copy link
Member Author

jdef commented Dec 4, 2017 via email

@cpuguy83
Copy link
Contributor

cpuguy83 commented Dec 4, 2017

This seems to muddy the error handling quite a lot.

To get around the various issues with intermediate layers, could we specify that an ALREADY_EXISTS response needs to have encoded in the details the volume ID, and then have a GetVolumeInfo rpc to subsequently have it return the other missing data? I seem to recall we used to have a similar rpc... likely removed when we got rid of the "VolumeHandle"?

Copy link
Contributor

@julian-hj julian-hj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@akutz
Copy link
Contributor

akutz commented Dec 4, 2017

Hi @cpuguy83,

I agree with you completely. Julian was adamant that the error should not contain any information other than the error due to concerns around logging of sensitive details (despite the fact that no RPC current returns sensitive information as part of the official model).

I was frankly tired of arguing for the idiomatic approach to this issue and just consented to the compromise.

I do not care for it, and this is the reason gRPC errors can marshal protobuf objects into the error details, but I was the only person on the call defending the idiomatic approach.

@cpuguy83
Copy link
Contributor

cpuguy83 commented Dec 4, 2017

In this case, volume ID's should not be sensitive and should be expected to log.

So just to be clear, I'm 👎 on this change. It's making some really nasty compromises on the interface before we even do the first pre-release.

@akutz
Copy link
Contributor

akutz commented Dec 4, 2017

Hi @cpuguy83,

I believe that Julian was referring to the possibility that a CreateVolumeResponse might one day include credentials and/or VolumeAttributes may or may not include sensitive information.

However, neither case is specified in the spec today and as such I agree that the current approach is idiosyncratic with respect to how gRPC intends errors and error details to be returned.

@jieyu
Copy link
Member

jieyu commented Dec 4, 2017

@cpuguy83, can you sync with @julian-hj on this? Honestly, I am fine with either approach, but I don't want this to block the release.

@julian-hj
Copy link
Contributor

I guess I am stuck on the notion that returning an error as a second response to a supposedly idempotent function is idiomatic. I find that hard to believe. Can you guys point to examples of functions that claim to be idempotent, and return data payloads in the error details?

@jieyu
Copy link
Member

jieyu commented Dec 4, 2017

@akutz @cpuguy83 will you guys be ok if CreateVolume returns an OK if the volume already exists? Sounds like the tradeoff of this approach is debugging. Maybe this is ok given all the issues discussed? cc @saad-ali @jdef @julian-hj

@gpaul
Copy link

gpaul commented Dec 5, 2017

👎 on adding the field. It is optional and therefore cannot be relied upon by the CO. We lose referential transparency (a precondition for idempotency) making the API awkward to satisfy and consume. The benefit does not outweigh the cost.

I like returning OK iff. all request parameters are met by the existing volume and would like to keep ALREADY_EXISTS as a failure for the case where the named volume exists but does not satisfy the request.

@gpaul
Copy link

gpaul commented Dec 5, 2017

Let's err on the side of leaving out fields that can always be added for v0.2. Ripping them out for v0.2 is much harder.

@cpuguy83
Copy link
Contributor

cpuguy83 commented Dec 5, 2017

I think it's fair to have the SP return an OK response when it already exists assuming the SP has determined that it is the same volume and not just some other volume with the same name.

If we want to aid debugging, we could come up with a proto to deal with this to embed in the status details (even for OK responses... I think this should be fine...), but I'm not sure this should be a requirement for v0.1.

@jieyu
Copy link
Member

jieyu commented Dec 5, 2017

@akutz @julian-hj @saad-ali, ping? will you be OK if the SP return an OK response when the volume already exists?

@akutz
Copy link
Contributor

akutz commented Dec 5, 2017

Hi @jieyu,

Not really. This is the point of the gRPC error details. :(

I can live with returning a successful result IFF we also declare that an SP may enable debug mode and return ALREADY_EXISTS or NOT_FOUND for CreateVolume and DeleteVolume, indicating to the CO that the object (in the case of the CreateVolume) already exists and is included in the error details.

In other words, returning OK is the default mode, but COs should also still be aware that an SP MAY return an idempotent error with an embedded object.

@julian-hj
Copy link
Contributor

I would argue that “idempotent error” is an oxymoron. If calling the method a second time with the same arguments is an error, then the method is not idempotent.

So I think we have to decide if we want to return an error, or we want to claim idempotency.

For my part, I am fine with Jie’s suggestion. If we really believe that the methods are idempotent, then an ALREADY_EXISTS status is pretty much useless anyway.

I am going to go offline for the rest of the day because I am in jury duty. Apologies for any delayed responses.

jieyu added a commit to jieyu/csi-spec that referenced this pull request Dec 7, 2017
The patch fixed the error codes that are related to idempotency:

(1) For `CreateVolume`, if the volume already exists and is compatible,
return OK instead. If the volume exists but not compatible, return
ALREADY_EXISTS.

(2) For `DeleteVolume`, if the volume does not exist, return OK instead.

(3) For `ControllerUnpublishVolume`, if the volume is already detached
from the node, return OK instead.

Fixes container-storage-interface#157
xref container-storage-interface#158
@jieyu
Copy link
Member

jieyu commented Dec 7, 2017

I created #160 based on the conversion above. PTAL

@jieyu
Copy link
Member

jieyu commented Dec 19, 2017

Close this one as #160 has landed.

@jieyu jieyu closed this Dec 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants