Skip to content

Reconcile idempotency requirements with VOLUME_DOES_NOT_EXIST and VOLUME_ALREADY_EXISTS error codes #100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
saad-ali opened this issue Sep 6, 2017 · 9 comments
Milestone

Comments

@saad-ali
Copy link
Member

saad-ali commented Sep 6, 2017

@akutz pointed out:

Errors such as VOLUME_DOES_NOT_EXIST seem to contradict statements such as This operation MUST be idempotent.
The simplest example is DeleteVolume (https://github.com/container-storage-interface/spec/blob/master/spec.md#deletevolume):

This operation MUST be idempotent. This operation SHOULD be best effort in the sense that if the Plugin is certain that the volume as well as the artifacts associated with the volume do not exist anymore, it SHOULD return a success.

And yet DeleteVolume defines the following error:

      // `VolumeID` does not exist.
      //
      // Recovery behavior: Caller SHOULD assume the `DeleteVolume` call
      // succeeded.
      VOLUME_DOES_NOT_EXIST = 5;

Most other volume-related operations have a similar seeming contradiction.

We debated offline weather we should remove the VOLUME_DOES_NOT_EXIST and VOLUME_ALREADY_EXISTS error codes altogether and require the associated calls to succeed in these cases, or continue to require the error codes and clarify the idempotency clauses to indicate that while idempotency is expected, for these specific cases the plugin must return these error codes.

We agreed that the later would is preferred because it improves debuggability. Knowing exactly what happened (even if the only thing CO does differently is logging) is better than having to guess what "actually happened" when things go wrong (for example, CO has a bug that somehow calls unpublish on the wrong volume, one that is not yet mounted). For consistency amongst the interfaces we will keep all instances of the VOLUME_DOES_NOT_EXIST and VOLUME_ALREADY_EXISTS error codes, but will modify the idempotency clauses with clarifications to address the contradictions.

This issue is to track that work.

CC @jdef @jieyu @akutz

@julian-hj
Copy link
Contributor

related to #18

If we continue to require the error to be returned, then I think we need to change the way errors are returned, as the current spec doesn't allow an error and a response message at the same time.

I'm pretty sure we were gonna do that anyway, but it's worth tracking in this context.

@akutz
Copy link
Contributor

akutz commented Sep 6, 2017

Hi @julian-hj,

the current spec doesn't allow an error and a response message at the same time.

Are you sure? My work on GoCSI returns both an error code and message. Every CSI error has an error code and associated description string that can be set to whatever you wish.

@julian-hj
Copy link
Contributor

@akutz yeah sure, but we don't just want an error string. For the create call to be idempotent, it needs to return a full Result including VolumeInfo. It cannot do that and also return an Error because we use oneof

message CreateVolumeResponse {
  message Result {
    // Contains all attributes of the newly created volume that are
    // relevant to the CO along with information required by the Plugin
    // to uniquely identifying the volume. This field is REQUIRED.
    VolumeInfo volume_info = 1;
  }

  // One of the following fields MUST be specified.
  oneof reply {
    Result result = 1;
    Error error = 2;
  }
}

@akutz
Copy link
Contributor

akutz commented Sep 6, 2017

Hi @julian-hj,

Oh, I misunderstood you then. Yeah, I think you just restated the crux of the issue. Essentially the error codes negate idempotency. So either the error codes must be eliminated or moved into the result if you want the original value back.

@julian-hj
Copy link
Contributor

@akutz if we move to using grpc errors instead then this problem pretty much goes away. HTTP has the same pattern, in the sense that "idempotent" create calls might return different status codes on the first and second call (200 OK versus 201 CREATED)

@akutz
Copy link
Contributor

akutz commented Sep 6, 2017 via email

@saad-ali
Copy link
Member Author

saad-ali commented Sep 7, 2017

Yep, plan would would be to do this along with the move to gRPC error codes.

@jieyu jieyu added this to the v0.1 milestone Oct 2, 2017
@jdef
Copy link
Member

jdef commented Nov 1, 2017

#115

@saad-ali
Copy link
Member Author

Closed with #115

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants