Skip to content

Add ValidateVolumeCapabilities error code for volume too small #92

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

codenrhoden
Copy link

When using ValidateVolumeCapabilities for pre-provisioned volumes,
VolumeInfo is allowed to contain a capacity_bytes field. If the
pre-provisioned volume is smaller than this value, there was no
appropriate error code to return. This patch adds such a code.

While the spec only mentions needed to validate the VolumeCapabilities, e.g.

  // The information about the volume to check. This is a REQUIRED
  // field.
  VolumeInfo volume_info = 2;

  // The capabilities that the CO wants to check for the volume. This
  // call SHALL return “supported” only if all the volume capabilities
  // specified below are supported. This field is REQUIRED.
  repeated VolumeCapability volume_capabilities = 3;

this seems like an import thing to verify as well.

I was just going to raise the question as an Issue, but this took about the same amount of time.

When using ValidateVolumeCapabilities for pre-provisioned volumes,
VolumeInfo is allowed to contain a `capacity_bytes` field. If the
pre-provisioned volume is smaller than this value, there was no
appropriate error code to return. This patch adds such a code.

Signed-off-by: Travis Rhoden <[email protected]>
@codenrhoden
Copy link
Author

As I work with this a little bit more, I'm a bit confused on the correct way to return an error from ValidateVolumeCapabilities. The spec defines some specific error codes to use when items in the VolumeCapabilities are not met, namely:

      UNSUPPORTED_MOUNT_FLAGS = 2;
      UNSUPPORTED_VOLUME_TYPE = 3;
      UNSUPPORTED_FS_TYPE = 4;

However, this will return an error at the gRPC level, and the caller would not then be reading the supported boolean and optional description. So that makes me ask -- when a condition is found that the volume is not supported, is the intent to return a representative error code (like UNSUPPORTED_FS_TYPE), or is it that we should return supported=false? These two seem to be at odds with each other.

This may be my lack of gRPC experience speaking. Or maybe it is for the most common types of scenarios, error codes have been enumerated, but not others? I would think there is an error code for each thing that may not match in VolumeCapabilities, since these others exist. And if that's actually the case, then supported has no purpose, as it is implied by the lack of an error.

Is my understanding way off here?

@akutz
Copy link
Contributor

akutz commented Aug 24, 2017

Hi @codenrhoden,

I concur with your question above except for one thing:

However, this will return an error at the gRPC level

That's not technically true. It will return an error in the RPC Response message instead of a Result with supported=false. I agree that it doesn't make sense, but I also wanted to make sure we're aware that it isn't a gRPC error as we're not returning an error, but a valid RPC response with an error instead of a result.

@akutz
Copy link
Contributor

akutz commented Aug 24, 2017

Hi @codenrhoden,

I think perhaps this is the distinction. Imagine you receive a ValidateVolumeCapabilities request with an FsType of ext9. I'd return a UNSUPPORTED_FS_TYPE instead of supported=false because the request itself was invalid. Does that make sense?

@akutz
Copy link
Contributor

akutz commented Aug 24, 2017

Hi @codenrhoden,

I still share your concern. If my above attempt at answering the reasoning behind the distinction was not correct or sufficient, then I'd propose one of the two following compromises:

Solution 1

Refactor the ValidateVolumeCapabilitiesResponse like so:

message ValidateVolumeCapabilitiesResponse {
  message Result {}

  // One of the following fields MUST be specified.
  oneof reply {
    Result result = 1;
    Error error = 2;
  }
}

And then ensure that there is a valid error code for every possible reason that ValidateVolumeCapabilities might return supported=false.

Solution 2

Refactor ValidateVolumeCapabilitiesResponse to include an optional error code with the result so that the supported=false value can also be related to a ValidateVolumeCapabilitiesErrorCode value.

message ValidateVolumeCapabilitiesResponse {
  message Result {
    // True if the Plugin supports the specified capabilities for the
    // given volume. This field is REQUIRED.
    bool supported = 1;

    // Message to the CO if `supported` above is false. This field is
    // OPTIONAL.
    string message = 2;

    // An error code that can also indicate why `supported` 
    // above is false. This field is OPTIONAL.
    ValidateVolumeCapabilitiesErrorCode errorCode = 3;
  }

  // One of the following fields MUST be specified.
  oneof reply {
    Result result = 1;
    Error error = 2;
  }
}

@codenrhoden
Copy link
Author

@akutz, yeah, exactly. As I was actually implementing a ValidateVolumeCapabilities method, I found that when I was done, I had no place where I returned supported=false, because I was always returning specific errors instead.

So unless it's like you said, and you only return UNSUPPORTED if the request parameter itself doesn't make sense, then I think of the refactor options is needed.

@akutz
Copy link
Contributor

akutz commented Aug 24, 2017

Hi @codenrhoden,

I imagine it could come down to the same way libStorage executors would return supported=false. Imagine if you want to check if EBS is supported on a Node. If the Node isn't running on EC2 then it's an ErrorCode. However, if it is running on EC2 but the region doesn't match the volume being scheduled, it's false with a message about the region issue.

It's such a thin distinction that I think the supported and message fields are just there for occasions when there isn't an error code.

@codenrhoden
Copy link
Author

Related -

Now that AccessMode has been moved to VolumeCapability as of #82, there needs to be an error code for UNSUPPORTED_ACCESS_MODE.

I'll wait to hear from others before adding that, as I'd like to know whether to keep VOLUME_TOO_SMALL or not.

@jdef
Copy link
Member

jdef commented Aug 29, 2017

If implementations are always returning an error code when something is unsupported, then we can probably simplify the RPC response:

message ValidateVolumeCapabilitiesResponse {
  message Result {
    // intentionally empty: failed validation requests should generate an error
  }

  // One of the following fields MUST be specified.
  oneof reply {
    Result result = 1;
    Error error = 2;
  }
}

I know that @saad-ali has been thinking about refactoring error codes/handling. Maybe he has some insight here?

@jdef jdef requested a review from saad-ali August 29, 2017 17:33
@akutz
Copy link
Contributor

akutz commented Aug 29, 2017

Hi @jdef,

Isn't that the same as Solution 1 from this comment above? I'm in favor of it as long as we update the description of the UNKNOWN error code to indicate that it is valid and clients should see the associated error message. Currently this is what UNKNOWN says:

  // `ValidateVolumeCapabilities` specific error.
  message ValidateVolumeCapabilitiesError {
    enum ValidateVolumeCapabilitiesErrorCode {
      // Default value for backwards compatibility. SHOULD NOT be
      // returned by Plugins. However, if a Plugin returns a
      // `ValidateVolumeCapabilitiesErrorCode` code that an older CSI
      // client is not aware of, the client will see this code (the
      // default fallback).
      //
      // Recovery behavior: Caller SHOULD consider updating CSI client
      // to match Plugin CSI version.
      UNKNOWN = 0;

Alternatively we could add a new error code named MESSAGE that explicitly indicates to see the error's associated string data.

@jdef
Copy link
Member

jdef commented Aug 29, 2017 via email

@akutz
Copy link
Contributor

akutz commented Aug 29, 2017

Hi @jdef,

I'm suggesting that if we get rid of any of the data in the Result message (for which I'm in favor), that there will still be cases where there is an error not properly described by an existing error code. I'm proposing that unless you want UNKNOWN to also mean please see the error message then maybe we should create a new error code MESSAGE that explicitly means there is no existing error code for this error; please see the associated message.

Because not all errors will be described by codes in advance. One of the nice things about the Result as it is is the free-form reason for the reason supported is false. If we replace those with simply error codes, I'm saying one error code should basically mean "Please see the error message."

@jdef
Copy link
Member

jdef commented Aug 29, 2017

Thanks for the clarification @akutz. I'm going to wait for @saad-ali to chime in here because he's got an action item to review the current error code implementation and the results of his investigation/proposed-changes would likely directly impact this case.

@akutz
Copy link
Contributor

akutz commented Aug 29, 2017

FWIW the current implementation of errors is fine IMO, and there is already code in the wild to parse them. I think any RPC with what is essentially a boolean response will be subject to this same discussion -- should an error be used to represent the false value? If yes then I think the solution must allow for an error code that indicates a heretofore unknown/not-yet-considered reason for the negative result.

@jdef
Copy link
Member

jdef commented Nov 14, 2017

should probably be re-cast now that #115 has landed, perhaps as a case where OutOfRange might be returned?

@jdef
Copy link
Member

jdef commented Feb 26, 2018

VolumeInfo / capacity is no longer an input to the ValidateVolumeCapabilities RPC, so this PR no longer applies; closing it out. Please re-open if you believe this has been closed in error.

@jdef jdef closed this Feb 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants