Add "contentSchema" #673

Merged 1 commit on Nov 27, 2018
76 changes: 66 additions & 10 deletions jsonschema-validation.xml
@@ -726,7 +726,7 @@
</section>
</section>

<section title='String-Encoding Non-JSON Data' anchor="content">
<section title='String-Encoded Data' anchor="content">

<section title="Foreword">
<t>
@@ -773,6 +773,12 @@
can be constrained using the <xref target="pattern">"pattern"</xref> keyword.
</t>

<t>
If this keyword is absent, but "contentMediaType" is present, this
indicates that the media type could be encoded into UTF-8 like any
other JSON string value, and does not require additional decoding.
</t>

<t>
The value of this property MUST be a string.
</t>
@@ -786,25 +792,33 @@

<section title="contentMediaType">
<t>
The value of this property must be a media type, as defined by
<xref target="RFC2046">RFC 2046</xref>. This property defines the media
type of instances which this schema defines.
If the instance is a string, this property defines the media type
of the contents of the string. If "contentEncoding" is present,
this property describes the decoded string.
</t>

<t>
The value of this property MUST be a string.
The value of this property MUST be a string, which MUST be a media type,
as defined by <xref target="RFC2046">RFC 2046</xref>.
Member

What are the expectations on implementations here? There are a LOT of media types. Are we expected to be able to identify (if not validate) them all?

Contributor Author

See the Implementation Requirements section. The short version is that you aren't expected to be able to identify or validate any of them. The slightly longer version is that it's basically like format, except we are not picking a default set of recognizable values.

I kind of expect most implementations to handle these as annotations passed on to the application, maybe with a callback mechanism to automatically validate the decoded text against contentSchema.

Since this is not changed in this PR (I just moved the phrase "media type" from the first paragraph to the second), I'm not going to hold up this PR on it. Although if you think there is something specific to contentSchema (as opposed to contentMediaType) that needs clarification I would be happy to work on that a bit more.

Contributor Author

Hmm... I could see that people might expect contentSchema to always be implemented, without really thinking about the dependency on having the right plugin (or whatever) for the underlying contentMediaType and contentEncoding values. Is that your concern, @gregsdennis?

Member

My concern is that the "MUST" implies that implementations are required to be able to validate that a given value is a media type as defined by RFC 2046.

If it's like format, then maybe similar language can be used that allows implementations to defer that validation to the application in the form of returning the value as an annotation, as you said.

Contributor Author

@gregsdennis the "Implementation Requirements" section is practically copy-pasted from the format section. What is missing? I don't want to repeat those requirements for each keyword; if you read the spec outside of the diff lines, I think it's already clear. Can you point to specific problems in that section?

Also, I don't hear anything specific to contentSchema here (you're talking about moved text, not new requirements) so please file it as a new issue. This PR is just for contentSchema.
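
The annotation-plus-callback handling discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the registries `ENCODING_DECODERS` and `MEDIA_TYPE_PARSERS`, the function `process_content`, and the `validate_content` callback are hypothetical names, not anything defined by the spec or by this PR. When no plugin is registered for the encoding or media type, the keywords are simply passed back as annotations.

```python
import base64
import json
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical plugin registries (assumptions; not defined by the spec or this PR).
ENCODING_DECODERS: Dict[str, Callable[[str], bytes]] = {
    "base64": lambda text: base64.b64decode(text, validate=True),
}
MEDIA_TYPE_PARSERS: Dict[str, Callable[[bytes], Any]] = {
    "application/json": lambda raw: json.loads(raw.decode("utf-8")),
}

CONTENT_KEYWORDS = ("contentEncoding", "contentMediaType", "contentSchema")


def process_content(
    instance: Any,
    schema: Dict[str, Any],
    validate_content: Optional[Callable[[Any, Any], bool]] = None,
) -> Tuple[Dict[str, Any], Any, Optional[bool]]:
    """Return (annotations, decoded document or None, callback result or None)."""
    # The content keywords are ignored for non-string instances.
    if not isinstance(instance, str):
        return {}, None, None

    # Always hand the keyword values back to the application as annotations.
    annotations = {k: schema[k] for k in CONTENT_KEYWORDS if k in schema}

    parser = MEDIA_TYPE_PARSERS.get(schema.get("contentMediaType"))
    encoding = schema.get("contentEncoding")
    if encoding is None:
        # No contentEncoding: the string needs no additional decoding.
        decoder = lambda text: text.encode("utf-8")
    else:
        decoder = ENCODING_DECODERS.get(encoding)

    # Without a plugin for the encoding or media type, stop at annotations.
    if parser is None or decoder is None:
        return annotations, None, None

    # Decoding or parsing errors are left to propagate to the caller.
    document = parser(decoder(instance))
    result = None
    if "contentSchema" in schema and validate_content is not None:
        # The callback is expected to validate the document against contentSchema.
        result = validate_content(document, schema["contentSchema"])
    return annotations, document, result
```

In this sketch an application would pass its own schema validator as `validate_content`; a schema naming an unregistered media type such as "image/png" yields annotations only.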

</t>

<t>
The value of this property SHOULD be ignored if the instance described is not a
string.
</t>
</section>

<section title="contentSchema">
<t>
If the "contentEncoding" property is not present, but the instance value is a
string, then the value of this property SHOULD specify a text document type,
and the character set SHOULD be the character set into which the JSON string
value was decoded (for which the default is Unicode).
If the instance is a string, and if "contentMediaType" is present, this
property contains a schema which describes the structure of the string.
</t>
<t>
This keyword MAY be used with any media type that can be mapped into
JSON Schema's data model.
</t>
<t>
The value of this property SHOULD be ignored if the instance described is not a
string, or if "contentMediaType" is not present.
</t>
</section>

@@ -847,6 +861,48 @@
Unicode).
</postamble>
</figure>

<figure>
<preamble>
This example describes a JWT that is MACed using the HMAC SHA-256
algorithm, and requires the "iss" and "exp" fields in its claim set.
</preamble>
<artwork>
<![CDATA[
{
"type": "string",
"contentMediaType": "application/jwt",
"contentSchema": {
"type": "array",
"minItems": 2,
"items": [
{
"const": {
"typ": "JWT",
"alg": "HS256"
}
},
{
"type": "object",
"required": ["iss", "exp"],
"properties": {
"iss": {"type": "string"},
"exp": {"type": "integer"}
}
}
]
}
}]]>
</artwork>
<postamble>
Note that "contentEncoding" does not appear. While the "application/jwt"
media type makes use of base64url encoding, that is defined by the media
type, which determines how the JWT string is decoded into a list of two
JSON data structures: first the header, and then the payload. Since the
JWT media type ensures that the JWT can be represented in a JSON string,
there is no need for further encoding or decoding.
</postamble>
</figure>
</section>

</section>
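The JWT example in this diff relies on the media type itself to define how the string maps to a list of two JSON structures, the header and then the payload. A minimal Python sketch of that mapping follows, using only the standard library. The function names are hypothetical, the signature segment is not verified, and `matches_example_schema` is a hand-written check that mirrors the example's "contentSchema" rather than a general JSON Schema validator.

```python
import base64
import json
from typing import Any, List


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs omit."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_to_json_list(token: str) -> List[Any]:
    """Map a compact three-part JWT to [header, payload].

    Assumption: the signature is ignored here; a real consumer must verify it.
    """
    header_b64, payload_b64, _signature = token.split(".")
    return [
        json.loads(b64url_decode(header_b64)),
        json.loads(b64url_decode(payload_b64)),
    ]


def is_json_integer(value: Any) -> bool:
    """JSON Schema "integer": any number with a zero fractional part."""
    if isinstance(value, bool):
        return False
    return isinstance(value, int) or (isinstance(value, float) and value.is_integer())


def matches_example_schema(document: Any) -> bool:
    """Hand-written equivalent of the example contentSchema."""
    # "type": "array", "minItems": 2
    if not isinstance(document, list) or len(document) < 2:
        return False
    header, payload = document[0], document[1]
    # First item: "const" header.
    if header != {"typ": "JWT", "alg": "HS256"}:
        return False
    # Second item: object requiring "iss" and "exp".
    if not isinstance(payload, dict):
        return False
    if "iss" not in payload or "exp" not in payload:
        return False
    return isinstance(payload["iss"], str) and is_json_integer(payload["exp"])
```

Because the base64url step lives inside the media-type mapping, nothing in this sketch consults "contentEncoding", which matches the note in the example's postamble.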