Skip to content

format: byte also defaults to octet-stream (3.0.4) #3922

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 23, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions versions/3.0.4.md
Original file line number Diff line number Diff line change
Expand Up @@ -1717,8 +1717,8 @@ The default values for `contentType` are as follows, where an _n/a_ in the `form

Property `type` | Property `format` | Default `contentType`
------------- | --------------- | ---------------------
`string` | `binary` | `application/octet-stream`
`string` | _none, or any except `binary`_ | `text/plain`
`string` | `binary` _or_ `byte` | `application/octet-stream`
`string` | _none, or any except `binary` or `byte`_ | `text/plain`
Comment on lines +1720 to +1721
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So binary and byte are synonyms for serializing parts of a multipart request body? Any guidance on which is the preferred synonym?

Also: why does encoding and the Encoding Object not apply to multipart response bodies?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am uncomfortable with this change. In general, I think format: byte means "base64 encoded" and the purpose of base64 encoding is to make the value safe for a string field.

Also, the "considerations for multipart" section in 3.0.3 does not actually mention format: byte -- it mentions format: base64, which not only is absent from the "formats defined by the OAS" table in the "Data Types" section but is not even listed in the Formats registry.

My conclusion here is that the "considerations for multipart" section is actually the place to be corrected by removing format: base64.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the format: base64 typo in 3.0.3 has been corrected in 3.0.4 via #3182.

Not mentioning format: byte here (by closing this PR without merging) would make me feel more comfortable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ralfhandl

So binary and byte are synonyms for serializing parts of a multipart request body? Any guidance on which is the preferred synonym?

binary and byte are not synonyms, and are explained in he "Data Types" section, particularly in the "Working with Binary Data" subsection.

Also: why does encoding and the Encoding Object not apply to multipart response bodies?

It applies to multipart and application/x-www-form-urlencoded, but what does that have to do with this PR?

@mikekistler

I am uncomfortable with this change. In general, I think format: byte means "base64 encoded" and the purpose of base64 encoding is to make the value safe for a string field.

It's not a change. I just made a mistake and forgot to include it and no one caught it because there had been a mistake in another part of the text which made it look like only binary is correct But that was wrong.

The reason is that, in multipart, Content-Type refers to the actual content type of the data part. The encoding that makes it safe for a string is placed in Content-Transfer-Type. This should be a bit more clear after PR #3923. So yes, it is made safe for a string, but no, it does not change the Content-Type header, and therefore does not change the default for contentType compared to format: binary.

The spec has always stated that format: byte defaults application/octet-stream, and in 3.1 it states even more clearly that any contentEncoding (including but not limited to base64) defaults to application/octet-stream. So this is not a change.

Also, the "considerations for multipart" section in 3.0.3 does not actually mention format: byte -- it mentions format: base64, which not only is absent from the "formats defined by the OAS" table in the "Data Types" section but is not even listed in the Formats registry.

As @ralfhandl mentioned, base64 was always an error and had been fixed - if you look in PR #3923 you'll see it was already format: byte in 3.0.4. I know we discussed whether to add base64 to the registry but I guess we decided not to, and to emphasize that byte was correct. I can't find the discussion right now. Either way, whether base64 is in the registry is irrelevant to this PR.

My conclusion here is that the "considerations for multipart" section is actually the place to be corrected by removing format: base64.

goes with:

@ralfhandl

Not mentioning format: byte here (by closing this PR without merging) would make me feel more comfortable.

That would make the spec incorrect. It's inconsistent with the text from the Media Type object (which is more complete and correct in both 3.0 and 3.1) and is inconsistent with how the analogous contentEncoding is used in 3.1.

This PR is just a bug fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikekistler in response to your question in the TDC call, the Content-Transfer-Encoding requirement is in the last paragraph of this section (or, more accurately, it will be after PR #3923 is merged - that rendering includes both that PR and this one).

And yes, I want to acknowledge that 3.0.3 was not at all clear about Content-Transfer-Encoding, and the inconsistent use of byte and base64 made things worse. (byte was always what was defined in the "Data Types" section, but 3.0.3 and earlier mistakenly used base64 in some examples - we debated it a lot b/c obviously base64 is what would be used anywhere else, but we ultimately agreed byte was what had been intended and is what should be used.)

`number`, `integer`, or `boolean` | _n/a_ | `text/plain`
`object` | _n/a_ | `application/json`
`array` | _n/a_ | according to the `type` and `format` of the `items` schema
Expand Down