-
Notifications
You must be signed in to change notification settings - Fork 9.1k
fix: removes string type for large numbers as it's the wrong level of abstraction #4588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… abstraction Signed-off-by: Vincent Biret <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON does not provide for reliably transmitting integers outside of the range [-(253)+1, (253)-1], and does not provide any guarantee at all for reliably transmitting fixed point decimal. All of these need to continue to allow strings, otherwise interoperability is impossible.
@karenetheridge let me know if that rationale is enough for you and I can close this PR |
@handrews I don't see anything in https://ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf about restrictions for valid number ranges. JSON itself is just text. A number can be of any length. How a particular architecture chooses to represent that internally is up to it. Obviously a "native" number has size limitations, and different languages have different options for dealing with that. e.g. Rust has the BigInt type which can represent numbers larger than its native int32, int64 etc. An implementation can choose to represent its numbers internally however it likes; if it encodes to JSON that encoder would need to know how to handle any custom types, but this is no different from needing to understand how to encode/decode objects from a custom representation if there isn't a native object type in that language, etc. Remember that the document model we're using is at a layer of abstraction above the implementation itself. As long as the implementation is consistent and clear about how it represents the data types, and can encode/decode to JSON (or other interoperable data formats like YAML or CSV) consistently, then from the OpenAPI/JSON Schema perspective we're all good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @baywet!
@karenetheridge the interoperability limits are documented in RFC8259 §6 (emphasis added):
|
I'd also like to point out that we've been telling people to encode numbers as strings for better interoperability for at least as long as I've been around the project, and yanking that now seems likely to be confusing and frustrating for anyone who followed our advice. Which is advice you can find around the internet, particularly in the context of financial work where numeric interpretation interoperability is of critical importance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JavaScript-based implementations parse JSON numbers into binary64 numbers in memory, which cannot represent all values of int64 or decimal128 numbers.
Which is why these "long" numbers are represented as JSON strings.
This fact is documented here for the corresponding formats.
I would be fine with removing "number" as a "base type" because interoperable senders will always use strings on the wire.
Removing "string" would be confusing for consumers of OpenAPI descriptions because they will see these formats used as for example
monetaryAmount:
type: string
format: decimal
Maybe we should rename the field/column "JSON Data Type" to "JSON representation". As @karenetheridge pointed out JSON is a textual format for representing and exchanging data, and accepted practice for representing large numbers in JSON without loss of precision is to use JSON strings. |
@karenetheridge based on #4585 (comment) can I close this PR then? or do you have changes to suggest? |
After a good conversation with @hudlow I have realized I was wrong here (in my comments in #4585). To summarize, some languages like javascript have difficulty parsing JSON with very large integers into numbers (it would seem that there is no JSON decoder that knows how to use and parse values into custom types to represent these values). So for javascript users, the normal way they'd handle this is to represent the number as a string, and then parse it into an integer on the application side. And then we'd need something in the schema itself to indicate to the application that this value is intended to be a number -- and the obvious way of doing that is with the 'format' keyword, which could be used as either just an annotation, or an assertion as well. So, we would want the format to allow strings for these values. That doesn't mean that the numeric keywords ( Conclusion: allowing these formats to be I'd also suggest that all precision-based number formats be allowed to be strings (e.g. int64), in order to accomodate architectures with smaller integer sizes (e.g. 32 bit architectures that would have difficulty with an int64). tldr: I think this PR can be closed without merging, and perhaps we need an article on the learn site talking about handling numbers with some recommendations for various languages? |
@karenetheridge agreed on allowing all to be strings. This is important when working with serialization formats (XML and BTW, the correct JSON base type(s) are |
rationale #4585 (comment)