-
-
Notifications
You must be signed in to change notification settings - Fork 590
Add a helper for extending a validator to reject unknown formats #252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hey! Thanks for the report. I'm generally not in favor of Also, this can already be done -- use |
Hi Julian! Thanks for your input. Not sure I understand your reasoning, but it's early morning here so that might be why 😉 The reason I opened this issue is because I have a REST API where users push in data, according to a specific JSON schema. I want them to get a 400 Bad Request if I encounter data that I will not be able to parse, like a date not formatted according to ISO 8601. If this can already be done I would love to see how. Thanks man! |
Your users are sending you schemas to use or instances to validate? It sounds like you're saying instances? |
Yup, they are sending instances for me to validate. |
Hm, OK, then I'm confused :), how does that relate to this ticket? |
That's because I was confused... Ignore my previous messages. What I'm looking for as I stated originally is a way to become aware of formats that will not be validated. As a developer it's hard to notice if jsonschema just ignores the In [1]: import jsonschema
In [2]: print(jsonschema.validate("xy", {"type": "string", "format" : "time-hour"}, format_checker=jsonschema.FormatChecker()))
None How would I catch this? As an author of the schemas I want to be aware of which format checks I need to create. |
I see, OK, so now possibly my first comment might make sense: The reason I don't generally think strict modes are useful is (among other things), as a developer, how would you ever know that you need strict mode unless you already knew that your instance contained formats that didn't exist? It sounds like for you that you just want to do this once possibly? Or once when you get a new schema, to see if there's any formats that someone put in there that you need to implement? You can extend a metaschema to define that formats other than the known ones are invalid. E.g.:
Might be useful to have that as a utility you can compose onto a validator, but needs a bit of cleaning up to handle the general case, you'd have to do it a bit different to use the |
+1 for having some way of handling this, though I have no opinion about the best way to plumb it through. I was getting silent failures with no indication that anything was wrong. This can easily happen due to a typo, for example. Since schemas are used to certify that data is valid, I think it's problematic to pass through an unknown format and say "everything is OK". I personally would favor making "failure on unknown format" the default, but I'm guessing that would break existing use cases. If we can't make it default, I think having a way to turn on a "strict mode" would be very useful. |
(a more official way that is) |
Not to mention completely violate the specification. The simplest way to handle this during development is to use a locally meta-schema that puts an |
I say this quite a bit, but schemas are not a replacement for unit tests :) -- so "silent" here is "silent" in the same sense as if you wrote buggy code and didn't notice until your test failed. It won't be the default, for the reason @handrews mentioned, this library is an implementation of the specification, but I'm still happy to accept a helper function for adding an enum restriction to a metaschema. Though increasingly I'm thinking there'll shortly be a |
Yes, the spec does say an implementation should (though not shall) succeed if it sees an unknown format (http://json-schema.org/latest/json-schema-validation.html#rfc.section.7.2). However, it says nothing about providing an option to behave differently. Regardless, I think specs should come from use cases, as their point is to standardize systems for solving problems. I think my use case here is legitimate, although I can also understand those who want to silently pass through an unknown format. My point is that providing the option to do both is useful. If the spec forbids making my way the default, no problem, but having a way to enforce the constraint is important. I would love to be able to set a property like @handrews, if I use a meta-schema, can I also validate against the main jsonschema spec? AFAIK, you can't validate against multiple meta-schemas. That said, uses a meta-schema doesn't seem like a very "obvious" / convenient way of handling it, so it's not something most people would think of. Having an easier way would be helpful.
Sure, schemas are not a replacement for unit tests, just as writing good, defensive code is not a replacement for unit tests. But, schemas are used for enforcing constraints, and in this case, I have a constraint to enforce, so ideally there should be a way to express that in the schema language.
What method do you propose? If you sketch out the gist of the design, I'd be happy to code it up if I have the time. |
@martingkelly Adding this line to your meta-schema will include the base meta-schema: allOf: [ { $ref: "http://json-schema.org/draft-06/schema#" } ] |
Ah, I hadn't thought of using |
That said, I'm not sure how to enforce the
This is expressed in YAML, but it parses the same. |
Option, sure. Default, absolutely not.
The primary focus of draft-08 is improving extensibility. Look at issues tagged with "vocabulary" in the spec repo. It's a huge topic and I've been working on it for months. Still lots to work out.
Because it's a gigantic pain right now. Look at how the hyper-schema meta-schema currently handles it (it has to re-define a lot of things). Search the spec repo issues for
The next draft should define how extensibility works. While the exact programmatic interface for extensions will, of course, be language- and implementation-specific, the overall structure and capabilities will be defined. Again, I'm being vague because this topic is enormous and an active area of work right now. |
The spec even has a note on how hard it is. I'm so looking forward to removing that note and replacing it with an actual solution in draft-08. |
OK, I'm glad to hear this is being worked on. Can you expand on exactly what was done to make the hyper-schema work? I tried to add |
Copy the hyper-schema meta-schema. Change
As someone who has been maintaining a spec for a couple of years, I can tell you that pretty much everyone who has ever had their request declined uses this argument to mean "my use case is the use case you should solve", regardless of its impact on everyone else, or whether it fits the overall philosophy of the system. |
Is there a way to do this without having to copy the hyper-schema? I was hoping to use
Of course you can't and shouldn't try to please everyone. But, to the best of my knowledge, there is no similar spec to json schema, except if I were to use XML (which I don't want to do). So if I say "OK, I guess I need to find a spec that supports my use case", I'm left with inventing my own thing and duplicating the 90% of stuff that I do want from json schema. I would say that's a lose-lose for everyone. There's always a balance between trying to please everyone and being too inflexible, and it's always a judgment call. |
That is the whole point of the warning in the spec that I linked to, and the motivation for As you can see from the fact that
I'm reasonably confident that the current balance on media types (JSON Schema is defined over a data model rather than text, but otherwise does not require or constrain how implementations handle different media types) is in a good place. These things never get universal agreement, but there is a good level of flexibility in the current approach. |
I'm going to close this since I believe as of draft2019+ (which should come in the next release) that this is possible to do purely within JSON Schema (the spec) using If that's not the case, happy to revisit (and if someone does write said schema, feel free to share it). |
It would be nice if a strict mode could be turned on for
FormatChecker
. By strict I mean thatjsonschema
will raise an error if it encounters a format that it doesn't recognize.The text was updated successfully, but these errors were encountered: