
Should the spec suggest or recommend that validators only process json that validates against the metaschema? #93


Closed
Relequestual opened this issue Oct 13, 2016 · 21 comments

Comments

@Relequestual
Member

As a result of a discussion at json-schema-org/JSON-Schema-Test-Suite#128 (comment), I ask whether the spec should state that it is recommended that a validation library confirm that the schema is valid against the metaschema before processing.
This would avoid issues like version mismatch and elements which may be ambiguous, and on which the library has taken a specific stance.

@iainbeeston

iainbeeston commented Oct 13, 2016

My opinion is that the test suite should only include schemas that are valid, according to the draft metaschema. However, if the metaschema allows something (even if it's contradictory, like {"type": "object", "multipleOf": 5}) then there should be a test for it. I'd argue that users can already validate schemas, by validating a user schema against the metaschema. So there is no need to handle invalid schemas in the test suite itself.

It would be nice if we could specify dependencies that depend on the value of a property (I don't think this is possible?). That would allow us to specify that if the value of the "type" property is "integer" then additional properties are allowed, such as "multipleOf".

@awwright
Member

Normatively, what constitutes a valid schema and what is invalid is defined by the specification, not anything else.

Ideally, though, a valid JSON Schema will always validate against the meta-schema we supply.

Perhaps the test suite should include cases that test that invalid schemas raise an error condition. There's not too many of these, though.

@Relequestual
Member Author

I think the simplest example was something like minimum. Although it's obvious that this applies only to numerical values, the metaschema doesn't specify the requirement. I think I've read somewhere that if the property doesn't apply, then it should just be ignored. My suggestion was that the metaschema COULD be more strict, to pick up on these types of invalid constructions.

I dislike the "it's wrong so ignore it" principle, as I feel it leads to unexpected behaviour, where a user believes the schema should do one thing, but it doesn't, and they don't know why.

@awwright
Member

I'm not sure what you mean, can you provide an example?

"minimum" is never ignored (so to speak), it just doesn't return invalid for instances that aren't numbers (i.e. it returns valid for things when the instance is something besides a number). The schema is perfectly valid.

If you mean that the value for the "minimum" keyword must be a number, that's already a MUST level requirement, and is also reflected in the meta-schema.

@iainbeeston

I'd assume, based on the validation spec, that if you use a property that has no meaning for the given type, then it should be ignored, and we should have tests for that.

For example, section 5.1 is entitled "Validation keywords for numeric instances (number and integer)". All of the rules under that section assume that the value will be a number, and unfortunately it doesn't say what should happen if it is not a number.

@awwright
Member

awwright commented Oct 13, 2016

@iainbeeston In the absence of an explicit invalid result or error, an instance is valid.

Earlier in the document, it says

When the primitive type of the instance cannot be validated by a given keyword, validation for this keyword and instance SHOULD succeed.

This has all been rewritten for the next draft, with the same behavior.

@Relequestual
Member Author

Right, that's what I'm talking about. I'm suggesting that it SHOULD NOT succeed, and that an error be thrown saying validation is being applied to the wrong object type. If people disagree with that, that's fine. It's just my thoughts on making the spec more strict and allowing people to pick up when they are doing something not quite right.

For example, having minimum on a string, I feel, should trigger an error. The metaschema could formalise the relationship between properties and the types they apply to, by requiring that minimum is only valid with a numerical type. If that was defined in the metaschema, any library which validates a schema against the metaschema would trigger a validation error.

@awwright
Member

@Relequestual Errors are only raised if there's something wrong with the schema itself.

A valid schema always produces a valid or invalid result against an instance; an invalid schema always produces an error against an instance, with an indeterminate result.

The reason that non-numbers are ignored for things like "minimum" is so you can have schemas that validate against multiple types of instances:

{ "minimum": 0, "minLength": 1, "type": ["string", "number"] }
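
To make the multi-type case concrete, here is a minimal sketch in Python. It is a toy that handles only "type", "minimum" and "minLength", not a real JSON Schema implementation, and the names is_valid, is_number and TYPE_CHECKS are made up for illustration. Each keyword only constrains instances of the type it applies to.

```python
# Toy validator for three keywords only; NOT a full JSON Schema implementation.
# All function names here are hypothetical, chosen for this illustration.

def is_number(value):
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": is_number,
}

def is_valid(instance, schema):
    # "type" is the only keyword here that rejects based on the instance type.
    if "type" in schema:
        allowed = schema["type"]
        if isinstance(allowed, str):
            allowed = [allowed]
        if not any(TYPE_CHECKS[name](instance) for name in allowed):
            return False
    # "minimum" only constrains numbers; any other instance passes it.
    if "minimum" in schema and is_number(instance):
        if instance < schema["minimum"]:
            return False
    # "minLength" only constrains strings; any other instance passes it.
    if "minLength" in schema and isinstance(instance, str):
        if len(instance) < schema["minLength"]:
            return False
    return True

schema = {"minimum": 0, "minLength": 1, "type": ["string", "number"]}

for instance in ["abc", "", 5, -1, None]:
    print(repr(instance), is_valid(instance, schema))
```

With this schema, "abc" and 5 are accepted, "" fails "minLength", -1 fails "minimum", and None fails "type". If "minimum" rejected every non-number, no string could ever pass this schema.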

@Relequestual
Member Author

OK. Hadn't considered that. In which case, that's a valid use case for saying no to my specific question regarding ignoring validation properties which are type specific.

@iainbeeston

So, can I clarify what the outcome of this should be? Can we add more tests to define what should happen when a property (eg. multipleOf or minimum) is used for the wrong type?

@Relequestual
Member Author

Relequestual commented Oct 13, 2016

@iainbeeston the correct behaviour is to ignore said property, as it's possible to have an array of possible types. This is as intended.

I guess it could be possible to write a metaschema such that if it's an array of types, that one of them must be a numerical type for property minimum to be allowed, for example, but it DOES create more complexity.

Given that there's a valid use case (array of possible types), I'm satisfied that it's not just an oversight, and as such, will close this issue.

@awwright
Member

That's probably the biggest use case, but the real reason is that (1) verifying the instance is of the expected type, and (2) verifying (for example) that a string has a minimum length, are two different tests that one might want to combine.

Or to put it another way, why should "minimum" invalidate all non-number instances, when the "type" keyword is perfectly capable of doing that instead?
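
A rough sketch of that separation, in Python, with each test as its own function. The helper names (check_minimum, check_type_number, validate) are hypothetical and this is not how any particular library is structured. The range test never rejects on type; adding the type test is an explicit, separate choice.

```python
def is_number(value):
    # JSON booleans are not numbers, even though bool subclasses int in Python.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def check_minimum(instance, minimum):
    # Not applicable to non-numbers, so it cannot fail them.
    return (not is_number(instance)) or instance >= minimum

def check_type_number(instance):
    # The type test is the only one that rejects non-numbers.
    return is_number(instance)

def validate(instance, checks):
    # An instance is valid when every selected test passes.
    return all(check(instance) for check in checks)

# Roughly { "minimum": 0 }: a range test on its own.
range_only = [lambda v: check_minimum(v, 0)]
# Roughly { "type": "number", "minimum": 0 }: type test plus range test.
number_in_range = [check_type_number, lambda v: check_minimum(v, 0)]

print(validate("abc", range_only))       # string passes the range test alone
print(validate("abc", number_in_range))  # rejected by the type test
```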

@Relequestual
Member Author

I was suggesting that the validator should fail the given schema based on the fact it doesn't validate against the metaschema. I was suggesting the validator shouldn't even start parsing the JSON till the Schema is considered valid.

Actually, that was what this issue was about, not the specific example. The spec should recommend that validation of the schema against the metaschema should happen before any attempt to validate the json against the schema.
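
As a sketch of the ordering being proposed, the Python below checks the schema first and only then looks at the instance. Everything here is an assumption for illustration: SchemaError, check_schema, validate and try_validate are made-up names, and the hand-written checks stand in for real validation against the metaschema. A malformed schema raises an error, while a well-formed one always yields a valid or invalid result.

```python
class SchemaError(Exception):
    """Raised when the schema itself is malformed (hypothetical name)."""

def is_number(value):
    # JSON booleans are not numbers, even though bool subclasses int in Python.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def check_schema(schema):
    # Stand-in for validating against the metaschema: a few hand-written checks.
    if not isinstance(schema, dict):
        raise SchemaError("schema must be an object")
    if "minimum" in schema and not is_number(schema["minimum"]):
        raise SchemaError('"minimum" must be a number')
    if "minLength" in schema:
        n = schema["minLength"]
        if not isinstance(n, int) or isinstance(n, bool) or n < 0:
            raise SchemaError('"minLength" must be a non-negative integer')

def validate(instance, schema):
    # Fail fast: reject a malformed schema before touching the instance.
    check_schema(schema)
    if "minimum" in schema and is_number(instance):
        if instance < schema["minimum"]:
            return False
    if "minLength" in schema and isinstance(instance, str):
        if len(instance) < schema["minLength"]:
            return False
    return True

def try_validate(instance, schema):
    # Separates the three outcomes: True, False, or a schema error.
    try:
        return validate(instance, schema)
    except SchemaError as exc:
        return "error: " + str(exc)

print(try_validate(5, {"minimum": 0}))      # valid result
print(try_validate(-5, {"minimum": 0}))     # invalid result
print(try_validate(5, {"minimum": "0"}))    # schema error, no result
```

Note that {"minimum": 0} applied to a string is still a well-formed schema in this sketch, so it produces a valid result rather than an error.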

@Relequestual Relequestual reopened this Oct 13, 2016
@awwright
Member

To reiterate my point about that, the final authority as to which schemas are valid and which aren't, is the specification, not the meta-schema.

If we're going to change that, then we'd need to embed the meta-schema in the specification and that might get ugly.

And at some point, defining what the schema means does have to be done in prose, just like how some poor sod had to write the first compiler by punching holes in a card.

@Relequestual
Member Author

Relequestual commented Oct 13, 2016

So what's the meta-schema for then?

I agree definitions have to be done in prose.

@awwright
Member

The meta-schema identifies the version/vocabulary being used by a schema. And it should be totally possible to use JSON Schema to validate schemas, if we want to let this be a declarative replacement for procedurally verifying data.

@Relequestual
Member Author

As I said, if I were writing a validation library, I'd want to first validate the schema before I use it for validation. It feels like best practice, and after all, it is what json-schema actually enables you to do with other json documents. It feels hypocritical NOT to validate json before using it within a json validation library.

On reflection, a nice test in the test suite might be, can the library validate the metaschema against the metaschema... ;)

@awwright
Member

My rationale for why I don't is: if there's not a problem that's going to interfere with validation of the instance, then why bother?

@handrews
Contributor

handrews commented Oct 13, 2016

@Relequestual I want to emphasize how important the "applicability" feature (keywords can only fail the validation of instances to which they are applicable) is. Consider:

{
    "type": "object",
    "not": {
        "properties": {"foo": {}}
    }
}

If "properties" implied "object", then this would be an impossible schema because both the outer and inner schema would require the instance to be an object. However, this works because the type is only addressed outside of the "not", so there is no chance for the inner schema to behave differently based only on the instance's type.

@Relequestual
Member Author

Both valid points.
@handrews that makes a lot of sense now you've laid that out clearly for me. Thanks. I agree.
@awwright right, speed is an issue. Thanks.

4 participants