Should the spec suggest or recommend that validators only process json that validates against the metaschema? #93

Relequestual · 2016-10-13T09:19:03Z

As a result of a discussion at json-schema-org/JSON-Schema-Test-Suite#128 (comment), I ask whether the spec should state that it is recommended that a validation library confirm that the schema is valid against the metaschema before processing.
This would avoid issues like version mismatch and elements which may be ambiguous, and on which the library has taken a specific stance.

iainbeeston · 2016-10-13T09:46:45Z

My opinion is that the test suite should only include schemas that are valid, according to the draft metaschema. However, if the metaschema allows something (even if it's contradictory, like {"type": "object", "multipleOf": 5}) then there should be a test for it. I'd argue that users can already validate schemas, by validating a user schema against the metaschema. So there is no need to handle invalid schemas in the test suite itself.

It would be nice if we could specify dependencies that depend on the value of a property (I don't think this is possible?). That would allow us to specify that if the value of the "type" property is "integer" then additional properties are allowed, such as "multipleOf".

awwright · 2016-10-13T09:49:05Z

Normatively, what constitutes a valid schema and what is invalid is defined by the specification, not anything else.

Ideally, though, a valid JSON Schema will always validate against the meta-schema we supply.

Perhaps the test suite should include cases that test that invalid schemas raise an error condition. There's not too many of these, though.

Relequestual · 2016-10-13T10:05:39Z

I think the simplest example was something like minimum. Although it's obvious that this applies only to numerical values, the metaschema doesn't specify the requirement. I think I've read somewhere that if the property doesn't apply, then it should just be ignored. My suggestion was that the metaschema COULD be stricter, to pick up on these types of invalid constructions.

I dislike the "it's wrong so ignore it" principle, as I feel it leads to unexpected behaviour, where a user believes the schema should do one thing, but it doesn't, and they don't know why.

awwright · 2016-10-13T10:09:15Z

I'm not sure what you mean, can you provide an example?

"minimum" is never ignored (so to speak), it just doesn't return invalid for instances that aren't numbers (i.e. it returns valid for things when the instance is something besides a number). The schema is perfectly valid.

If you mean that the value for the "minimum" keyword must be a number, that's already a MUST level requirement, and is also reflected in the meta-schema.

iainbeeston · 2016-10-13T10:24:26Z

I'd assume, based on the validation spec, that if you use a property that has no meaning for the given type, then it should be ignored, and we should have tests for that.

For example, section 5.1 is entitled "Validation keywords for numeric instances (number and integer)". All of the rules under that section assume that the value will be a number, and unfortunately it doesn't say what should happen if it is not a number.

awwright · 2016-10-13T10:28:55Z

@iainbeeston In the absence of an explicit invalid result or error, an instance is valid.

Earlier in the document, it says

"When the primitive type of the instance cannot be validated by a given keyword, validation for this keyword and instance SHOULD succeed."

This has all been rewritten for the next draft, with the same behavior.

Relequestual · 2016-10-13T10:36:44Z

Right, that's what I'm talking about. I'm suggesting that it SHOULD NOT succeed, and that an error be thrown saying validation is being applied to the wrong type. If people disagree with that, that's fine. It's just my thoughts on making the spec more strict and allowing people to pick up when they are doing something not quite right.

For example, having minimum on a string, I feel, should trigger an error. The metaschema could formalise the relationship between properties and the types they apply to, by requiring that minimum is only valid with a numerical type. If that was defined in the metaschema, any library which validates a schema against the metaschema would trigger a validation error.

awwright · 2016-10-13T10:53:57Z

@Relequestual Errors are only raised if there's something wrong with the schema itself.

A valid schema always produces a valid or invalid result against an instance; an invalid schema always produces an error against an instance, with an indeterminate result.

The reason that non-numbers are ignored for things like "minimum" is so you can have schemas that validate against multiple types of instances:

{ "minimum": 0, "minLength": 1, "type": ["string", "number"] }
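
To make that behaviour concrete, here is a minimal sketch in Python of how such a schema could be evaluated. It is not taken from any real validation library: the function name `check` is hypothetical, and it only understands the three keywords shown above (with "type" limited to "string" and "number").

```python
def check(schema, instance):
    """Return True if `instance` passes the keywords in `schema` (sketch only)."""
    # bool is a subclass of int in Python, so exclude it from "number".
    is_number = isinstance(instance, (int, float)) and not isinstance(instance, bool)
    is_string = isinstance(instance, str)

    # "type" is the keyword that actually restricts the instance type.
    if "type" in schema:
        allowed = schema["type"]
        if isinstance(allowed, str):
            allowed = [allowed]
        matches = ("number" in allowed and is_number) or \
                  ("string" in allowed and is_string)
        if not matches:
            return False

    # "minimum" only applies to numbers; any other instance passes it.
    if "minimum" in schema and is_number and instance < schema["minimum"]:
        return False

    # "minLength" only applies to strings; any other instance passes it.
    if "minLength" in schema and is_string and len(instance) < schema["minLength"]:
        return False

    return True


schema = {"minimum": 0, "minLength": 1, "type": ["string", "number"]}
print(check(schema, 5))      # True: a number that is >= 0
print(check(schema, -1))     # False: fails "minimum"
print(check(schema, "a"))    # True: "minimum" does not apply to strings
print(check(schema, ""))     # False: fails "minLength"
print(check(schema, None))   # False: rejected by "type", not by "minimum"
```

Without the "type" keyword, `check({"minimum": 0}, "abc")` returns True in this sketch, because "minimum" has nothing to say about strings.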

Relequestual · 2016-10-13T10:56:34Z

OK. Hadn't considered that. In which case, that's a valid use case for saying no to my specific question regarding ignoring validation properties which are type specific.

iainbeeston · 2016-10-13T11:16:42Z

So, can I clarify what the outcome of this should be? Can we add more tests to define what should happen when a property (eg. multipleOf or minimum) is used for the wrong type?

awwright · 2016-10-13T11:21:37Z

Most of the tests should already verify the correct behavior, e.g.:

If we're missing one, then yes, let's add a test case.

Relequestual · 2016-10-13T11:35:08Z

@iainbeeston the correct behaviour is to ignore said property, as it's possible to have an array of possible types. This is as intended.

I guess it could be possible to write a metaschema such that if it's an array of types, that one of them must be a numerical type for property minimum to be allowed, for example, but it DOES create more complexity.

Given that there's a valid use case (array of possible types), I'm satisfied that it's not just an oversight, and as such, will close this issue.

awwright · 2016-10-13T11:38:57Z

That's probably the biggest use case, but the real reason is actually that (1) verifying the instance is of the expected type, and (2) verifying (for example) a string has a minimum length, are two different tests that one might want to compare.

Or to put it another way, why should "minimum" invalidate all non-number instances, when the "type" keyword is perfectly capable of doing that instead?

Relequestual · 2016-10-13T11:43:58Z

I was suggesting that the validator should fail the given schema based on the fact it doesn't validate against the metaschema. I was suggesting the validator shouldn't even start parsing the JSON till the Schema is considered valid.

Actually, that was what this issue was about, not the specific example. The spec should recommend that validation of the schema against the metaschema should happen before any attempt to validate the json against the schema.
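
As a rough illustration of that two-phase order, the sketch below checks the schema first and only then looks at the instance. Everything in it is an assumption made for illustration: `SchemaError`, `check_schema` and `validate` are hypothetical names, and `check_schema` is a hand-written stand-in that enforces a single rule (the value of "minimum" must be a number) rather than a real validation against the metaschema.

```python
class SchemaError(Exception):
    """Raised when the schema itself is malformed (hypothetical name)."""


def check_schema(schema):
    # Stand-in for validating the schema against the metaschema.
    # Only one rule is enforced here: "minimum" must be a number.
    if not isinstance(schema, dict):
        raise SchemaError("schema must be an object")
    if "minimum" in schema:
        value = schema["minimum"]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise SchemaError('"minimum" must be a number')


def validate(schema, instance):
    # Phase 1: reject a bad schema before touching the instance.
    check_schema(schema)

    # Phase 2: evaluate the instance. "minimum" only applies to numbers.
    is_number = isinstance(instance, (int, float)) and not isinstance(instance, bool)
    if "minimum" in schema and is_number:
        return instance >= schema["minimum"]
    return True


print(validate({"minimum": 0}, 3))    # True
print(validate({"minimum": 0}, -3))   # False
try:
    validate({"minimum": "0"}, 3)
except SchemaError as err:
    print("schema error:", err)
```

A valid schema yields a True or False result for the instance, while a malformed schema raises an error before the instance is examined at all.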

awwright · 2016-10-13T11:55:17Z

To reiterate my point about that, the final authority as to which schemas are valid and which aren't, is the specification, not the meta-schema.

If we're going to change that, then we'd need to embed the meta-schema in the specification and that might get ugly.

And at some point, defining what the schema means does have to be done in prose, just like how some poor sod had to write the first compiler by punching holes in a card.

Relequestual · 2016-10-13T11:58:26Z

So what's the meta-schema for then?

I agree definitions have to be done in prose.

awwright · 2016-10-13T12:03:44Z

The meta-schema identifies the version/vocabulary being used by a schema. And it should be totally possible to use JSON Schema to validate schemas, if we want to let this be a declarative replacement for procedurally verifying data.

Relequestual · 2016-10-13T12:30:15Z

As I said, if I were writing a validation library, I'd want to first validate the schema before I use it for validation. It feels like best practice, and is, after all, what json-schema actually enables you to do with other json documents. It feels hypocritical NOT to validate json before using it within a json validation library.

On reflection, a nice test in the test suite might be, can the library validate the metaschema against the metaschema... ;)

awwright · 2016-10-13T13:47:52Z

My rationale for why I don't is: if there's not a problem that's going to interfere with validation of the instance, then why bother?

handrews · 2016-10-13T14:47:21Z

@Relequestual I want to emphasize how important the "applicability" feature (keywords can only fail the validation of instances to which they are applicable) is. Consider:

{
    "type": "object",
    "not": {
        "properties": {"foo": {}}
    }
}

If "properties" implied "object", then this would be an impossible schema because both the outer and inner schema would require the instance to be an object. However, this works because the type is only addressed outside of the "not", so there is no chance for the inner schema to behave differently based only on the instance's type.

Relequestual · 2016-10-14T09:21:05Z

Both valid points.
@handrews that makes a lot of sense now you've laid that out clearly for me. Thanks. I agree.
@awwright right, speed is an issue. Thanks.

Relequestual closed this as completed Oct 13, 2016

Relequestual reopened this Oct 13, 2016

Relequestual closed this as completed Oct 14, 2016
