Skip to content
This repository was archived by the owner on Mar 19, 2019. It is now read-only.

v5: "ban unknown properties" alternative #151

Closed
orbisvicis opened this issue Jan 21, 2015 · 12 comments
Closed

v5: "ban unknown properties" alternative #151

orbisvicis opened this issue Jan 21, 2015 · 12 comments

Comments

@orbisvicis
Copy link

schema =\
    { "definitions":
        { "properties":
            { "p1":
                { "type": "string"
                }
            , "p2":
                { "type": "integer"
                }
            }
        }
    , "title": "JSON-Schema v5 Proposal"
    , "description": "Resolve 'additionalProperties' and combiners by extending '$ref`"
    , "properties":
        { "childp1":
            { "type": "string"
            }
        , "extend":
            { "$ref": "#/definitions/properties"
            , "extend": true
            , "description": "Multiple ideas: 1) '$ref' attached via 'extend' parent keyword 2) '#ref' attach via anything, with '"extend": True' keyword"
            }
        }
    , "required": [ "p1", "p2", "childp1" ]
    , "additionalProperties": false
    }

Problems?

@fge
Copy link

fge commented Jan 21, 2015

No, that can't work. This proposal has the same problem as how required was defined in draft v3: the value of object members of a properties value are supposed to be full fledged schemas; which means this extend keyword has no meaning when not attached to a parent schema.

@orbisvicis
Copy link
Author

I think we're not on the same page. To be clear, manually inlining all the references should yield a valid schema:

schema =\
    { "title": "JSON-Schema v5 Proposal"
    , "description": "Resolve 'additionalProperties' and combiners by extending '$ref`"
    , "properties":
        { "childp1":
            { "type": "string"
            }
        , "p1":
            { "type": "string"
            }
        , "p2":
            { "type": "integer"
            }
        }
    , "required": [ "p1", "p2", "childp1" ]
    , "additionalProperties": false
    }

Of the two approaches, are you saying that only the first can't work?

  1. $ref attached via extend parent keyword
  2. $ref attach via anything, with "extend": True' keyword

I don't see why the second can't. Technically, { "$ref": "#/..." } is not a valid schema, but a reference to one which modifies the parser's behaviour. Why can't we have { "$ref": "#/...", "extend": true } which modifies it even more?

@fge
Copy link

fge commented Jan 21, 2015

Why can't we have { "$ref": "#/...", "extend": true } which modifies it even more?

Because by defintion a JSON Object with a member whose name is "$ref" is a JSON Reference; and when you encountered a JSON Reference, again by definition, all other members of this JSON Object SHALL be ignored. That's why...

Your "extend" would be ignored. And then it suffers the same problem as "required" in draft v3, only worse!

Also, please recall that JSON Schema cannot treat JSON Objects in any privileged manner. It has equal treatment for any valid JSON value which, since RFC 7159, means any JSON text as well.

@orbisvicis
Copy link
Author

Also, please recall that JSON Schema cannot treat JSON Objects in any privileged manner.

I'm not really sure what this means. I assume it means my first approach is also invalid:

  1. $ref attached via extend parent keyword

Doesn't JSON Schema treat $ref and $data in a privileged manner? Furthermore $data is specific to JSON Schema. If so, why not create another privileged keyword:

    { "$extendref: "a json pointer" }

Anyway it's a shame that the JSON Reference specification explicitly forbids any such extensions. I also had a third suggestion:

{ "$ref": "#/definitions/key/*"}

where the "/" symbolizes "extend": true. I believe the JSON Pointer specification requires interpreting "" as a named-element to the path, making it invalid. It would great if you could confirm. Besides, this sort of xpath-style behaviour could be its own specification.

@fge
Copy link

fge commented Jan 21, 2015

Doesn't JSON Schema treat $ref and $data in a privileged manner?

$ref is out of JSON Schema's control; it is specified by another IETF draft. And the draft says that if a member is encountered by that name in a JSON Object, then any other members are ignored.

Which, imnsho, is not a bad thing.

I also had a third suggestion:

{ "$ref": "#/definitions/key/*"}

where the "/*" symbolizes "extend": true

Can't work... Remind that "*" is a valid JSON Object member name. And so is the empty string for that matter!

As I said, I didn't propose $merge and $patch out of the blue. I am fully aware of all the constraints on JSON Schema, of all the keywords and of all the shortcomings; those two keywords, in addition to strictProperties which is a keyword dedicated to JSON Objects, allow you full control over modifying your schemas on the fly.

@orbisvicis
Copy link
Author

Ok, this is good news!

I know that you (still) prefer your proposal, but in your nsho fully aware knowledge of JSON and JSON Schema, creating a new keyword is technically possible, right?

{ "$extendref: "a json pointer" }

@fge
Copy link

fge commented Jan 21, 2015

Well, what you propose is basically schema merging, not plain JSON merging.

The idea was touted at some point, but no clean solution could ever be achieved :/

Maybe the community didn't try hard enough at the time; but still, I do believe it is preferrable to specify mechanisms which are guaranteed to work (since standards are there to attest for those) and leave the "fine points" to implementations.

Ultimately, what are we talking about here? Internal implementation. The points we are discussing, the user will never see any of those. Whether that form he fills, which is described by a JSON Schema, is stored on the server in one big document-oriented database, or in 3 tables in an RDBMS, or even in a text file, is not the user's problem, but the implementors; and in a similar manner, an implementation of JSON Schema has no need to be aware of storage problems! JSON Schema does not describe storage.

In this sense, I believe that any keyword which adds semantic meaning to JSON Schema, which is what your proposal does, is a mistake. With $merge and $patch, you can instruct an implementation to just patch schemas, and it will be done, now whether that patching has an actual semantic meaning to you is, and should be, irrelevent to JSON Schema implementations!

@orbisvicis
Copy link
Author

(This is a continuation from json-schema/issues#193)

Regarding $merge, $patch

Given two schemas A and B both of which define required, json-merge-patch is unable to automatically create a schema C which merges the required values from both.

Neither is JSON Patch. But here you talk about two independent schemas; this is not the purpose of either $merge or $patch; their intent is to apply transformations to an existing schema; nothing says that the patch to apply is a JSON Schema at all (in fact, in the $patch case they cannot be; a JSON Patch is an array).
If you have two schemas to obey, you should validate against them independently!

Even single-inheritance cases suffer this problem. I've drawn the following examples from your proposal, modifying them slightly:

{ "merge":
    { "schema":
        { "type": "object"
        , "properties":
            { "p": { "type": "integer" }
            }
        , "required": [ "p" ]
        }
    , "with":
        { "properties":
            { "q": { "type": "array" }
            }
        , "required": [ "q" ]
        }
    }
}

The actually result:

{ "type": "object"
, "properties":
    { "p": { "type": "integer" }
    , "q": { "type": "array" }
    }
, "required": [ "q" ]
} 

The expected results:

{ "type": "object"
, "properties":
    { "p": { "type": "integer" }
    , "q": { "type": "array" }
    }
, "required": [ "p" "q" ]
} 

But like I said, json-patch is able to attain the correct behaviour:

{ "op": "replace", "path": "/required", "value": [ <contains merged requires, created by json-schema> ] },

And like you said, this is no longer semantically pure.

Regarding $extendref

Well, what you propose is basically schema merging, not plain JSON merging.

Not at all. The above result occurs because json-merge-patch expresses semantic-agnostic (or context-agnostic) behaviour. Because json-merge-patch is unrelated json-schema, to maintain purity of operation, the json-merge-patch behaviour cannot be modified. This is a good thing.

The problem is the assumption that working alternative (ie, produce the expected result) must therefore be semantically or context aware. This is not true. For example, my proposal expresses a freer type of json merging. It is also context-agnostic like json-merge-patch, but because it provides greater control over the location of the splice points, it allows the user to specify semantically-correct behaviour. Examples:

Merge anything, not just properties or schemas:

{ "$schema": "http://json-schema.org/draft-04/schema#"
, "title": "Merge Anything"
, "definitions":
    { "example": [ "this", "is", "just", "a", "list", "no", "json-schema", "specific" "context" ]
    }
, "type": "object"
, "properties":
    { "prop1":
        { "enum":
            [ "one"
            , "two"
            , { "extendref": "#/definitions/example" }
            , "final"
            ]
        }
    }
, "required": "prop1"
}

How the merge can preserve semantic meaning in an agnostic fashion, without any schema duplication:

{ "$schema": "http://json-schema.org/draft-04/schema#"
, "definitions":
    { "type_base_properties":
        { "bp1": { ... }
        , "bp2": { ... }
        , "bp3": { ... }
        }
    , "type_base_required":
        { "required": [ "bp1", "bp2", "bp3" ]
        }
    , "type_child1_properties":
        { "c1p1": { ... }
        , "c1p2": { ... }
        }
    , "type_child1_required":
        { "required": [ "c1p1", "c1p2" ]
        }
    , "type_child2_properties":
        { "c2p1": { ... }
        }
    , "type_child2_required":
        { "required": [ "c1p1", "c1p2" ]
        }
    }
, "type": "object"
, "oneOf":
    [   { "title": "Child Type #1"
        , "description": "Merge Anywhere, Agnostically, at your local Type #1"
        , "properties":
            { "props_base":  { "$extendref": "#/definitions/type_base_properties" }
            , "props_child": { "$extendref": "#/definitions/type_child1_properties" }
            }
        , "required":
            [ { "$extendref": "#/definitions/type_base_required" }
            , { "$extendref": "#/definitions/type_child1_required" }
            ]
        }
    ,   { "title": "Child Type #2"
        , "description": "Merge Anywhere, Agnostically, at your local Type #2"
        , "properties":
            { "props_base":  { "$extendref": "#/definitions/type_base_properties" }
            , "props_child": { "$extendref": "#/definitions/type_child2_properties" }
            }
        , "required":
            [ { "$extendref": "#/definitions/type_base_required" }
            , { "$extendref": "#/definitions/type_child2_required" }
            ]
        }
    ]
}

Aren't these neat? Using json-path rather than json-pointer in $extendref grants even more flexibility. Unfortunately, json-path does not allow extracting keys from objects, so there are no means of automatically marking all properties as required.

Generalizations

Statements like these I believe are wrong:

I believe that any keyword which adds semantic meaning to JSON Schema, which is what your proposal does, is a mistake.

All keywords have semantic meaning. They are required to:

The following strings, when present in the root of any schema (including subschemas), all have semantic meaning pertaining soley to the structure of the schema.

 "$schema", "title", "description", "type", "properties", "required" ...

A schema can't exist without these keywords granting structure to the schema.

Then there are the following keywords which also provide semantic meaning completely separate from json-schema, with no overlap:

 "$ref" "$merge" "$patch"

Then there are keywords which blur the line between schema and json data:

 "$data"

Finally there is json data. It requires an interpreter; by itself it has no semantic meaning.

So you see, all keywords have semantic meaning. I assume you object to keywords that correlate (blur boundaries between) areas of disparate functionality:

  • "B.A.T.M.A.N.": mixing OSI layer 3 with OSI layer 2 (random example for perspective)
  • "$data": mixing JSON data and JSON Schema

However I believe my proposal concerns itself purely with merging JSON data. I would categorize "$extendref" with "$ref "$merge" "$patch". It has no concept of json-schema.

That said, it would have to be part of the json-schema specification. Perhaps it can be split out later.

IsValid(data, {"allOf": [A, B]}) would no longer be the same as isValid(data, A) && isValid(data, B)
-- (@geraintluff)

This is trying to define semantics of json-schema in a way that don't make sense. I believe it is logically intuitive to expect isValid(data, {"allOf": [A,B]}) != (isValid(data, A) && isValid(data, B)), and that schema combiners should be modelled after multiple-inheritance of OOP. This statement seems to be confusing the issue of isinstance with issubclass.

@orbisvicis
Copy link
Author

This is probably a good time to define/decide what is mergeable via $extendref.

Container considerations:
  1. implicit type coercion:
    • dict -> list: extract keys only (json-path can extract values already)
  2. type enforcement
    • dict -> dict
    • list -> list
Primitive types consideration:
  1. It's probably best to disallow any type conversions here.

@fge
Copy link

fge commented Jan 22, 2015

Sorry but I still don't agree with this alternative. It attaches semantic meaning to JSON Schema, which it is not meant to have in the first place!

@orbisvicis
Copy link
Author

I don't see how you can say that. I actually misunderstood your $merge/$patch proposal. I assumed you were introducing a top-level schema keyword and not $merge/$patch keywords that are valid at any level of the json data (because neither proposal posted to google groups mentioned $merge/$patch). It finally clicked though as I was rereading your posts here, and now I can't get over how similar they are.

The only differences

  • $merge/$patch are specified at the top-level of the structure to patch. $extendref is specified within the structure to patch, ie $extendref inverts $merge/$patch.
  • $extendref supports merging arrays, $merge/$patch doesn't. This is a conscious decision; the merge-patch authors could have easily chosen to support array merging.
  • $extendref preserves order when merging arrays
  • $extendref only supports references, $merge/$patch supports inline json.
  • I find $extendref clearer and more concise (personal preference)

That said, I proposed $extendref in response to some objections made by @geraintluff: at Fishing for Features: v5 round-up and Proposals for "merge", "patch" and "strictProperties":

  • "... extra pre-processing step ..."
  • "... resuting JSON Schema produced by the merge has no definite URI ..."

But now I just found a whole new set of objections - Inheritance in JSON Schema - so I would really like @geraintluff to weigh in on $extendref.

@handrews
Copy link

@orbisvicis :

I have incorporated ideas from this issue into a larger thread on various additionalProperties and additionalProperties-caused issues which you may find here: https://groups.google.com/forum/#!topic/json-schema/dDkozAUTxRk

Work on this project has moved to https://github.com/json-schema-org/json-schema-spec , so could you please close this?

Issues will eventually be filed in the new repo from that google groups thread, but if there is something specific you want to continue to discuss please feel free to file in the new repo yourself.

@Julian Julian closed this as completed Dec 29, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants