Schema vs @context #474


Closed
David-Chadwick opened this issue Mar 26, 2019 · 18 comments
Labels
clarification Non-normative clarifications of spec text pending close Close if no objection within 7 days

Comments

@David-Chadwick
Contributor

We now have two ways of saying which schema defines the properties in a VC and VP: the credentialSchema property and the @context property. It is not clear to me which should be used when. I think we need to add some clarification text to the CR to either give an indication of which property should be used to define which schema items in a VC, or to say there are no rules and it does not matter which is used, and the verifier must check both properties in order to collect the full set of schema definitions.
I note that @contexts can be nested in JSON-LD. Is this also true for credentialSchema properties? Or is this one of their differences?

@brentzundel
Member

As I understand it, they are already quite different as they're defined in the spec.

The @context property defines a common vocabulary. For example, it allows a VC to carry the short string VerifiableCredential as a value, which a processor can map to the full URL https://www.w3.org/2018/credentials#VerifiableCredential
Really, a context just defines a dictionary of name-value pairs.
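
To make the "dictionary of name-value pairs" idea concrete, here is a minimal sketch of term expansion. It is an illustration only and not part of the original comment: the `CONTEXT` dictionary and the `expand_term` helper are hypothetical, and real JSON-LD contexts support nesting, prefixes, and typed terms that this sketch ignores.

```python
# Hypothetical, heavily simplified context: a flat mapping from short terms
# to full IRIs. Real JSON-LD contexts are richer than this.
CONTEXT = {
    "VerifiableCredential": "https://www.w3.org/2018/credentials#VerifiableCredential",
}


def expand_term(term: str, context: dict) -> str:
    """Return the IRI a short term maps to, or the term unchanged if unmapped."""
    return context.get(term, term)


expanded = expand_term("VerifiableCredential", CONTEXT)
```

A term that is not in the mapping is returned unchanged here; an actual JSON-LD processor has more involved rules for unmapped terms.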

The credentialSchema property defines a data schema, i.e. a schema that defines the structure of the data in the credential. The data schema would define the property names and acceptable value types one might expect to find in the credentialSubject section of the VC. For example, if the VC should contain a hireDate property whose value must be an RFC 3339 timestamp, the credentialSchema is where that would be specified.
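
As a rough sketch of what such a structural check could look like, the snippet below validates a credentialSubject against a tiny hand-written schema. Everything here is an assumption made for illustration: `SUBJECT_SCHEMA`, `is_rfc3339`, and `conforms` are hypothetical names, and the schema format is invented rather than taken from the spec. The regular expression only checks the general shape of an RFC 3339 timestamp, not calendar validity.

```python
import re

# Shape-only check for an RFC 3339 timestamp such as 2019-03-26T12:00:00Z.
# It does not reject impossible dates like month 13.
RFC3339_SHAPE = re.compile(
    r"^\d{4}-\d{2}-\d{2}[Tt]\d{2}:\d{2}:\d{2}(\.\d+)?([Zz]|[+-]\d{2}:\d{2})$"
)


def is_rfc3339(value) -> bool:
    """True if value is a string shaped like an RFC 3339 timestamp."""
    return isinstance(value, str) and RFC3339_SHAPE.match(value) is not None


# Hypothetical data schema: required property name -> value check.
SUBJECT_SCHEMA = {"hireDate": is_rfc3339}


def conforms(subject: dict, schema: dict) -> bool:
    """True if every property named in the schema is present and passes its check."""
    return all(name in subject and check(subject[name]) for name, check in schema.items())
```

A verifier could call `conforms(credential["credentialSubject"], SUBJECT_SCHEMA)` before acting on the data; a vocabulary mapping alone would not catch a malformed hireDate.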

I hope that clarifies things.

@burnburn burnburn added this to the CR-Exit milestone Mar 27, 2019
@burnburn burnburn added the clarification Non-normative clarifications of spec text label Mar 27, 2019
@David-Chadwick
Contributor Author

@brentzundel The data model lived without the credentialSchema property until ZKP was added. The @context contained all the schema information that was necessary for extensibility and understanding the contents of a VC.
So we have a disconnect here that needs clarification.

@brentzundel
Member

I think you may be confusing @context with type.

@context defines a common vocabulary for those who are willing to do some JSON-LD processing.

In the spec, type has this statement:

Software systems that process the kinds of objects specified in this document use type information to determine whether or not a provided credential or presentation is appropriate.

I agree that type and credentialSchema are very similar. The difference between them (as far as I can tell) is that type MUST be present and MUST be a URL, but that URL doesn't actually need to point to anything useful, whereas credentialSchema MUST be one or more data schemas that give verifiers enough information to determine whether the provided data conforms to the provided schema. Each credentialSchema MUST specify its type (for example, JsonSchemaValidator2018) and an id property that MUST be a URI identifying the schema file.
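
The shape described above can be sketched as follows. This is an illustrative assumption rather than spec text: the credential contents, the example.org URL, the `ExampleEmployeeCredential` type name, and the helper functions are all hypothetical. Only the id and type requirements on each credentialSchema entry come from the comment.

```python
# Hypothetical credential fragment; the URL and type names are placeholders.
credential = {
    "type": ["VerifiableCredential", "ExampleEmployeeCredential"],
    "credentialSchema": {
        "id": "https://example.org/schemas/employee.json",
        "type": "JsonSchemaValidator2018",
    },
}


def schema_entries(cred: dict) -> list:
    """Return credentialSchema as a list, whether it holds one schema or several."""
    value = cred.get("credentialSchema", [])
    return value if isinstance(value, list) else [value]


def schemas_well_formed(cred: dict) -> bool:
    """True if every credentialSchema entry carries a string id and a string type."""
    return all(
        isinstance(entry, dict)
        and isinstance(entry.get("id"), str)
        and isinstance(entry.get("type"), str)
        for entry in schema_entries(cred)
    )
```

Note that `schemas_well_formed` returns True when no credentialSchema is present at all; it only checks the entries that exist.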

So, tl;dr, the credentialSchema and type properties are very similar in what they may contain, but the type property doesn't actually need to contain anything, while the credentialSchema does.

I still don't see any similarity between credentialSchema and @context.

@David-Chadwick
Contributor Author

We certainly have quite a significant difference in understanding, so it is clear that the standard needs clarification if two people who are intimately involved in producing it differ this much (it's a pity for those who are reading the standard for the first time :-).
Type says what type of VC this is, but this URL may not provide a definition of this type (as you admit, it need not contain anything useful). At a minimum one can consider the type to simply be a globally unique string, so that two different user groups won't define the same type to mean different things.
E.g. consider a type that says (as a URL) this is a university degree VC; this does not say what properties a university degree VC must or may contain. That is found elsewhere, in either the @context or the credentialSchema properties (or both). Both of these define the set of properties that the VC may contain. But at least one of them must in addition say that a university degree VC must or may contain a qualification property and a classification property (or whatever the university degree type is defined to mean). Furthermore the @context must define all the basic properties that any VC may contain (such as terms of use, expiration date, ID, type etc.), along with their aliases, since @context has been in the specification from the start (way before credentialSchema was invented).

According to the specification, credentialSchema may define two different types of schemas:
Data verification schemas, and data encoding schemas. The former directly replicates the @context property in my opinion.
So my questions to you are
i) Where is the type of the VC defined (e.g. university degree VC): in the @context, in credentialSchema, or in both?
ii) What is the difference between @context and credentialSchema, apart from the latter specifying data encodings and the former not (AFAIK)?

@brentzundel
Member

i) where is the type of the VC defined (e.g. university degree VC), in the @context or credentialSchema or both.

The type of the VC is defined in the type property.

ii) what is the difference between @context and credentialSchema apart from the latter specifies data encodings and the former does not (AFAIK).

@context does not define properties. All @context defines is a vocabulary; it is a list of strings that can be used in place of longer strings. @msporny or @dlongley or @dmitrizagidulin please confirm this.

credentialSchema defines the properties and value types that should be expected in the credential. This allows a credential to be automatically processed to see if the properties and values it contains match the properties and value types specified in the credentialSchema.

@David-Chadwick
Contributor Author

@brentzundel Sorry, let me ask my question i) again, as I realise it was ambiguous.

i) where is the type defined?
The type property states the type of the VC, but it does not define the type itself, i.e. where is it stated that a VC of type X must contain properties A and B and may contain property C?

@dlongley
Contributor

@David-Chadwick,

where is the type defined?
the type property defines the type of the VC, but it does not define the type itself. i.e. where is it stated that a VC of type X must contain properties A and B and may contain property C.

There are a few ways of doing this. The type itself is a URL (when resolved via the mappings in @context). To define the type itself you can:

  1. Specify it in some human readable spec and require people to read it and add the rules to their application. (This should probably be considered a legacy approach.)
  2. Return machine readable information from the type URL that directly or indirectly (through "follow your nose" linked data) specifies the information.
  3. Specify a credentialSchema property with a vocabulary document and/or schema that has the information.

Note that the latter two mechanisms can include content integrity proofs via other technologies like hashlinks or DLT and can be implemented with caching/local copies as use cases demand.
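
The two machine-readable options (2 and 3) can be sketched as a lookup that collects the places a verifier might consult. This is an illustration under stated assumptions and not part of the original comment: the `CONTEXT` mapping, the `CREDENTIAL` contents, the example.org URLs, and the `definition_sources` helper are hypothetical, and no network access or preference order between the sources is implied.

```python
# Hypothetical context and credential, for illustration only.
CONTEXT = {
    "UniversityDegreeCredential": "https://example.org/examples#UniversityDegreeCredential",
}

CREDENTIAL = {
    "type": ["UniversityDegreeCredential"],
    "credentialSchema": {
        "id": "https://example.org/schemas/degree.json",
        "type": "JsonSchemaValidator2018",
    },
}


def as_list(value) -> list:
    """Normalize a missing, single, or list value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def definition_sources(credential: dict, context: dict) -> dict:
    """Collect the URLs where a machine-readable type definition might be found."""
    # Option 2: the type URL itself, after mapping short terms through the context.
    type_urls = [context.get(t, t) for t in as_list(credential.get("type"))]
    # Option 3: schema documents linked directly via credentialSchema.
    schema_ids = [
        s["id"]
        for s in as_list(credential.get("credentialSchema"))
        if isinstance(s, dict) and "id" in s
    ]
    return {"type_urls": type_urls, "schema_ids": schema_ids}
```

Option 1 (a human readable spec) has no machine-readable counterpart, so it does not appear in the result.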

@brentzundel
Member

Thank you @dlongley for saying better what I was trying to.

@David-Chadwick
Contributor Author

@dlongley Thank you. Can you also address the point I raised above: "the @context must define all the basic properties that any VC may contain (such as terms of use, expiration date, ID, type etc.), along with their aliases, since @context has been in the specification from the start (way before credentialSchema was invented)."
Or were we mistaken in originally thinking that @context was sufficient?

@dlongley
Contributor

@David-Chadwick,

The @context provides mappings to all of the properties defined in the spec (we also have a companion vocabulary document in JSON-LD ... so we cover options 1 and 2 here for defining terms). This is sufficient to allow people and machines to find definitions, but we also provided the credentialSchema property as a third way. So @context was "sufficient" as you say, but other mechanisms were requested that allowed direct links from the VC itself to ZKP-style schemas or other json schemas, etc.

@David-Chadwick
Contributor Author

@dlongley Thank you yet again!
So can I summarise that the benefit of the credentialSchema property over the @context property (ignoring the data encoding schema, which @context does not cover) is that it provides direct links to the schema definitions, rather than possibly many levels of indirection via the @context property?

If this is the case then it would be beneficial to add an explanatory note to the standard to explain the difference (which was the original purpose of me raising this issue).

@dlongley
Contributor

dlongley commented Mar 30, 2019

@David-Chadwick,

I think another benefit is that it provides an opportunity to annotate type definitions or "lock them" to specific versions of the vocabulary. Dereferencing type URLs, for instance, may give you a "live version" of the vocabulary. Depending on how stable and popular such a vocabulary is, authors of VCs may prefer to include a "static" version of their vocabulary via credentialSchema that is locked to some content integrity hash. I think this is what should be highlighted as the main reason for using it, IMO.
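
A minimal sketch of the "locked to a content integrity hash" idea follows. It assumes a plain hex-encoded SHA-256 digest for simplicity; hashlinks and other content integrity schemes use their own encodings, which are not modeled here. The schema bytes and the `matches_pinned_hash` helper are hypothetical.

```python
import hashlib


def matches_pinned_hash(schema_bytes: bytes, expected_sha256_hex: str) -> bool:
    """True if the retrieved schema bytes hash to the digest pinned by the issuer."""
    return hashlib.sha256(schema_bytes).hexdigest() == expected_sha256_hex


# Hypothetical schema as it existed when the credential was issued.
schema_at_issuance = b'{"properties": {"hireDate": {"type": "string"}}}'
pinned_digest = hashlib.sha256(schema_at_issuance).hexdigest()

# Hypothetical "live" version that has since changed at the same URL.
schema_live = b'{"properties": {"hireDate": {"type": "number"}}}'
```

With the digest pinned alongside the schema reference, a verifier can detect that the "live" document no longer matches what the issuer intended and fall back to a cached copy that does.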

@brentzundel
Member

@David-Chadwick
I (and others) have tried to clarify this difference between @context and credentialSchema with comments on this issue and on PR #533, which adds a note to the Data Schemas section to clarify the intended use of the credentialSchema property.

If that PR does not address your issue, please recommend some concrete changes to the data model that would address your issue so that this can be a more productive conversation.

@David-Chadwick
Contributor Author

My objective is to make the CR understandable to implementors and to remove ambiguities and lack of clarity. So this is what we need in my opinion:

  1. A clear definition of what the @context property is, what its values are, what its purpose is and how it must be interpreted. (Surely it cannot be ignored).
    This definition has to apply to both JSON processors and JSON-LD processors, which may interpret it differently. So if it has different purposes for the two processes, then the differences need to be spelt out clearly and succinctly with no vague hand waving.
  2. An explanation of what the implications of the @context are for other properties in a VC, such as the type property (e.g. it allows aliases of URLs). This needs to be made crystal clear for both JSON and JSON-LD processors.
  3. If the @context points to shifting definitions, again this has to be made very clear in the spec.
  4. How @context aids extensibility and interworking.
  5. What the deficiencies in @context are (if there weren't any we would not need credentialSchema for specifying the ontology)

Then for the credentialSchema property

  1. what it provides in addition to the @context property
  2. what are the implications for implementations that do not have a credentialSchema property. This needs to cover those that use JSON processors as well as JSON-LD processors.

Finally, we should provide examples of the more complex concepts, e.g. if we say that credentialSchema restricts or makes immutable the definitions in @context, then provide an example of this. (I don't know how you make a definition immutable. A definition that changes is not really a definition, is it?)

@agropper

agropper commented Apr 9, 2019 via email

@msporny
Member

msporny commented Apr 10, 2019

  1. A clear definition of what the @context property is, what its values are, what its purpose is and how it must be interpreted. (Surely it cannot be ignored).
    This definition has to apply to both JSON processors and JSON-LD processors, which may interpret it differently. So if it has different purposes for the two processes, then the differences need to be spelt out clearly and succinctly with no vague hand waving.

I believe PR #535 does this: https://github.com/w3c/vc-data-model/pull/535/files

  2. An explanation of what the implications of the @context are for other properties in a VC, such as the type property (e.g. it allows aliases of URLs). This needs to be made crystal clear for both JSON and JSON-LD processors.

Fine with me. Which section should we put this clarification in?

  3. If the @context points to shifting definitions, again this has to be made very clear in the spec.

What is your definition of "shifting definitions"? My concern here is that we're going to duplicate large sections of the JSON-LD specification. Would pointing to the JSON-LD specification help? Every time we've done that in the past, we've gotten a raft of "I shouldn't have to read the JSON-LD specification" comments, and we have duplicated text to avoid having those sorts of issues raised against the specification.

  4. How @context aids extensibility and interworking.

Is this not what this section does? https://w3c.github.io/vc-data-model/#contexts

Would pointing to this section of the JSON-LD specification work for you?

https://www.w3.org/TR/json-ld/#the-context

If doing that doesn't accomplish what you want, what concrete specification text could we add to address your concerns?

  5. What the deficiencies in @context are (if there weren't any we would not need credentialSchema for specifying the ontology)

I'm fine with doing this. Please suggest some concrete specification text.

Then for the credentialSchema property

  1. what it provides in addition to the @context property
  2. what are the implications for implementations that do not have a credentialSchema property. This needs to cover those that use JSON processors as well as JSON-LD processors.

This is covered in https://w3c.github.io/vc-data-model/#content-integrity-protection -- would linking to that section address your concerns? If not, what concrete text could we add and to what section to address your concerns?

@msporny msporny removed the pr exists label Apr 16, 2019
@stonematt
Contributor

From VCWG call on April 30, 2019:
RESOLUTION: Issue #474 should be addressed by making a set of non-normative changes to the specification that clarifies the use of @context, the differences between it and the credentialSchema property, with appropriate references to more detailed explanations in the specification and other specifications. Issue #474 should be closed after the PR is merged.

@stonematt stonematt added the ready for PR This issue is ready for a Pull Request to be created to resolve it label Apr 30, 2019
@msporny msporny removed the ready for PR This issue is ready for a Pull Request to be created to resolve it label May 7, 2019
@burnburn burnburn added the pending close Close if no objection within 7 days label May 7, 2019
@burnburn
Contributor

PR approved 7 days ago, merged 3 days ago. Closing.


7 participants