JSON-LD provides no means of disambiguating CURIEs and IRIs #429

Closed
Josh-Tilles opened this issue Oct 6, 2016 · 3 comments

@Josh-Tilles

Josh-Tilles commented Oct 6, 2016

I previously created a GitHub issue and mailing list post on this topic, but walked away under the impression that there was no official solution. However, I recently stumbled across section 2.2 of the official CURIE documentation, which anticipated this exact issue! I’ll reproduce the relevant content here for convenience:

CURIEs and SafeCURIEs map to IRIs, but neither a CURIE nor a Safe_CURIE is an IRI or URI. Accordingly, CURIEs and Safe_CURIEs MUST NOT be used as values for attributes or other content that are specified to contain only URIs, IRIs, URI-references, IRI-references, etc. Specifications for particular attribute values or other content MAY be written to allow either CURIEs or IRIs (or URIs, etc.). The specifications for such languages MUST provide rules for disambiguation in situations where the same string could be interpreted as either a CURIE or an IRI. One way to do this is to require that all CURIEs be expressed as Safe_CURIEs, implying that all unbracketed strings are to be interpreted directly as IRIs.

For JSON-LD to adopt SafeCURIEs across the board would probably be a significant breaking change—something that I don’t want to treat lightly. However, given that work on JSON-LD is starting up again, I think it’s worth beginning a discussion.
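To make the quoted rule concrete, here is a minimal sketch of what "all CURIEs must be SafeCURIEs" would mean for a processor. It is not part of any JSON-LD specification or library: the function name `interpret`, the prefix map, and the `http://example.org/tags#` namespace are all hypothetical and chosen only for illustration.

```python
def interpret(value: str, prefixes: dict[str, str]) -> str:
    """Resolve a value under a hypothetical SafeCURIE-only rule.

    Bracketed values such as "[tag:search]" are treated as CURIEs;
    every unbracketed value is taken verbatim as an IRI.
    """
    if value.startswith("[") and value.endswith("]"):
        prefix, sep, reference = value[1:-1].partition(":")
        if not sep or prefix not in prefixes:
            raise ValueError(f"undefined prefix in SafeCURIE: {value!r}")
        return prefixes[prefix] + reference
    # Unbracketed: never expanded, even if it starts with a known prefix.
    return value


# Hypothetical prefix map; "tag" collides with the "tag:" URI scheme.
prefixes = {"tag": "http://example.org/tags#"}

print(interpret("[tag:search]", prefixes))
print(interpret("tag:search.twitter.com,2005:593895901623496704", prefixes))
```

Under this rule the second value is left untouched, because only the bracketed form opts in to prefix expansion.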

 "`@id`s are vulnerable to unwelcome expansion"

 "How to mitigate accidental/unwelcome IRI expansion?"

 "Ambiguities Between CURIEs and URIs"

 "Reactivating the CG to work on updated versions of the specs"
@dlongley
Member

dlongley commented Oct 6, 2016

Not trying to be inflammatory here -- but my view is that allowing CURIEs in JSON-LD data was a mistake in version 1.0. I think that their use within @context is probably fine, but we should have limited the data to terms only. This would have fit better with idiomatic JSON and would have encouraged people to publish more JSONic looking data. That JSON-LD data has been published using CURIEs has led some Web developers to think that all JSON-LD must look that way and, given that it's a strong deviation from typical usage, it has caused disinterest or even hostility towards the syntax.

Furthermore, limiting the data syntax to terms would have eliminated a number of ugly performance issues and unexpected compaction behavior. I'm loath to introduce anything else related to CURIEs to the data portion of a JSON-LD document ... in fact, I'd perhaps lobby to deprecate their use there -- or, at the very least, discourage it strongly.

Note that we also have avoided "microsyntaxes" (which is what would be used with a 'SafeCURIE') and that, in my view, has been demonstrated to be the right decision.

@Josh-Tilles
Author

@dlongley upon rereading the end of what I wrote, I realize I might not have been clear. I was not intending to begin the discussion already advocating one particular solution (namely, SafeCURIEs); I definitely think we’d be better off fleshing out a problem statement of sorts first.

@gkellogg
Member

gkellogg commented Oct 6, 2016

Note that JSON-LD doesn't precisely use CURIEs (certainly not SafeCURIEs), at least as defined in RDFa. It's probably closer to a Turtle PName, with the difference being that the software must disambiguate between a Compact IRI and a relative or absolute IRI, but the IRI expansion algorithm makes this unambiguous, as a Compact IRI will have the first component defined as a term.
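As a rough illustration of that disambiguation, the sketch below mimics only the colon-handling part of IRI expansion. It is a simplification, not the full algorithm: keywords, exact term matches, `@vocab`, and `@base` are all omitted, and the context is flattened to a plain prefix-to-IRI map. The function name `expand_iri` and the `http://example.org/tag#` mapping are assumptions made for the example.

```python
def expand_iri(value: str, context: dict[str, str]) -> str:
    """Simplified sketch of how a value containing ":" is expanded.

    `context` maps term names to IRIs (a flattened stand-in for the
    active context's term definitions).
    """
    if ":" in value:
        prefix, suffix = value.split(":", 1)
        # Blank node identifiers and "scheme://..." IRIs are never compacted.
        if prefix == "_" or suffix.startswith("//"):
            return value
        # A defined prefix makes this a Compact IRI.
        if prefix in context:
            return context[prefix] + suffix
    # Otherwise the value is left as it is (assumed absolute here).
    return value


# Hypothetical context in which "tag" is defined as a term.
context = {"tag": "http://example.org/tag#"}

# An intended tag: URI is expanded as a Compact IRI instead.
print(expand_iri("tag:search.twitter.com,2005:593895901623496704", context))
# A value with an undefined prefix is left alone.
print(expand_iri("id:twitter.com:2993982541", context))
```

The rule is deterministic, so there is no ambiguity in the formal sense, but the first call shows how a term that shares its name with a URI scheme produces an expansion the author did not intend.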

I do think that Compact IRIs have their place, but mostly when the JSON-LD is closely related to another RDF format, such as Turtle. In that case, there may be a number of properties for which an explicit term definition doesn't make sense. If we abandoned the Compact IRI capability, the only alternative would be an Absolute IRI (or a string which is treated as being relative to @vocab or @base). For an RDF encoding, this would be a disadvantage.

What I do think is a problem is that a term can be defined which looks like a Compact IRI (or even an absolute IRI, for that matter); this leads to some unfortunate compaction steps. Perhaps we could recommend that terms SHOULD match the URI isegment-nz-nc production:

   isegment-nz-nc = 1*( iunreserved / pct-encoded / sub-delims
                        / "@" )
                  ; non-zero-length segment without any colon ":"
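A validator for that recommendation could look roughly like the following. This is a sketch under assumptions: the helper name `is_safe_term` is hypothetical, and the `iunreserved` class is approximated with the ASCII unreserved characters plus the Basic Multilingual Plane `ucschar` ranges only (supplementary-plane ranges are left out for brevity).

```python
import re

# Approximation of isegment-nz-nc: one or more of
# iunreserved / pct-encoded / sub-delims / "@", with no colon allowed.
_ISEGMENT_NZ_NC = re.compile(
    r"(?:"
    r"[A-Za-z0-9\-._~\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]"  # iunreserved (BMP only)
    r"|%[0-9A-Fa-f]{2}"                                          # pct-encoded
    r"|[!$&'()*+,;=]"                                            # sub-delims
    r"|@"
    r")+"
)


def is_safe_term(term: str) -> bool:
    """Return True if `term` cannot be mistaken for a Compact or absolute IRI."""
    return _ISEGMENT_NZ_NC.fullmatch(term) is not None


for candidate in ["tag", "displayName", "tag:foo", "http://example.org/x", ""]:
    print(repr(candidate), is_safe_term(candidate))
```

Because the production excludes ":" and "/", any term that passes this check cannot look like a Compact IRI or an absolute IRI, which is the property the recommendation is after.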

The issue you cite does pose a problem (but not an ambiguity), as tag is used both as a scheme and as a term.

Note that there's a proposal (#426) to create default term content, which could include an @context which applies specifically to values of a term. As written, I don't think this applies to string values, which might be interpreted as IRIs, but it could pretty easily do that. So, for example, you might have something like the following:

{
  "@context": [
    "http://www.w3.org/ns/activitystreams",
    {
      "id": {"@id": "@id", "@content": {"@context": {"tag": null}}}
    }
  ],
  "id": "tag:search.twitter.com,2005:593895901623496704",
  "@type": "Create",
  "url": "http://twitter.com/KidCodo/statuses/347769243409977344",
  "actor": {
    "@context": {"id": null},
    "id": "id:twitter.com:2993982541",
    "@type": "Person",
    "displayName": "Kid Codo",
    "url": "http://www.twitter.com/KidCodo",
    "image": "https://si0.twimg.com/profile_images/3664410292/1d75c213a572873bf6797c5591475da5_normal.jpeg"
  }
}

This would have the effect of undefining the term tag just for values of id, which is also expanded to @id.

gkellogg added the 1.1 label Oct 6, 2016
gkellogg added this to the JSON-LD.next milestone Oct 6, 2016
gkellogg added syntax and removed 1.1 labels Oct 6, 2016
gkellogg added a commit to gkellogg/json-ld.org that referenced this issue May 30, 2017
gkellogg added a commit that referenced this issue May 30, 2017
…th a URI scheme or a Compact IRI.

Fixes #429.

(cherry picked from commit 1069c14)