Yet Another JSON-LD the protocol spec to use? #52

Closed
tkanai opened this issue Jul 8, 2015 · 28 comments
Comments

@tkanai
Contributor

tkanai commented Jul 8, 2015

The protocol spec (4.1.2) says:

The JSON-LD serialization of the Container's description should use the Open Annotation's context, http://www.w3.org/ns/oa. (Additional Constraint)

Are there any strong reasons to recommend using the annotation context URI? Technically speaking, the JSON in example 3 would be equivalent to the JSON below, as far as both the JSON-LD and LDP specs are concerned.

{
  "@context": {"ldp": "http://www.w3.org/ns/ldp#",
  "rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  "iana" : "http://something"
  },
  "@id": "http://example.org/annotations/",
  "@type": "ldp:BasicContainer",
  "rdf:label": "A Container for Open Annotations",
  "iana:alternate": ["http://example.org/annotations2/", "http://example.org/moreAnnotations/"],
  "ldp:contains": ["anno1","anno2","anno3","anno4"] 
}

I'm afraid that this line just introduces unnecessary confusion. Is it possible to remove it?
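The equivalence claimed here can be illustrated with a small sketch. The `expand_iri` and `expand_keys` helpers are hypothetical and cover only a toy subset of JSON-LD expansion (term and prefix lookup on top-level keys), not the full algorithm. The term mappings in `with_terms` are an assumed stand-in for what a shared context might define, not the actual content of http://www.w3.org/ns/oa.

```python
LDP = "http://www.w3.org/ns/ldp#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand_iri(value, context):
    """Resolve a term or a prefix:suffix string to a full IRI (toy rules only)."""
    if value in context:                    # plain term, e.g. "contains"
        return expand_iri(context[value], context)
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in context:           # compact IRI, e.g. "ldp:contains"
        return context[prefix] + suffix
    return value                            # keyword, absolute IRI, or unknown


def expand_keys(doc):
    """Expand the top-level keys (and the @type value) of one JSON-LD object."""
    context = doc["@context"]
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        if key == "@type":
            value = expand_iri(value, context)
        expanded[expand_iri(key, context)] = value
    return expanded


# Keys written with explicit prefixes, as in the JSON above
# ("rdf:label" is kept exactly as written in the issue).
prefixed = {
    "@context": {"ldp": LDP, "rdf": RDF},
    "@id": "http://example.org/annotations/",
    "@type": "ldp:BasicContainer",
    "rdf:label": "A Container for Open Annotations",
    "ldp:contains": ["anno1", "anno2"],
}

# The same data with short terms; these term names are assumptions.
with_terms = {
    "@context": {
        "ldp": LDP,
        "rdf": RDF,
        "BasicContainer": "ldp:BasicContainer",
        "label": "rdf:label",
        "contains": "ldp:contains",
    },
    "@id": "http://example.org/annotations/",
    "@type": "BasicContainer",
    "label": "A Container for Open Annotations",
    "contains": ["anno1", "anno2"],
}

print(expand_keys(prefixed) == expand_keys(with_terms))
```

Under these toy rules both objects expand to the same IRIs, which is the sense in which the two serializations are equivalent for a linked-data consumer.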

@iherman
Member

iherman commented Jul 8, 2015

@tkanai I think having only one context file greatly reduces the complexity of using JSON-LD. Instead of forcing the user to list all those prefix declarations separately and explicitly (the kind of confusion that has plagued namespace usage), it hides them all in one place, so that most users would not need to care about those details. I personally consider that a major plus...

@azaroth42
Collaborator

And, in fact, the JSON you included would be completely compatible with using the context [once it includes iana and ldp] as the requirement for using particular keys comes from a profile, not the context itself.

The reason for requiring the particular shape of the JSON-LD is to enable regular JSON based clients to understand and process the content without requiring a JSON-LD, or worse full RDF, stack. At the same time the great advantage of JSON-LD is that it is RDF underneath the covers so people wanting to do linked data can also make use of it at the same time without multiple representations.

So I'm 👎 to making the serialization form unrestricted as this makes client development harder. See also #51 regarding constraining HTTP level features at the protocol level.
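A minimal sketch of the plain-JSON client this argument has in mind, using only the standard `json` module. The key name `contains` and the `list_contained` helper are assumptions for illustration; the thread does not spell out the exact keys that the profile requires.

```python
import json


def list_contained(payload):
    """Plain-JSON client: reads the agreed key directly, no context processing."""
    doc = json.loads(payload)
    return doc.get("contains", [])


# Serialization in the agreed shape (key name assumed for this sketch).
fixed_shape = json.dumps({
    "@context": "http://www.w3.org/ns/oa",
    "@id": "http://example.org/annotations/",
    "contains": ["anno1", "anno2"],
})

# Same information, but with a producer-chosen prefix on the key.
free_shape = json.dumps({
    "@context": {"ldp": "http://www.w3.org/ns/ldp#"},
    "@id": "http://example.org/annotations/",
    "ldp:contains": ["anno1", "anno2"],
})

print(list_contained(fixed_shape))  # the client finds the key it was written against
print(list_contained(free_shape))   # the client finds nothing, although the data is there
```

A JSON-LD client would treat both payloads alike after expansion; the plain client only works when the key names are fixed in advance.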

@dret
Member

dret commented Jul 9, 2015

so what if developers use JSON-LD libraries that provide no control over producing the specific "shape" required by the spec? i have too little knowledge of JSON-LD libraries to know about this. but i do know that this is a real problem in an identical area where there were specs that required certain forms of XML serialization, only to later discover that when using standard XML software (which more often than not does not provide full control over all details of serialization), for some developers it was simply impossible to satisfy this additional constraint on top of standard XML. like i said, i am not sure this is an issue here, but it may be something to keep in mind and something that has caused many real problems in earlier occurrences of the same "let's constrain the standard to make client development easier" approach. postel's law still is something to keep in mind for open systems.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

I don't think this is a problem for JSON-LD. JSON keys are unordered and JSON-LD specifies only a few standard "shapes": expanded, compacted and flattened.

@tkanai
Contributor Author

tkanai commented Jul 9, 2015

@azaroth42 You are right. We "can" use the annotation context URI and produce valid JSON-LD. It is not necessary to list all the URIs in JSON-LD payloads. On the other hand, as you acknowledged, the JSON-LD I wrote is also valid data from a grammatical viewpoint. The question is whether the JSON-LD I wrote is valid as an Annotation response or not. According to the spec, it sounds to me that the response is invalid because it does not give the annotation URI as its context.

The data was published by an LDP server while I was evaluating the protocol spec (I didn't include the iana stuff, though). Do I have to change the server so that it publishes only the annotation context URI? Will the updated server still pass the LDP tests?

As @dret suggested, I think using LDP properties, or other existing properties, as much as possible is a huge plus for accelerating the adoption of the Web Annotation Protocol, and I think that is why the Web Annotation Protocol is based on LDP. Am I wrong?

@azaroth42
Collaborator

The question is not "What happens if developers are using json-ld libraries", but the infinitely more common scenario of "What happens if developers are using json libraries". And in that 99% scenario, they would be unable to process the context to determine the semantics of the new keys in the object.

So, in my opinion, the benefit of not requiring an entire RDF stack (in the form of JSON-LD) while still being 100% compatible with it, outweighs any minor difficulty in applying supplied contexts and frames.

The alternative (which I am absolutely not against) is to completely embrace linked open data and make decisions entirely based on technical validity, rather than trying to cater to ease of development. In that mode, we could be much less constraining.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

In discussing plain JSON I think we should focus on production rather than consumption. If you're a plain JSON system you're probably not thinking too hard about interop. You're just publishing and consuming your own annotations. By having a default context and recommended keys we hope that JSON-LD clients will be able to consume these annotations.

However, since producing annotations with your own context and keys would constitute valid JSON-LD, we cannot expect plain JSON clients to always be able to consume them.

All of this just amounts to a recommendation to use a particular context and keys, but it doesn't force anyone in the linked data world to do so.

@azaroth42
Collaborator

@tilgovi I disagree... I think there will be many more consuming systems written than producing systems, and unless we're assuming that every web browser will have a JSON-LD stack built in, we would be deciding from the outset that there would not be browser-native annotation support for the foreseeable future. I think that would be a great shame, and hence a mistake to progress down that path.

So ... by having a recommended context and shape, we hope that JSON clients will be able to consume the annotations. A JSON-LD client would be able to consume them regardless of consistent context or not, because it would turn the annotations into an RDF graph and then process the graph.

For an example of what happens when you don't use the same context, compare the Open Annotation context annotations [1] with the IIIF context annotations [2]. Exactly the same model and graphs, but quite a different look and feel just from changing a few of the mappings. Without fixing the serialization, conforming clients would have to support both of these. My experience to date is that work is always put off and put off... I'm sure I don't need to point fingers here :) ... so making it more complex is just going to have it put off indefinitely.

[1] e.g. http://openannotation.org/spec/core/publishing.html#Serialization
[2] e.g. http://iiif.io/api/presentation/2.0/#other-content-resources

@dret
Member

dret commented Jul 9, 2015

in the end, the simple question is the processing model of the media type: if i am processing the media type, how do i get from a serialization to a parsed model i can safely work with and robustly code against? the processing model then also tells me how i have to or how i can serialize and safely assume that different implementations will end up with the same understanding of what i have serialized.
the hard part is to think through open scenarios, such as a producer with a weirdo JSON library that produces strange but legitimate JSON. will all consumers (both those using JSON and those using JSON-LD toolsets) understand it in the same way? if not, something is broken, and the fix then is to either rule out the strange JSON (but then you are constraining JSON and that's not so great), or to tweak the processing model so that consumers indeed will get the same understanding.
the tricky stuff usually happens across implementations, and even more so if they are built on different metamodels, one working in plain JSON only, and the other one by a team preferring RDF and using JSON-LD as their databinding. thinking through scenarios where these two interact, and both of them explore the limits of what they're allowed to do per spec, usually is a very educating exercise.
for the activity streams WG, we face the exact same problem: JSON-LD is used because some people like the fact that it's JSON, and another set of people like the fact that it's also RDF (but only if you get the tricky parts right that establish the RDF view such as using the right context). ideally, we should have defined everything in terms of pure JSON, so that people only dealing with JSON could read the spec and never even read about the RDF view. and then a separate spec could tell those interested in an RDF view of everything how to do this robustly on top of the JSON. we haven't managed to be that clean and clear, but at least logically speaking, this is the way to go.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

Soo... is there any reason to adopt a change suggested by this issue? I think not. I would actually support the idea brought up just now by Erik wherein the spec sticks to pure JSON with some additional text about how to augment it for JSON-LD / RDF world (essentially, slap the context on it). I would love it if we recommended that anyone producing JSON-LD stick to the recommended context to make interop easier.

Dropping our context from the examples seems like a bad idea unless we're dropping all context from the examples and trying to really push a particular set of keys.

@dret
Member

dret commented Jul 9, 2015

because i am OCD: is there even a well-defined model for what a "RDF view" of some JSON-LD is if there is an implicit context, but the JSON-LD inlines or references a context with conflicting definitions? i am really just curious here, and have asked this question for the JSON-LD experts: json-ld/json-ld.org#391

@tkanai
Contributor Author

tkanai commented Jul 10, 2015

I think I understand the situation, but I'm still wondering why the spec doesn't ask JSON clients to send a GET request with "application/json" as the Accept format, if JSON is that important.
Annotation servers that are fully customized for the Annotation protocol will be capable of accepting both "application/json" and "application/ld+json", thanks to the Annotation context URI. Some LDP-based server systems might need to add a JSON, or Annotation-JSON, serialization function, but they will still be capable of handling any other JSON-LD requests.
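The negotiation suggested here could look roughly like the sketch below. `pick_serialization` is a hypothetical helper, not part of any spec or library; it handles only exact media types with optional `q` values and ignores wildcards and malformed headers.

```python
def pick_serialization(accept_header,
                       supported=("application/ld+json", "application/json")):
    """Return the supported media type with the highest q-value, or None."""
    best, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type in supported and q > best_q:
            best, best_q = media_type, q
    return best


print(pick_serialization("application/json"))
print(pick_serialization("application/ld+json;q=0.5, application/json"))
print(pick_serialization("text/turtle"))
```

A server built this way could hand the fixed-shape JSON to clients asking for `application/json` and still answer `application/ld+json` requests with whatever JSON-LD it produces today.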

@iherman
Member

iherman commented Jul 10, 2015

@tilgovi said

Soo... is there any reason to adopt a change suggested by this issue? I think not. I would actually support the idea brought up just now by Erik wherein the spec sticks to pure JSON with some additional text about how to augment it for JSON-LD / RDF world (essentially, slap the context on it). I would love it if we recommended that anyone producing JSON-LD stick to the recommended context to make interop easier.

This could also require (which I think would be a good idea) that the protocol would also return a link header to the appropriate @context following the relevant section of the JSON-LD spec. It is not a major addition to the protocol, the extra load on Annotation Servers is minimal, but purely JSON clients would not have to actively disregard the @context property from the returned data (as they would have to do now).
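For reference, the JSON-LD 1.0 Recommendation defines the link relation `http://www.w3.org/ns/json-ld#context` for attaching a context to a response served as `application/json`. The sketch below shows what producing and reading such a header might involve. Both helpers are hypothetical, the parser is deliberately naive (it assumes no commas inside a link value), and using http://www.w3.org/ns/oa as the header target is an assumption taken from the context URI quoted at the top of this issue.

```python
import re

JSONLD_CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"

# Naive pattern: "<url>" followed by its parameters up to the next comma.
LINK_RE = re.compile(r"<([^>]*)>([^,]*)")


def context_link_header(context_url):
    """Server side: build the Link header value that points at the context."""
    return f'<{context_url}>; rel="{JSONLD_CONTEXT_REL}"; type="application/ld+json"'


def context_from_link(link_header):
    """JSON-LD client side: find the context URL in a Link header, or None."""
    for url, params in LINK_RE.findall(link_header):
        if f'rel="{JSONLD_CONTEXT_REL}"' in params:
            return url
    return None


header = context_link_header("http://www.w3.org/ns/oa")
print(header)
print(context_from_link(header))
```

The payload itself would then carry no `@context` key, and only clients that care about the RDF view would need to look at the header.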

@azaroth42
Collaborator

ideally, we should have defined everything in terms of pure JSON, so that people only dealing with JSON could read the spec and never even read about the RDF view. and then a separate spec could tell those interested in an RDF view of everything how to do this robustly on top of the JSON.

This would argue in favor of splitting Model and Serialization into two separate documents. Serialization could then focus exclusively on the JSON format, with reference to the model.

However it does not affect protocol, as we inherit the MUST from LDP of support for the turtle syntax, and thus RDF.

the extra load on Annotation Servers is minimal, but purely JSON clients would not have to actively disregard the @context property from the returned data

I don't follow the logic here. By adding a real implementation requirement to the server, we prevent the client from having to ignore something that it's clearly going to ignore anyway, and indeed required to ignore by the relevant specifications? 👎

@azaroth42
Collaborator

@dret asked:

because i am OCD: is there even a well-defined model for what a "RDF view" of some JSON-LD is if there is an implicit context, but the JSON-LD inlines or references a context with conflicting definitions?

Yes, any subsequent @context definitions override any previously encountered ones. As a JSON object cannot have the same key twice, there's no collisions where the order of keys would matter.

For example:

{
  "@context": {"label": "rdfs:label", "related": "dc:relation"},
  "label": "This label is rdfs:label",
  "related": {
    "@context": {"label": "dc:title"},
    "label": "This label is dc:title"
  }
}

And in the JSON-LD playground: http://tinyurl.com/pr4dtgf

An outstanding issue is that in the expansion/compaction routines, the context nodes are lost. Compaction can only take a single context statement which is applied at the top level, so the above structure is, unfortunately, not round-trip capable in the JSON-LD 1.0 API.

See: https://lists.w3.org/Archives/Public/public-linked-json/2014Jul/0030.html
and: json-ld/json-ld.org#356
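The override rule can be mimicked with a short sketch. `resolve_terms` is a hypothetical toy that only handles inline dictionary contexts and plain string values; it is not the JSON-LD context processing algorithm.

```python
def resolve_terms(node, inherited=None):
    """Return (mapping, value) pairs for string values, applying scoped contexts."""
    active = dict(inherited or {})
    active.update(node.get("@context", {}))  # later definitions override earlier ones
    resolved = []
    for key, value in node.items():
        if key == "@context":
            continue
        if isinstance(value, dict):
            # A nested object inherits the active context and may override it.
            resolved.extend(resolve_terms(value, active))
        else:
            resolved.append((active.get(key), value))
    return resolved


doc = {
    "@context": {"label": "rdfs:label", "related": "dc:relation"},
    "label": "This label is rdfs:label",
    "related": {
        "@context": {"label": "dc:title"},
        "label": "This label is dc:title",
    },
}

for mapping, value in resolve_terms(doc):
    print(mapping, "->", value)
```

The outer `label` keeps its `rdfs:label` mapping while the nested one picks up `dc:title`, matching the example above.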

@dret
Member

dret commented Jul 10, 2015

since i am OCD:

On 2015-07-10 14:42, Rob Sanderson wrote:

As a JSON object cannot have
the same key twice, there's no collisions where the order of keys would
matter.

oh yes it can: https://tools.ietf.org/html/rfc7159#section-4

that's why I-JSON exists: http://tools.ietf.org/html/rfc7493#section-2.3

@tilgovi
Contributor

tilgovi commented Jul 10, 2015

Truth, but that text also answers your question. What happens when it's ambiguous? Don't do that, it's unspecified and you'll totally break interop!

@azaroth42
Collaborator

Touché! 😸

From http://www.w3.org/TR/json-ld/:

JSON object
An object structure is represented as a pair of curly brackets surrounding zero or more key-value pairs. A key is a string. A single colon comes after each key, separating the key from the value. A single comma separates a value from a following key. In contrast to JSON, in JSON-LD the keys in an object must be unique.

So the valid JSON object:

{
  "@context": {"label": "rdfs:label"},
  "label" : "fish",
  "@context": {"label": "dc:title"},
  "label": "bat"
}

Is not valid JSON-LD. The playground asserts <> dc:title "bat", for what it's worth.

@dret
Member

dret commented Jul 10, 2015

that's exactly the example i played around with yesterday when i tried to better understand how edge cases work, and the playground's result confused me quite a bit. so thanks for pointing to the JSON-LD spec that explains this behavior. fwiw, i think the playground should throw an error on this one, because that's what the spec says, right?

@tilgovi
Contributor

tilgovi commented Jul 10, 2015

I think the playground is based on the javascript JSON-LD implementation.

The built-in JSON.parse silently discards duplicate keys (the last value wins), so I'm totally unsurprised that the playground doesn't notice. Detecting them would require doing the parsing in JavaScript, which would be a massive performance hit.
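The same behavior can be reproduced outside the browser. The sketch below uses Python rather than JavaScript: `json.loads` also keeps only the last value for a repeated key by default, and its `object_pairs_hook` parameter gives a parser the chance to reject duplicates instead. `reject_duplicates` is a hypothetical helper written for this illustration.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises when an object repeats a key."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError(f"duplicate keys: {dupes}")
    return dict(pairs)


text = (
    '{"@context": {"label": "rdfs:label"}, "label": "fish", '
    '"@context": {"label": "dc:title"}, "label": "bat"}'
)

# Default parsing: the earlier "@context" and "label" entries are dropped.
lenient = json.loads(text)
print(lenient)

# Strict parsing: the duplicate keys are reported instead of being hidden.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)
```

The lenient result is consistent with the playground asserting `dc:title "bat"`: by the time any JSON-LD processing starts, the first context and the first label are already gone.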

@iherman
Member

iherman commented Jul 11, 2015

On 10 Jul 2015, at 23:32, Rob Sanderson wrote:

ideally, we should have defined everything in terms of pure JSON, so that people only dealing with JSON could read the spec and never even read about the RDF view. and then a separate spec could tell those interested in an RDF view of everything how to do this robustly on top of the JSON.

This would argue in favor of splitting Model and Serialization into two separate documents. Serialization could then focus exclusively on the JSON format, with reference to the model.

However it does not affect protocol, as we inherit the MUST from LDP of support for the turtle syntax, and thus RDF.

the extra load on Annotation Servers is minimal, but purely JSON clients would not have to actively disregard the @context property from the returned data

I don't follow the logic here. By adding a real implementation requirement to the server, we prevent the client from having to ignore something that it's clearly going to ignore anyway, and indeed required to ignore by the relevant specifications?

I do not understand this last point. If the returned JSON payload does not include @context, it is easier on pure JSON based implementations, while JSON-LD implementations go on unchanged because they get the @context through the header. I do not think it is the same.

Ivan


@gobengo

gobengo commented Jul 11, 2015

I think framing is supposed to help with some of the concerns here, if I'm skimming correctly. It helps with the "How do I go from arbitrary, but valid, JSON-LD to a particular 'shape' of JSON?" question:
http://json-ld.org/spec/latest/json-ld-framing/

@azaroth42
Collaborator

Do not understand this last point. If the returned JSON payload does not include @context, it is easier on pure JSON based implementations, while JSON-LD implementations go on unchanged because they get the @context through the header. I do not think it is the same.

It involves more work on the server side to add the header, more work on the JSON-LD client side to retrieve the header, and doesn't save any work on the JSON client side, as it must just ignore the context key that it doesn't understand anyway. No one is going to write code looking for structure that they then can't process. There's no savings in transferred bytes, as the context is there in the HTTP headers, just much less accessible to systems that do want it.

The header option is designed to allow previous non-json-ld systems to assert their context without changing existing representations.

@tilgovi
Contributor

tilgovi commented Jul 13, 2015

Agreed. It's not likely to present much of a problem to most JSON clients if there's an extra key that they can ignore.

@azaroth42
Collaborator

Implementation feedback sought. I think the only way to close the issue is to act from a position of knowledge, informed by practical usage rather than theory.
[meta ... am creating an implementation feedback tag :)]

@azaroth42
Collaborator

Propose closing as won't fix.

@iherman
Member

iherman commented Sep 10, 2015

ok

@azaroth42
Collaborator

Closing. This is also in alignment with SocialWeb WG requirement for compacted JSON-LD for AS2.0
