Data round tripping - Sandro's review #237
So the idea here is that going RDF->JSON-LD->RDF we want to get back an isomorphic graph? (That is, the same triples, but with blank nodes replaced in a consistent way.) I think people might want (or accept) some rewriting of literals that doesn't change the value, like "01"^^xs:int being rewritten as "001"^^xs:int or "1"^^xs:int. In the SPARQL world that kind of rewriting is allowed; in parsing of RDF/XML or Turtle it is not. (I'm pretty sure SPARQL also lets you convert "1"^^xs:int to "1"^^xs:integer, and such.)

I don't really care which way we decide on this, but there should be a decision and test cases, so people know what to expect. So can literals be rewritten in RDF -> JSON-LD -> RDF or not?

If they can't, then the RDF->JSON and JSON->RDF algorithms have to line up. As you have it now, JSON->RDF MUST produce canonical literals; that's fine. But we then also have to say in RDF->JSON that only literals already in canonical form are to be output as native types; the rest MUST be left as strings in type objects.

If literals CAN be rewritten, then it seems to me both algorithms are free to ignore canonical form. Right? Why would we require canonical literals in JSON->RDF? Of course, canonical form is a friendly/nice thing to produce, but it doesn't actually help with round tripping.

(The issue of handling things like 1.9999999999999999999999E0 is trickier and less important, so let's settle the above one first.)
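To make the concern concrete, here is a minimal sketch (the IRIs are made up for illustration) of what happens when a non-canonical integer literal is converted to a native JSON number and back:

```
# RDF input with a non-canonical lexical form
<http://example.com/a> <http://example.com/count> "01"^^<http://www.w3.org/2001/XMLSchema#integer> .

# RDF->JSON-LD with conversion to a native JSON number; the lexical form "01" is gone
#   { "@id": "http://example.com/a", "http://example.com/count": 1 }

# JSON-LD->RDF then writes the number out in canonical form, not the original form
<http://example.com/a> <http://example.com/count> "1"^^<http://www.w3.org/2001/XMLSchema#integer> .
```

The value is unchanged, but the two graphs differ character by character at the literal level, which is exactly the question raised above.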
Yes, that's correct. We already have test cases for this:
and a combined one for the reverse direction: fromRdf-0002-in.nq -> fromRdf-0002-out.jsonld.
Yes.
We produce canonical literals to simplify testing. Why should we convert only literals already in canonical form to native types? To simplify things? That would be OK with me.
I don't think simplifying testing merits a MUST..... Or, if it does, then say that, instead of saying it's because of round-tripping....
Yeah -- I don't have much opinion on this, as long as the story makes sense. Seems like something RDF WG folks might care about, though -- whether round-tripping an RDF graph through JSON preserves the non-canonical forms of literals (e.g. the number of leading zeros on an integer). We might ask at the same time whether people care about values like 1.99999999999999999999999E0 being preserved through such round-tripping. Do you want to ask or shall I?
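For what it's worth, that particular value cannot survive a detour through a native JSON number: JSON numbers are typically parsed as IEEE 754 doubles, and the nearest double to that value is exactly 2. A sketch of what a round trip would do (the exact output depends on the canonicalization rules chosen, so treat this as illustrative only):

```
# xsd:double literal from the example above
"1.99999999999999999999999E0"^^xsd:double

# parsed into an IEEE 754 double it becomes exactly 2,
# so converting back to RDF would yield something like
"2.0E0"^^xsd:double
```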
RESOLVED: Specify what canonical lexical form is for xsd:integer and xsd:double by referencing the XML Schema 1.1 Datatypes specification. When processors are generating output, they are required to use this form.
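For readers who don't want to dig into the XML Schema 1.1 Datatypes spec, a few informal examples of what the canonical lexical forms look like (this is a summary, not the normative wording):

```
# input lexical form             canonical lexical form (XSD 1.1)
"007"^^xsd:integer          ->   "7"^^xsd:integer
"+42"^^xsd:integer          ->   "42"^^xsd:integer
"100"^^xsd:double           ->   "1.0E2"^^xsd:double
"0.0000123"^^xsd:double     ->   "1.23E-5"^^xsd:double
```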
And....? That doesn't address my questions about round tripping.
also, not to nitpick, but "processor" is a poor choice of name for things implementing the API -- as evidenced by the fact that the resolution uses that word when it should have used the term Implementation. Right? |
Sorry, as usual I just pasted the resolution we made during the telecon. We had quite a long discussion about this. The majority of the group thought it is required to achieve interoperability. I personally think that the conversion of JSON-LD to RDF (the abstract syntax) does not need to require the canonical lexical form, but I wasn't able to argue for it properly. We also had problems finding guidance in RDF Concepts. All we found was literal equality (http://www.w3.org/TR/rdf11-concepts/#dfn-literal-equality), which requires literals to match character by character. Perhaps we should discuss this briefly in tomorrow's RDF WG telecon.
Yes, you are completely right. What about changing Implementation to "JSON-LD 1.0 Processor" and Processor to "JSON-LD 1.0 API Implementation" (a bit clunky but definitely clearer I think)? |
Yes, I agree those terms are better. And yes, let's talk about roundtripping with the WG. |
@sandhawke As we were discussing the need to specify lexical form, it seemed that RDF Concepts defines a literal as having a lexical form, a datatype and (optionally) a language tag, and that two literals are equal only if all of these compare equal. For the specific case of JSON numbers, it is impossible to use the original lexical representation, as it is not maintained when a JSON document is processed. Therefore, when generating an RDF literal, a processor must choose a lexical representation for the native values it is working with. For interoperability, it seems that we must specify a format for this.

The issue of round-tripping RDF->JSON-LD->RDF is somewhat different. The algorithm does involve a conversion to native form, but by re-using previous language, this can be controlled by an option, so that xsd:integer and xsd:double values can keep their string representation, which eliminates such round-tripping issues.
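To illustrate the option Gregg describes (the exact option name and output shape were still being settled at this point, so this is only a sketch; the real expanded output of the API additionally wraps values in arrays):

```
# RDF input
<http://example.com/a> <http://example.com/age> "01"^^<http://www.w3.org/2001/XMLSchema#integer> .

# RDF->JSON-LD with native-type conversion disabled: the value stays a typed
# value object, so the exact lexical form "01" survives a round trip
{
  "@id": "http://example.com/a",
  "http://example.com/age": {
    "@value": "01",
    "@type": "http://www.w3.org/2001/XMLSchema#integer"
  }
}
```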
When talking about algorithms, I think that "processor" is a reasonable term, and consistent at least with how RDFa refers to them. When talking about the API, I think that talking about implementations is appropriate.
... and clarify parts of the relevant algorithms. @sandhawke, could you please have a look at the new section and tell me whether it's clearer or if it still needs some love. Thanks. This addresses #237.
…mentation See #237 (comment) This addresses #237. /cc @msporny @dlongley @gkellogg @niklasl @sandhawke
The data round-tripping section [1] has been improved considerably. Sandro already indicated that the updates address his concerns [2]. I will thus close this issue in 24 hours unless I hear objections. [1] http://json-ld.org/spec/latest/json-ld-api/#data-round-tripping
_This has been raised in #234 by @sandhawke. To simplify discussions, I've created this separate issue for it. Below is Sandro's original mail and my reply._
Not sure I understand the difference!?
We are just rephrasing it. Since this spec is addressed to JSON developers, we wanted to avoid forcing them to read the XML Schema spec. What do others think about this?
I would say so. It's just an example of how this could be done in one specific programming language.
It is there to ensure that the result is deterministic and testing is simplified (you can verify the result using simple string comparison).
Considering that, do you think we need to change something?
Is the word "normalized" confusing you? That's probably a left over from the normalization algorithm. What we are trying to say here is: if you have decimal values (e.g. money) you shouldn't use JSON number or a xsd:double but a string. Maybe we can just drop this sentence!?
Yes, we mean exactly that. You should use strings instead. In most cases this won't matter and consequently I don't think the MUST you propose makes much sense. JSON developers want numbers and not strings. Just out of curiosity, isn't the same true in Turtle for instance?