JSON-LD API Review by Robin Berjon #200

msporny · 2012-11-23T16:08:01Z

A review of the JSON-LD API by @darobin:

• Probably not something you can do anything about, but you folks must be the last people on Earth still talking about IRIs.

• If you add the "highlight" class to your code examples, they get some syntax highlighting.

• "JSON Object". I find this definition confusing. It's unclear whether you're talking about something in memory or the syntax. If the latter, you should just refer to RFC4627. If the former, then refer to WebIDL (or define it in WebIDL). Same all the way up to "null".

• You should put parentheses around your union types (everywhere).

• [NoInterfaceObject] is usually a red flag. In the case of JsonLdProcessor I'm not sure what to advise because it's unclear to me how this is expected to be exposed. The examples just reference a "jsonld" object but don't say where it's coming from. How about you drop the [NIO] and instead add a [Constructor]? That way you can "new JsonLdProcessor" and also do feature detection to know if it's supported in a context.

• You shouldn't just in prose say that it's okay to not use the asynchronous approach. If you really want a synchronous interface, then you should define it properly in it own section (as something that JS contexts would only support in e.g. Workers). But really, an asynchronous interface is probably the way to go.

• You list things that look like error constants (INVALID_SYNTAX) but they don't appear to be defined anywhere. I'm guessing that you probably want to have an enum somewhere for those. You don't have to go all uppercase on our arses, too "invalid syntax" is nice and simple!

• The error cases should be linked to specific algorithmic steps that will cause them to be raised.

• Should it really be an error if the compaction would lose some information? As a developer, I would much rather the algorithm compact as much as possible, but no further. So I wouldn't lose information, but I also wouldn't get an error. The result might be less compact, but IMHO more useful.

• You might want to be clearer about what you mean by copying the input. Make sure implementers don't modify the structure in place.

• WebIDL does not have "string" and "number". Use DOMString and one of the numeric types for that (I'm guessing double?).

• [NoInterfaceObject Callback] is gone, you should use the WebIDL callback construct instead. Loving the Node-style callbacks.

• I this toRDF() is probably designed wrong. Why are you sending one callback per Quad? Wouldn't it be simpler to callback with Quad[]? Either way, if there is a good reason to have multiple callbacks then you also need a way to signal that all the data has been processed — otherwise I can never know how many to expect and do whatever work I need to do next.

• Is ConformanceCallback really needed? It seems overly complicated for no good reason. If it's recoverable, then recover. If people want to write linting tools, then let them. But a processor shouldn't have to deal with this, and a developer against the API should never have to care. It looks like useless old school pedantry.

• You also don't say what happens if I don't call the passed callback back.

• Why have a type for IRI? It's just a bloody string! Have the relevant bits accept DOMString, and define what happens when they're not valid.

• Don't use "function" as a type, define a callback type.

• Seriously, you might want to state somewhere at the beginning of the document that "Throughout this specification, the term 'URL' refers to a IRI as used in RDF." There are maybe five people who know what an IRI is, and half of them think it's a silly idea. No developer will have a clue.

• If Quad is just meant to convey some data, it should be a dictionary. If you really want it to be an object, then it should expose an interface through a Constructor. In the latter case it should probably take the four fields as params. But really, this looks like a dictionary to me. I would just want to pass { subject: "...", ... } and not have to construct anything.

• Why define a Node? It serves no purpose whatsoever. Drop it!

• You've now defined an IRI object twice! Don't. This is a string.

• Same for BlankNode!

• Literal clearly seems to be a dictionary.

• Ah I get it now, you're using the object type to know what it is you're dealing with in the Quad. That's actually bad (but in a different way), it won't work at all.

First, if you're on the receiving end of a Quad, I'm presuming you're expecting people to go "if (q.subject instanceof IRI)..." Except you've defined the "IRI" interface to be NoInterfaceObject so there is actually nothing to put on the right hand side of that instanceof. Second, Even if you remove the NoInterfaceObject, it's still problematic. If I pass the Quad to another frame because I'm delegating processing, the interface object will be different and so the comparison fails (yay!). And we probably don't really want to have global "IRI" objects (and we already have Node).

I think you're better off making each of those a dictionary with keys type and value (except for Literal, that gets type and its three fields). That's easier to test for, safer, and eliminates the interfaces.

Perhaps better still: do you really need this RDF stuff? Isn't it more the job of a library to output some given syntax based on processing one of the normalised forms?

darobin · 2012-11-23T16:14:35Z

FWIW I'm watching this thread so if there are questions don't hesitate to post them here as I'll get them.

cygri · 2012-11-23T17:20:14Z

FWIW, I think that IRIs and HTML-style URLs are equivalent in UTF-8 encoded documents.

I'm actually glad to hear that the Node class thingy doesn't work, as it seemed a bit like overkill. A dictionary sounds simpler. You could probably re-use a lot of the existing spec machinery by saying that its keys are @id, @value, @type and @lang. Except that the @ is inconvenient as it makes things like if (myquad.object.lang && myquad.object.lang == 'en') harder to write.

This addresses #200

lanthaler · 2012-11-27T15:04:57Z

• Probably not something you can do anything about, but you folks must be the last people on Earth still talking about IRIs.

What would you suggest instead? URL?

• If you add the "highlight" class to your code examples, they get some syntax highlighting.

• "JSON Object". I find this definition confusing. It's unclear whether you're talking about something in memory or the syntax. If the latter, you should just refer to RFC4627. If the former, then refer to WebIDL (or define it in WebIDL). Same all the way up to "null".

You are right, this is unclear in the API spec. We did a better job in the syntax spec: http://json-ld.org/spec/latest/json-ld-syntax/#terminology. The reason we call it "JSON object" is to avoid confusion with the object in a triple.

_Edit:_ Improved in fb6cc19

• You should put parentheses around your union types (everywhere).

So object or object[] or IRI input should be (object or object[] or IRI) input?

_Edit:_ Fixed in 1120c85. I didn't do it for dictionaries and the callback definition as that seems to break ReSpec.

• [NoInterfaceObject] is usually a red flag. In the case of JsonLdProcessor I'm not sure what to advise because it's unclear to me how this is expected to be exposed. The examples just reference a "jsonld" object but don't say where it's coming from. How about you drop the [NIO] and instead add a [Constructor]? That way you can "new JsonLdProcessor" and also do feature detection to know if it's supported in a context.

I changed it in e04560a

• You shouldn't just in prose say that it's okay to not use the asynchronous approach. If you really want a synchronous interface, then you should define it properly in it own section (as something that JS contexts would only support in e.g. Workers). But really, an asynchronous interface is probably the way to go.

That prose is there since we have implementations in Ruby, PHP, and a few other languages which generally use synchronous interfaces. Would you nevertheless define it explicitely?

• You list things that look like error constants (INVALID_SYNTAX) but they don't appear to be defined anywhere. I'm guessing that you probably want to have an enum somewhere for those. You don't have to go all uppercase on our arses, too "invalid syntax" is nice and simple!
• The error cases should be linked to specific algorithmic steps that will cause them to be raised.

That's true, the whole error handling is not properly specified at the moment. This is being tracked in #153.

• Should it really be an error if the compaction would lose some information? As a developer, I would much rather the algorithm compact as much as possible, but no further. So I wouldn't lose information, but I also wouldn't get an error. The result might be less compact, but IMHO more useful.

I agree completely with you.

• You might want to be clearer about what you mean by copying the input. Make sure implementers don't modify the structure in place.

OK. What is meant is to copy the input so that it can then be modified in place by the algorithms

• WebIDL does not have "string" and "number". Use DOMString and one of the numeric types for that (I'm guessing double?).

Are you talking about JsonLdProcessingError? That is being discussed as part of #153 as well.
_Update:_ Fixed in 1120c85 and e8f0cfc

• [NoInterfaceObject Callback] is gone, you should use the WebIDL callback construct instead. Loving the Node-style callbacks.

Do you know how that's done with ReSpec? I couldn't find it.
_Update:_ Fixed in 261c94b

• I this toRDF() is probably designed wrong. Why are you sending one callback per Quad? Wouldn't it be simpler to callback with Quad[]? Either way, if there is a good reason to have multiple callbacks then you also need a way to signal that all the data has been processed — otherwise I can never know how many to expect and do whatever work I need to do next.

+1, edit: changed in 077d498
_Update:_ and removed completely in 261c94b :-)

• Is ConformanceCallback really needed? It seems overly complicated for no good reason. If it's recoverable, then recover. If people want to write linting tools, then let them. But a processor shouldn't have to deal with this, and a developer against the API should never have to care. It looks like useless old school pedantry.

• You also don't say what happens if I don't call the passed callback back.

I also think it's way too complex. Again, see issue #153.
_Update:_ Removed in 7a8e01e.

• Why have a type for IRI? It's just a bloody string! Have the relevant bits accept DOMString, and define what happens when they're not valid.

Good point
_Update:_ Fixed in 1120c85

• Don't use "function" as a type, define a callback type.

_Update:_ Fixed in 261c94b

• Seriously, you might want to state somewhere at the beginning of the document that "Throughout this specification, the term 'URL' refers to a IRI as used in RDF." There are maybe five people who know what an IRI is, and half of them think it's a silly idea. No developer will have a clue.

Would be fine for me.. Don't know what others think of this.

• If Quad is just meant to convey some data, it should be a dictionary. If you really want it to be an object, then it should expose an interface through a Constructor. In the latter case it should probably take the four fields as params. But really, this looks like a dictionary to me. I would just want to pass { subject: "...", ... } and not have to construct anything.

• Why define a Node? It serves no purpose whatsoever. Drop it!

• You've now defined an IRI object twice! Don't. This is a string.

• Same for BlankNode!

• Literal clearly seems to be a dictionary.

• Ah I get it now, you're using the object type to know what it is you're dealing with in the Quad. That's actually bad (but in a different way), it won't work at all.

First, if you're on the receiving end of a Quad, I'm presuming you're expecting people to go "if (q.subject instanceof IRI)..." Except you've defined the "IRI" interface to be NoInterfaceObject so there is actually nothing to put on the right hand side of that instanceof. Second, Even if you remove the NoInterfaceObject, it's still problematic. If I pass the Quad to another frame because I'm delegating processing, the interface object will be different and so the comparison fails (yay!). And we probably don't really want to have global "IRI" objects (and we already have Node).

I think you're better off making each of those a dictionary with keys type and value (except for Literal, that gets type and its three fields). That's easier to test for, safer, and eliminates the interfaces.

I think this whole stuff came from a never finished RDF API, @gkellogg knows more about it.

_Update:_ All of this has been removed in 261c94b

Perhaps better still: do you really need this RDF stuff? Isn't it more the job of a library to output some given syntax based on processing one of the normalised forms?

Very good question :-)

_Update:_ All of this has been removed in 261c94b

This addresses #200.

lanthaler · 2012-11-27T16:31:01Z

RESOLVED: The callback signature for the .toRDF() method should accept Quad[]. That is, the callback is called once after all processing has been completed.

The toRDF() algorithm hasn't been updated yet to call the callback. This addresses #200.

lanthaler · 2012-12-11T15:59:28Z

RESOLVED: Remove the .toRDF() and .fromRDF() WebIDL API calls into a separate document that will not be a part of the JSON-LD 1.0 work. The to/from RDF algorithms will still be a part of the JSON-LD API 1.0 work.

niklasl · 2012-12-11T16:10:17Z

When continuing work on toRDF/fromRDF (in this separate document), I agree on using just dictionaries to represent RDF nodes. I suggest to base these on the JSON representation for SPARQL variable bindings. (Though perhaps changing xml:lang to just lang...)

Cleanup references within RDF algorithms to not require RDf WebIDL definitions. This addresses issue #200.

This addresses #153 and #200.

This addresses #200.

I didn't put parentheses around the union types in interfaces and the callback as that seems to break ReSpec. This addresses #200.

lanthaler · 2012-12-14T17:08:35Z

I've addressed most of the issues outlined above. I created a separate issues for the remaining two, namely #205 and #206. Error handling is being discussed in #153.

As per our usual procedure I will therefore close this issue in 24 hours unless I hear objections.

This addresses #153 and #200

…eters This addresses #200.

@darobin

... as suggested by @darobin in #200.

lanthaler added a commit that referenced this issue Nov 27, 2012

Replace NoInterfaceObject with constructor for JsonLdProcessor

e04560a

This addresses #200

lanthaler added a commit that referenced this issue Nov 27, 2012

Remove LOSSY_COMPACTION error constant

01f3841

This addresses #200.

lanthaler added a commit that referenced this issue Nov 27, 2012

Update QuadCallback to use array of Quads instead of just a single quad

077d498

The toRDF() algorithm hasn't been updated yet to call the callback. This addresses #200.

gkellogg added a commit that referenced this issue Dec 13, 2012

Move RDF API methods to new json-ld-rdf document.

e8f0cfc

Cleanup references within RDF algorithms to not require RDf WebIDL definitions. This addresses issue #200.

lanthaler added a commit that referenced this issue Dec 14, 2012

Remove conformance callback

7a8e01e

This addresses #153 and #200.

lanthaler added a commit that referenced this issue Dec 14, 2012

Use WebIDL's new callback construct for JsonLdCallback

261c94b

This addresses #200.

lanthaler added a commit that referenced this issue Dec 14, 2012

Fixed all WebIDL datatypes and wrapped union types in parentheses

1120c85

I didn't put parentheses around the union types in interfaces and the callback as that seems to break ReSpec. This addresses #200.

This was referenced Dec 14, 2012

Use the term URL instead of IRI in the (API) spec #205

Closed

Clarify that the algorithms operate on a copy of the input #206

Closed

lanthaler added a commit that referenced this issue Dec 14, 2012

Collect error codes in an enum

9efe211

This addresses #153 and #200

lanthaler closed this as completed Dec 16, 2012

lanthaler added a commit that referenced this issue Dec 18, 2012

Add statement that JSON-LD processors must not modify the input param…

eb12eee

…eters This addresses #200.

lanthaler added a commit that referenced this issue Jan 8, 2013

Enumerate all error codes and changed them to simple strings

9d9adbd

... as suggested by @darobin in #200.

This was referenced Mar 29, 2013

Sandro Hawke's JSON-LD 1.0 Processing Algorithms and API spec review #234

Closed

Zhe Wu's JSON-LD 1.0 Processing Algorithms and API spec review #236

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JSON-LD API Review by Robin Berjon #200

JSON-LD API Review by Robin Berjon #200

msporny commented Nov 23, 2012

darobin commented Nov 23, 2012

cygri commented Nov 23, 2012

lanthaler commented Nov 27, 2012

lanthaler commented Nov 27, 2012

lanthaler commented Dec 11, 2012

niklasl commented Dec 11, 2012

lanthaler commented Dec 14, 2012

JSON-LD API Review by Robin Berjon #200

JSON-LD API Review by Robin Berjon #200

Comments

msporny commented Nov 23, 2012

darobin commented Nov 23, 2012

cygri commented Nov 23, 2012

lanthaler commented Nov 27, 2012

lanthaler commented Nov 27, 2012

lanthaler commented Dec 11, 2012

niklasl commented Dec 11, 2012

lanthaler commented Dec 14, 2012