Feature Detection in JSON-LD Processors #33

Closed
msporny opened this issue Jul 6, 2018 · 12 comments

Comments

@msporny
Member

msporny commented Jul 6, 2018

If you look at the latest syntax specification, I count roughly 32 features slated to be included in JSON-LD 1.1:

https://json-ld.org/spec/latest/json-ld/#advanced-concepts

JSON-LD was always intended to be simple to understand and use. In fact, the appearance of simplicity (even though the processors are not simple) was a driving design goal. It's why the specification is written the way it is... we wanted everyday developers to be able to read it as the primary documentation. I think we started losing that a bit toward the end, but we have received multiple compliments from Web developers who knew nothing about RDF (and still don't know anything about RDF) on how simple it was to understand and use JSON-LD by just reading the specification.

I think we're going to lose that if we keep adding features. I don't think the argument is to not add new features, but rather to do so in a way that keeps the core JSON-LD 1.1 spec readable by regular Web developers.

I also think that there is a way to enable experimental features to be specified in CGs and then later pulled into "official W3C specifications" without getting into a brawl over whether the feature is useful or not prematurely.

So, here's a proposal:

  1. Give each JSON-LD feature a name. For example: aliasing, reverseProperties, typeCoercion, etc.
  2. Each JSON-LD version will officially support a set of these features. For example, JSON-LD 1.0 supports roughly 20, JSON-LD 1.1 supports roughly 30, and so on.
  3. Move some of the less-used features (based on real-world data/usage) into an "Advanced JSON-LD Features" specification to keep the base specification simpler and easier to read.
  4. Extend the @version keyword to take an array, where you can specify experimental extensions. For example: `"@version": [1.1, "amazingExtensionFoo", "nicheExtensionBar"]` - processors throw if they don't understand every extension listed.
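To make point 4 concrete, here's a minimal sketch (in Python) of how a processor might validate such an extended @version array before processing. Everything here is hypothetical - the supported version, the extension names, and the check itself are illustrative, not spec-defined:

```python
# Hypothetical sketch of the proposed "@version" array check.
# SUPPORTED_VERSION and SUPPORTED_EXTENSIONS are illustrative only.

SUPPORTED_VERSION = 1.1
SUPPORTED_EXTENSIONS = {"amazingExtensionFoo"}

def check_version(context):
    """Throw if the context declares a version or extension we don't support."""
    declared = context.get("@version", 1.0)
    entries = declared if isinstance(declared, list) else [declared]
    for entry in entries:
        if isinstance(entry, (int, float)):
            if entry > SUPPORTED_VERSION:
                raise ValueError(f"unsupported JSON-LD version: {entry}")
        elif entry not in SUPPORTED_EXTENSIONS:
            # per the proposal: throw on any extension we don't understand
            raise ValueError(f"unknown extension: {entry}")
```

With this sketch, a processor that doesn't know "nicheExtensionBar" would throw on `{"@version": [1.1, "nicheExtensionBar"]}`, which is the hard-failure behaviour the proposal asks for.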

Benefits of this proposal:

  • We simplify the prose and length of the JSON-LD Core Syntax specification.
  • We enable people to suggest and spec out new features without having to go through the JSON-LD WG gauntlet, which will inevitably kill innovation before it has a chance in the wild.

Drawbacks of this proposal:

  • The idea is not thoroughly thought through - it may not be possible at this stage... I wish we had built this into JSON-LD 1.0.
  • Added technical complexity for implementers.
  • Potential explosion in extensions resulting in reduced interop.
@akuckartz

Maybe something like the https://modernizr.com feature detection can be used to "detect" JSON-LD features?

@gkellogg
Member

gkellogg commented Jul 6, 2018

IMHO, creating separate features, in most cases, reduces interoperability. There may be some narrow cases where the producer and consumer can agree on certain features outside the norm, but that's not how JSON-LD is typically used these days.

I think the 30-some "features" can be grouped into categories that make it look less complex. For example, several features have to do with using objects to map data (language-maps, data-maps, graph-maps, id-maps, type-maps). In reality, there is a single feature to allow data to be mapped, which is supported orthogonally across different types of things. The spec needs an update to put things into such groupings to create a more uniform flow to the document.
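As a concrete example of that mapping family (shown as a Python dict for convenience; the term URL is made up), a language map turns the keys of an object into language tags via `"@container": "@language"`:

```python
# One member of the "data mapping" family: a language map. With
# "@container": "@language", the keys of the "label" object are language
# tags rather than separate nodes. (Shown as a Python dict for convenience;
# the example.org term IRI is illustrative.)

doc = {
    "@context": {
        "label": {
            "@id": "http://example.org/label",
            "@container": "@language",
        }
    },
    "label": {"en": "Queen", "de": "Königin"},
}

# The other map types (id maps, type maps, graph maps, index maps) follow
# the same shape: a keyword in "@container" turns object keys into data.
```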

One thing we could consider would be to create a separate Dataset spec, and allow the main spec to focus on creating data in a single graph. That would reduce complexity for most users, and allow some of the newer features to be moved into that spec.

@iherman
Member

iherman commented Jul 6, 2018

I like your last proposal, i.e., a separate Dataset spec :-)

@ericprud
Member

ericprud commented Jul 7, 2018

+1 to @gkellogg. Frequently, new standards have to low-ball the barrier to entry. If it seems like a lot of effort, folks won't bother, even if they will, in the long run, expend more effort reinventing the functionality. Once established, the rules change; potential users are already convinced that it's worth investing some tutorial time, and you already have a committed implementer base. At that point, the cost of spotty implementation of features exceeds the cost of comprehensive implementation burden. I take as evidence SOAP, WSDL, HTTP 1.1, and every successive version of HTML. The XML editions didn't add features, but no one wanted them.

SOAP 1.0 is a bit of a counterexample, in that SOAP encoding didn't have uptake and didn't make it into 1.1. On the other hand, there aren't a lot of proposed JSON-LD 1.1 features that could be implemented as a separate pre- or post-processing step.

I'd leave it to the editors to decide if Dataset could/should be a separate spec. It's kind of nice for readers to have a topic-oriented document or section, but not if it has to poke its fingers into much of the base spec.

@msporny
Member Author

msporny commented Jul 8, 2018

IMHO, creating separate features, in most cases, reduces interoperability.

I agree that creating separate features with no plan on how they get integrated into the main language reduces interop.

However, having a plan on how you fold new experimental features into the core of a language mitigates that issue in the long-term. We should be thinking long-term.

For example, @akuckartz's example of Modernizr is exactly what I'm talking about. Can we get to a place where we enable folks to polyfill JSON-LD with experimental features, without breaking existing deployed JSON-LD, until it becomes clear that many people are using those features? Can we keep those features *outside* of the core specification?

If we don't do that, I have a feeling that we're going to spend a non-trivial amount of time doing what most WGs do during a 2nd iteration of a spec - arguing over whether or not to add features to the core language and making a few bad calls on BLINK-tag like features.

We need a strategy to polyfill JSON-LD until the usefulness of features is backed by data, instead of a very small group of intelligent and opinionated individuals deciding a feature's future usefulness based on a very small sample size. That's not how science should be done. :)

@dlongley
Contributor

I like this idea of incremental feature support and transitioning features from "polyfills" to specs. I'm not sure on the specific details for how we should do this, but the proposed concept is essentially the same as what the major Web browsers do and it works for them.

@msporny msporny changed the title Modularization of JSON-LD features Feature Detection and Modularization of JSON-LD features Jul 20, 2018
@msporny msporny changed the title Feature Detection and Modularization of JSON-LD features Feature Detection in JSON-LD Processors Jul 20, 2018
@BigBlueHat
Member

Thanks for bringing this up, Manu! I especially think this needs discussion as we talk about how soon one can "get at" features like "lists of lists" (see #41). Had that been there, or been optionally enable-able, ActivityStreams 2.0 would likely have used application/ld+json as its media type. As it is, they minted a new one (application/activity+json) because they allow "raw" GeoJSON to be embedded (see https://www.w3.org/TR/activitystreams-core/#h-extensibility).

So, now with those in the wild, I'm curious how we could "upgrade" them (when processed) to keep their list-of-lists data in the output graph--without all the past published JSON having to be edited (i.e. just their @context documents refined/improved).

The details of how we trigger that are where things will get... fiddly.

I'm not keen on the @version array approach, as it will essentially create a very wide range of "versions" of JSON-LD. My preference would be for features to be "sniffed" from the documents (data and context) and, if the processor supports them, for the output graphs to become "richer."

Perhaps we could use the list-of-lists feature as a test case for different approaches to both signalling and processing?
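As a rough illustration of the "sniffing" idea (purely hypothetical detection logic, not anything a spec defines), a processor could walk the document looking for a list nested directly inside another list, which is the marker of the 1.1-only list-of-lists feature:

```python
# Hypothetical feature "sniffing": walk the document and look for a
# "@list" value that directly contains another "@list" object, i.e. a
# list of lists, which a 1.0 processor cannot represent.

def uses_list_of_lists(node):
    """Return True if any "@list" value directly contains another "@list"."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@list" and isinstance(value, list):
                if any(isinstance(item, dict) and "@list" in item
                       for item in value):
                    return True
            if uses_list_of_lists(value):
                return True
    elif isinstance(node, list):
        return any(uses_list_of_lists(item) for item in node)
    return False

# GeoJSON-style nested coordinates expressed with nested "@list"s
doc = {"coords": {"@list": [{"@list": [{"@value": 102.0},
                                       {"@value": 0.0}]}]}}
```

A processor could run a check like this over the expanded input and only enable the richer 1.1 output when the feature is actually present.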

@lanthaler
Member

lanthaler commented Aug 20, 2018

We need a strategy to polyfill JSON-LD until the usefulness of features is backed by data, instead of a very small group of intelligent and opinionated individuals deciding a feature's future usefulness based on a very small sample size. That's not how science should be done. :)

How would you envision that working? With HTML etc. it works because browsers can leverage code-on-demand in the form of JavaScript. I don't think we should go down that route with a data interchange format that has to work across a plethora of languages, stacks, and contexts. Unfortunately, I don't have an idea of how to make it work, apart from providing a fall-back mechanism (an alternate URL?) that lets older processors load a pre-processed document they can understand, at the expense of it perhaps consuming more bandwidth or being less idiomatic. Such fallback URLs could, of course, be web APIs whose sole purpose is to do such transformations.
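A rough sketch of that fallback idea. The `fallbackUrl` key, the `fetch_json` callable, and the single-object-context assumption are all illustrative - nothing here is specified anywhere:

```python
# Hypothetical fallback scheme: a document points older processors at a
# pre-processed, 1.0-compatible alternative. "fallbackUrl" is a made-up
# key; this also assumes "@context" is a single object, not a list.

MAX_VERSION = 1.0  # pretend we are an older, 1.0-only processor

def load_document(doc, fetch_json):
    """Use the document directly if we understand its version; otherwise
    fetch the pre-processed alternative the publisher points at."""
    ctx = doc.get("@context", {})
    version = ctx.get("@version", 1.0)
    if version <= MAX_VERSION:
        return doc
    fallback = ctx.get("fallbackUrl")  # hypothetical key, not in any spec
    if fallback is None:
        raise ValueError(f"unsupported version {version} and no fallback")
    # the fallback is larger / less idiomatic, but understandable
    return fetch_json(fallback)
```

The fetch could just as well hit a web API whose sole purpose is to perform the down-conversion on demand.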

@BigBlueHat
Member

Maybe we have this upside down. What if instead of putting the version number in the inbound document, we put that "knowledge" on the outbound side--i.e. "this was processed as if it were a 1.1 document."

For instance, the list-of-lists situation is equally "valid" (i.e. expressible) in JSON-LD 1.0 and 1.1. However, the processing output is different (one creates a linear output list of stuff, the other a list containing lists, which preserves what's visible in the JSON).

Who should care about that fact more? The person creating the JSON-LD (which, in this case, is identical in every respect) or the person processing it (which is where things become "meaningful")?
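A toy contrast of those two behaviours (this is ordinary Python list handling, not actual JSON-LD expansion):

```python
# Toy illustration of the two outcomes being contrasted: flattening
# nested lists into one linear list versus preserving the nesting
# that is visible in the JSON.

def flatten(values):
    """Flattening outcome: collapse any nested lists into one linear list."""
    out = []
    for v in values:
        if isinstance(v, list):
            out.extend(flatten(v))
        else:
            out.append(v)
    return out

# GeoJSON-style nested coordinates, as they appear in the JSON
coords = [[102.0, 0.0], [103.0, 1.0]]

linear = flatten(coords)  # one flat list, nesting lost
preserved = coords        # the nesting visible in the JSON survives
```

The identical input yields two different graphs depending on which behaviour the processor applies, which is exactly why the "who should care" question matters.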

@lanthaler
Member

I'm not sure I follow what problem that would solve!?

On the receiving end users have full control over what processor they use (and they know which one they used). What they don't know - without additional information - is how the sender intended the message to be understood. The media type application/ld+json stipulates a contract that is defined by the 1.0 spec. Everything that breaks that contract needs to be communicated somehow to avoid "unintended misunderstanding" aka interoperability issues.

Unlike HTML, which is typically written for human consumption (i.e., extremely smart processors 😄) and can leverage code on demand, we unfortunately can't rely on an "ignore everything unknown" approach in this context. We need to actively break 1.0 processors if they would interpret the data incorrectly, as that could have serious negative side effects.

@BigBlueHat
Member

@danbri we're discussing this tomorrow at TPAC (per our current schedule). Your input here would be helpful as we look toward "upgrading" existing JSON-LD.

@iherman
Member

iherman commented Feb 9, 2019

This issue was discussed in a meeting.

  • RESOLVED: Close #33, wontfix. Extension mechanism is just to add features to the context that a processor does not understand.
Transcript: 5.1. Feature Detection in JSON-LD Processors
Adam Soroka: #33
Gregg Kellogg: Close won’t fix for #33?
Rob Sanderson: +1 to close wontfix, due to lack of time and the extent of the new work
Gregg Kellogg: this would injure interoperability
Rob Sanderson: agreed
… and it’s a big ask to prescribe all the features
Ivan Herman: do we close it? or defer it?
David I. Lehn: This was a while ago
… we were coming up with lots of features
Gregg Kellogg: and then mediatypes have been used for just this
David Newbury: I would happily close
… this kind of version inspection– the complexity outweighs any benefit
… we want to put the burden on implementors, this does the opposite
David I. Lehn: one place this might help is with something like JSON literals,
Rob Sanderson: that goes right to the interop question
Gregg Kellogg: the reason we needed @version is to make a 1.0 processor die because it would not check the range of various keys
… which we’ve tightened up in 1.1.
… we used to leave that open
… so adding something more specific to @version would be gratuitous, in that sense
Ivan Herman: why would this help the user?
… I don’t care about the devs– they will manage
… but this will complicate life for the users!
… I don’t see who would gaim
Proposed resolution: Close #33, wontfix. Extension mechanism is just to add features to the context that a processor does not understand. (Rob Sanderson)
Ivan Herman: +1
Gregg Kellogg: +1
David Newbury: +1
Jeff Mixter: +1
Rob Sanderson: +1
Simon Steyskal: +1
Harold Solbrig: +1
ajs6f: +1
Resolution #5: Close #33, wontfix. Extension mechanism is just to add features to the context that a processor does not understand.

9 participants