What if a @context URL response is HTML? #66

Closed
BigBlueHat opened this issue Feb 25, 2019 · 18 comments

Comments

@BigBlueHat
Member

See #53.

The HTML-based response has the value of potentially providing documentation for the contained context, but we lack a few things and/or need to clarify others.

See @gkellogg's suggestions at #53 (comment)

@azaroth42
Contributor

Then it's an error.

We dereference the remote context and replace context with the value of the @context member of the top-level object in the retrieved JSON-LD document. If there's no such member, an invalid remote context has been detected.

From https://www.w3.org/TR/json-ld11-api/#overview
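The rule quoted above can be sketched roughly as follows. This is a minimal illustration, not the spec's algorithm: the names `InvalidRemoteContext` and `context_from_remote_document` are hypothetical, and the sketch treats a body that is not JSON at all (such as an HTML page) the same way as a JSON document without a top-level `@context`, which is a simplification.

```python
import json


class InvalidRemoteContext(Exception):
    """Hypothetical error type: the retrieved document has no usable @context."""


def context_from_remote_document(body: str):
    """Return the value of the top-level @context member of a retrieved document.

    Assumption: `body` is the already-dereferenced response text.
    """
    try:
        document = json.loads(body)
    except json.JSONDecodeError as exc:
        # An HTML response ends up here, because it does not parse as JSON.
        raise InvalidRemoteContext("response body is not JSON") from exc

    # The context must be a member of the top-level object.
    if not isinstance(document, dict) or "@context" not in document:
        raise InvalidRemoteContext("no top-level @context member")
    return document["@context"]
```

Under these assumptions an HTML response fails before the `@context` lookup is ever reached, which is the behavior this issue asks about.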

@gkellogg
Member

gkellogg commented Mar 1, 2019

It’s currently an error, but we can update to handle this case.

@azaroth42
Contributor

Only if there's a previous context that establishes a 1.1 processing mode, no?

@gkellogg
Member

gkellogg commented Mar 1, 2019

Maybe. There’s not really a danger of a 1.0 processor misinterpreting this, which is the purpose of the @version announcement, so we might allow this as 1.1 behavior. It’s really only when there’s a worry that different versions will process the same document differently that it matters.

@gkellogg
Member

This requires some syntax changes as well, and we should allow frames to be in HTML too.

@BigBlueHat
Member Author

While I like the opportunities this could open up, I remain a bit concerned that the JSON-LD API itself now requires HTML processing vs. having a separate process for extracting JSON-LD and then processing it.

Practically, it means that shipping fully conformant JSON-LD API code requires far more dependencies than it used to. I'd hoped that we could modularize it a bit more such that these things could be layered together when needed.

Would it make sense to move this HTML stuff to an "extraction" process, which is then fed to expansion/compaction/framing, etc?

@gkellogg
Member

Other than for context processing, it is handled in the WebIDL, with a section on extracting JSON-LD from HTML. We could probably move more algorithmic bits into a separate section and remove some steps from the context area, but there are some subtle differences that might make that more complex.

Once we decided to deal with JSON-LD in HTML normatively, the die was cast on the need to include HTML tools in a JSON-LD processor.

@iherman
Member

iherman commented Apr 26, 2019

Once we decided to deal with JSON-LD in HTML normatively, the die was cast on the need to include HTML tools in a JSON-LD processor.

Indeed.

@BigBlueHat
Member Author

BigBlueHat commented Apr 26, 2019

Once we decided to deal with JSON-LD in HTML normatively, the die was cast on the need to include HTML tools in a JSON-LD processor.

Maybe... 😃 Defining in the syntax document how one can normatively convey JSON-LD in HTML doesn't implicitly require that JSON-LD's core API be the thing to deal with it. For instance, you can convey HTML inside JSON, but that hasn't meant that HTML parsers should find, extract, and display those values when present (see also MIME, etc).

My concern (when we first discussed this and now) is that we've drastically raised the bar for API implementations when API changes were not in fact required to make the embedding cases normative from a syntax perspective.

It feels like once embedding in HTML became normatively defined the scope quickly crept to making HTML documents potential first-class JSON-LD documents...and that's where the scary parts come in...

At the API level my expectation was/is that there would be an extract() function that could be (optionally) run if/as needed and if/as wanted by various implementations or deployment configurations.

Keeping the bar low and/or layered for processing JSON-LD is important for its future success in areas like Web of Things, credentials, etc.

Is there a way (presently) that we can reconsider the API level concerns around extracting JSON-LD from HTML?
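For illustration, a standalone `extract()` step of the kind described above might look something like this. It is only a sketch: the function name comes from the comment, but its signature, the helper class, and the choice of Python's standard-library `html.parser` are assumptions, and it ignores details such as fragment identifiers, `<base>` handling, and HTML comments inside script elements.

```python
import json
from html.parser import HTMLParser


class _JsonLdScriptCollector(HTMLParser):
    """Collects the text of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # Ignore media type parameters such as ";profile=...".
            raw_type = dict(attrs).get("type") or ""
            if raw_type.split(";")[0].strip().lower() == "application/ld+json":
                self._in_jsonld = True
                self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract(html_text):
    """Return the parsed JSON-LD script elements found in an HTML document."""
    collector = _JsonLdScriptCollector()
    collector.feed(html_text)
    collector.close()
    return [json.loads(block) for block in collector.blocks]
```

The result is plain JSON that could then be handed to expansion, compaction, or framing, so a processor that never sees HTML would not need this dependency at all.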

@dlongley
Contributor

I think @BigBlueHat is absolutely right here. We need to keep things layered in a sensible way.

@gkellogg
Member

As I said, we can further extract some of the bits into the HTML Content Algorithms section and do something like you expect with an extract() call. But, I think the issue isn't so much about how distributed the text is, but how normative the requirement is.

Perhaps adding a preface that describes a "JSON-LD in HTML" conformance level and reinforcing that where the algorithm is invoked, in the HTML Content Algorithm sections, and moving the relevant tests out of expand/compact manifests and into separate html manifest would satisfy your concern?

The changes to the syntax (and framing) sections should be a bit simpler.

@iherman
Member

iherman commented Apr 27, 2019

Along the lines of what @gkellogg was referring to: we had a somewhat similar problem in RDFa regarding what types of capabilities an RDFa processor may have, see, e.g., RDFa Core Accessing the Processor Graph. That section also includes several classes of RDFa Processors.

Translating this to our context, this means that we may introduce 'classes' of JSON-LD processors:

  • The whole deal (I believe this should be the default)
  • No HTML content (but I believe this should mean no processing of embedded JSON-LD in general, not separating a @context)
  • Streaming processor (although this may be only informative, because we will not yet provide a 100% foolproof specification on what that means)

I am not sure it is necessary to have a standard way of declaring the processor class or just to leave it to the processor.

I would also think that a 'whole deal' should be the default case, i.e., unless a processor clearly declares to be a no-HTML content then it should be such a default.

@iherman
Member

iherman commented Apr 27, 2019

This issue was discussed in a meeting.

  • No actions or resolutions
Rob Sanderson: TOPIC Issue: context response as HTML
Rob Sanderson: link: #66
Rob Sanderson: what happens if you deref. a context and you get back HTML
… currently it’s an error
… however we opened the door of dealing with HTML within json-ld
… is it justified though?
Gregg Kellogg: we did look to this as part of our solution on how to document json-ld
… I created an example as part of one of the pull requests
… what does this mean in terms of processing
… you get HTML turn that into JSON and pass that to the processor
… where that didn’t work was with contexts
… related to WebIDL section of the api spec that discusses framing (?)
Rob Sanderson: having a context that self-documents with HTML seems like a potential benefit
… if you don’t want to have HTML then just use content negotiation to request json only
Benjamin Young: web of things/credentials/etc. more than likely won’t ship with dom parsers
… if there’s a way of making html parsing more modular
… or at least not making it a requirement, would be preferred
Dave Longley: +1 to everything bigbluehat just said
Ivan Herman: if I don’t care about the implementation, and that I can use json-ld with HTML
… users might be confused when they can actually use json-ld+html
Benjamin Young: if there’s any req. that a context stays the same (e.g. for hashing or what not) then you would probably not want to have it in HTML
… verifiable claims would probably freak out if they have to include a dom parser too
… if there’s a way to achieve this in the api spec, then +1!
Dave Longley: perhaps we can have a conformance class around document loaders and push everything that way—and say that you have to use a document loader that understands JSON-LD in HTML if you want to support that, and then we don’t need two specs, just more conformance classes around document loaders (maybe)
Gregg Kellogg: wondering whether we can actually pull the HTML part out into an own spec, or define a profile for it
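The content negotiation option mentioned in the minutes ("just use content negotiation to request json only") could be sketched like this. The function names, the exact `Accept` value, and the decision to reject anything that is not JSON are assumptions made for illustration; whether a server honors the header depends on its configuration.

```python
from urllib.request import Request, urlopen

# Assumed preference order: JSON-LD first, generic JSON as a fallback.
JSON_ACCEPT = "application/ld+json, application/json;q=0.9"


def build_context_request(url):
    """Build a request that asks the server for a JSON representation only."""
    return Request(url, headers={"Accept": JSON_ACCEPT})


def is_json_media_type(media_type):
    """True for the media types this sketch treats as a JSON context document."""
    return media_type in ("application/ld+json", "application/json")


def fetch_context_document(url, timeout=10.0):
    """Fetch a context document, refusing non-JSON responses such as HTML."""
    with urlopen(build_context_request(url), timeout=timeout) as response:
        media_type = response.headers.get_content_type()
        if not is_json_media_type(media_type):
            raise ValueError(f"expected JSON, got {media_type!r} from {url}")
        return response.read().decode("utf-8")
```

A client written this way never needs a DOM parser, at the cost of failing when a server only offers the HTML form of the context.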

@BigBlueHat
Member Author

BigBlueHat commented Apr 29, 2019

Sorry! Someday they'll move that button... 😖

@dlongley
Contributor

Perhaps adding a preface that describes a "JSON-LD in HTML" conformance level and reinforcing that where the algorithm is invoked, in the HTML Content Algorithm sections, and moving the relevant tests out of expand/compact manifests and into separate html manifest would satisfy your concern?

This sounds like the right direction to me.

@dlongley
Contributor

@iherman,

...unless a processor clearly declares to be a no-HTML content then it should be such a default.

I suspect jsonld.js core will not have any HTML support. Our library is written such that an HTML document loader could be written and used in conjunction with the library, however.
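As a rough sketch of that layering (in Python rather than JavaScript, and not the jsonld.js API), a core that only understands JSON can accept a pluggable document loader, and HTML support can live in an optional loader built on top. All names here (`Fetcher`, `json_only_loader`, `html_aware_loader`) are hypothetical, `fetch` stands in for an HTTP client returning a `(media type, body)` pair, and taking the first extracted script element is an arbitrary choice for the example.

```python
import json
from typing import Callable, List, Tuple

# Stand-in for an HTTP client: maps a URL to (media type, body).
Fetcher = Callable[[str], Tuple[str, str]]
Loader = Callable[[str], object]

JSON_TYPES = ("application/ld+json", "application/json")


def _parse_json(url: str, media_type: str, body: str) -> object:
    if media_type not in JSON_TYPES:
        raise ValueError(f"unsupported media type {media_type!r} for {url}")
    return json.loads(body)


def json_only_loader(fetch: Fetcher) -> Loader:
    """Loader for a core processor with no HTML support."""
    def load(url: str) -> object:
        media_type, body = fetch(url)
        return _parse_json(url, media_type, body)
    return load


def html_aware_loader(fetch: Fetcher, extract: Callable[[str], List[object]]) -> Loader:
    """Optional loader that also accepts HTML, using a caller-supplied extractor."""
    def load(url: str) -> object:
        media_type, body = fetch(url)
        if media_type == "text/html":
            blocks = extract(body)
            if not blocks:
                raise ValueError(f"no JSON-LD found in HTML from {url}")
            return blocks[0]  # assumption: use the first script element
        return _parse_json(url, media_type, body)
    return load
```

With this split, a deployment that cannot ship an HTML parser uses `json_only_loader` and treats an HTML response as an error, while others opt in by supplying an extractor.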

@gkellogg
Member

@BigBlueHat @dlongley @iherman I added a section on processor levels and added text around parts of the algorithm which manipulate HTML to use them. (Also added similar text to w3c/json-ld-syntax#167). See if this satisfies your concerns. If so, I can split out tests into a separate context.

@azaroth42
Contributor

Closing, done. The levels discussion has happened in the syntax issue.

gkellogg added a commit that referenced this issue Aug 13, 2019
gkellogg added a commit to w3c/json-ld-syntax that referenced this issue Aug 13, 2019
gkellogg added a commit to w3c/json-ld-syntax that referenced this issue Aug 13, 2019
gkellogg added a commit that referenced this issue Aug 16, 2019
gkellogg added a commit to w3c/json-ld-syntax that referenced this issue Aug 16, 2019