Issue 66 html context #83

gkellogg · 2019-04-25T00:35:55Z

Allow contexts to be HTML documents, with preference towards script elements of type application/ld+json;profile=http://www.w3.org/ns/json-ld#context.

For #66.

gkellogg · 2019-04-25T00:37:08Z

Note that I didn't add compact tests, as the compact tests take the extracted context, and not a URL. All the important paths are checked here.

We'll want to add text to allow frame documents to be HTML too, for the same reasons of documentation.

rubensworks · 2019-04-25T06:39:08Z

Is there a particular reason to only take the first context that was found in the HTML?
If I understand the PR correctly,
a processor would only extract the context from the first script tag in the following:

<html>
<head>
  <script id="context" type="application/ld+json;profile=http://www.w3.org/ns/json-ld#context">
    {
      "@context": {
        "term3": "http://example.com/term3",
        "term4": "http://example.com/term4",
        "term5": "http://example.com/term5"
      }
    }
  </script>
  <script id="context" type="application/ld+json;profile=http://www.w3.org/ns/json-ld#context">
    {
      "@context": {
        "term4": "http://example.com/term4",
        "term5": "http://example.com/term5"
      }
    }
  </script>
</head>
</html>

In situations where Web designers are generating their views from multiple templates,
and different contexts may be contained in different templates,
it may make sense to allow multiple contexts in HTML.

Semantically, this could be processed similarly to context arrays.
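To make the two options concrete, here is a minimal sketch, not the PR's actual algorithm text. It assumes a processor scans script elements in document order, prefers the profiled type, and falls back to the plain `application/ld+json` type. The names `ScriptCollector`, `first_context`, and `all_contexts` are hypothetical.

```python
import json
from html.parser import HTMLParser

CONTEXT_TYPE = "application/ld+json;profile=http://www.w3.org/ns/json-ld#context"
PLAIN_TYPE = "application/ld+json"


class ScriptCollector(HTMLParser):
    """Collect (type, text) pairs for JSON-LD script elements, in document order."""

    def __init__(self):
        super().__init__()
        self.scripts = []   # list of (type attribute, raw script text)
        self._type = None   # type of the JSON-LD script currently open, if any
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # Drop spaces so "application/ld+json; profile=..." also matches.
            script_type = (dict(attrs).get("type") or "").replace(" ", "")
            if script_type in (CONTEXT_TYPE, PLAIN_TYPE):
                self._type = script_type
                self._chunks = []

    def handle_data(self, data):
        if self._type is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._type is not None:
            self.scripts.append((self._type, "".join(self._chunks)))
            self._type = None


def _collect(html):
    collector = ScriptCollector()
    collector.feed(html)
    return collector.scripts


def first_context(html):
    """Single-context behaviour: first profiled script, else first plain one."""
    scripts = _collect(html)
    preferred = [text for kind, text in scripts if kind == CONTEXT_TYPE]
    fallback = [text for kind, text in scripts if kind == PLAIN_TYPE]
    candidates = preferred or fallback
    if not candidates:
        raise ValueError("no JSON-LD script element found")
    return json.loads(candidates[0])["@context"]


def all_contexts(html):
    """Suggested alternative: every profiled context, as a context array."""
    return [json.loads(text)["@context"]
            for kind, text in _collect(html) if kind == CONTEXT_TYPE]
```

With the two-script document above, `first_context` would return only the first definition, while `all_contexts` would return a two-element list that a processor could treat like a context array.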

gkellogg · 2019-04-25T14:37:16Z

It is certainly something to consider, but if we fall back to using the profile-less type, I don’t see us taking them all, just the first, so it might seem inconsistent. People should 👍 or 👎 your suggestion.

gkellogg · 2019-04-25T21:10:32Z

I tried to implement this, and ran into problems. One problem is that several previously passing tests now fail, because normative text requires that a remote context be a top-level object, not an array.

From Context Processing Algorithm step 3.2.4:

If the dereferenced document has no top-level dictionary with an @context member, an invalid remote context has been detected and processing is aborted; otherwise, set context to the value of that member.

I don't think it's worth softening or confusing this text without a compelling use case.
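The constraint quoted from step 3.2.4 can be illustrated with a small sketch. It assumes the dereferenced document has already been parsed into Python values; `InvalidRemoteContext` and `context_from_remote` are hypothetical names standing in for the spec's "invalid remote context" error and the algorithm step.

```python
class InvalidRemoteContext(Exception):
    """Stands in for the spec's "invalid remote context" error condition."""


def context_from_remote(document):
    # The dereferenced document must be a top-level dictionary (a dict here)
    # with an @context member; a top-level array or scalar is rejected.
    if not isinstance(document, dict) or "@context" not in document:
        raise InvalidRemoteContext(
            "no top-level dictionary with an @context member")
    return document["@context"]
```

Under this rule, a list of contexts gathered from several script elements would be rejected at the top level, which matches the failure described above.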

dlongley · 2019-04-29T16:24:46Z

I can see this approach being fraught with problems, mostly around dynamically generated script tags. Many (most?) people don't build static sites anymore -- which is why the search engines have adapted to run JS to do things like read JSON-LD on the page. Are we going to require the same of JSON-LD processors?

If not, we're relying on a pretty specific scenario under which this will work: A website that serves a static page or at least one where the first script tag with an @context is static. Is the assumption that the people using this feature will do this? Is this a safe/fair assumption?

It seems to me like this might be a feature that's too targeted for the Web of yesterday ... or one that requires much more than the (already heavy) HTML parsing lift to function in a way users expect.

gkellogg · 2019-04-29T17:08:24Z

The stated purpose of using HTML files for context (and frame) is to be able to document those files, in which case the JSON-LD script elements can be statically generated (perhaps the HTML is dynamically generated from the JSON-LD).

Otherwise, if the JSON-LD is dynamically generated and inserted into a script tag, you may as well content-negotiate to generate the pure JSON-LD file for a direct response.

We certainly can't anticipate every new web technology that may come about, but we can recommend practices for working with JSON-LD in web pages, particularly when the documentary aspect is primary.

gkellogg · 2019-04-30T01:20:52Z

@BigBlueHat @dlongley @iherman I added a section on processor levels and added text around parts of the algorithm which manipulate HTML to use them. (Also added similar text to w3c/json-ld-syntax#167). See if this satisfies your concerns. If so, I can split out tests into a separate context.

rubensworks · 2019-04-30T06:52:08Z

> Many (most?) people don't build static sites anymore

Static websites are still very much a thing. Many highly popular tools exist for this (Jekyll, used by GitHub Pages; Gatsby; MkDocs; ...), so we should definitely consider this an active domain.

> A website that serves a static page or at least one where the first script tag with an @context is static. Is the assumption that the people using this feature will do this? Is this a safe/fair assumption?

In any case, I don't have a strong preference for allowing multiple contexts, but I could imagine that people writing web pages expect this to be possible. For example, CSS allows multiple stylesheets to be referenced, where each one builds upon the previous one. So logically, a developer may assume that JSON-LD contexts work the same way.

That being said, defining JSON-LD contexts in HTML is a very specific use case, so I don't think multiple contexts in HTML are a must-have feature; they would just be nice to have for developers.

iherman · 2019-04-30T07:08:31Z

> @BigBlueHat @dlongley @iherman I added a section on processor levels and added text around parts of the algorithm which manipulate HTML to use them. (Also added similar text to w3c/json-ld-syntax#167). See if this satisfies your concerns. If so, I can split out tests into a separate context.

I am basically fine with this. The minor thing that bothers me is that the definitions for pure processor & Co. are in a non-normative section, whereas the references to these terms are all in normative sections. Maybe these terms should become normative?

pchampin · 2019-04-30T07:41:32Z

+1 to @iherman about making "pure JSON processor" and "full processor" normative. However, "event-based JSON processor" should probably not be, because:

- we decided during the last call not to fully specify it,
- the current text states that only pure JSON processors may reject HTML documents, so it implicitly requires event-based processors to be able to parse HTML. Not all streaming implementations might want this constraint.
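The rule under discussion, that only a pure JSON processor may reject an HTML document, might look roughly like the sketch below. This is an illustration under assumptions rather than the spec text: `ProcessorLevel`, `UnsupportedDocument`, and `parse_mode` are hypothetical names. Event-based processing is left out because it is not one of the two levels being made normative.

```python
from enum import Enum


class ProcessorLevel(Enum):
    PURE_JSON = "pure JSON processor"
    FULL = "full processor"


class UnsupportedDocument(Exception):
    """Hypothetical error for a document the processor cannot handle."""


def parse_mode(content_type, level):
    """Return "html" or "json" depending on how the body should be parsed."""
    # Ignore parameters such as charset when comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        if level is ProcessorLevel.PURE_JSON:
            # Only a pure JSON processor is allowed to give up on HTML.
            raise UnsupportedDocument(
                "pure JSON processor cannot read text/html")
        return "html"
    return "json"
```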

pchampin · 2019-04-30T07:55:42Z

Thinking more about it, I think that mixing the "pure JSON vs. full" distinction with "event-based" is confusing... especially if two out of three are to be defined normatively. "Event based" is not a "level", it is somewhat orthogonal to the others.

I propose to

- make the "processor levels" section normative (can we mark as normative a subsection of the non-normative Section 1?), possibly by moving it to Section 3: Conformance;
- move the description of "event-based processor" somewhere else – although I don't have a clear idea where...

gkellogg · 2019-04-30T15:26:00Z

I can certainly make the section normative, which will require moving it someplace else. Whether or not the term “event-based JSON Processor” is itself normative, it’s not referenced from any other normative statements. The text can elaborate that the term is non-normative, or I can add a non-normative subsection to describe it.

gkellogg added 2 commits April 24, 2019 16:26

Use [[HTML]] reference, not [[HTML52]].

5e18aff

Allow contexts to be HTML documents, with preference towards script e…

a7f68d9

…lements of type `application/ld+json;profile=http://www.w3.org/ns/json-ld#context`. For w3c/json-ld-syntax#66.

gkellogg requested review from davidlehn and dlongley April 25, 2019 00:35

Add text to define processor levels and only invoke HTML processing f…

5670feb

…or a full processor (or not a pure JSON Processor).

pchampin reviewed Apr 30, 2019

fixed a few typos

edf83f3

gkellogg and others added 5 commits April 30, 2019 14:13

Make processor levels normative.

f25798d

Correct text which can raise an error when retrieving a context as te…

ee6357e

…xt/html and the processor is a pure JSON processor.

Clarify the definition of an event-based JSON Processor.

11553b1

Fix some HTML markup issues.

2c9efe3

the 'processor levels' sections is normative

ccc0ccf

gkellogg merged commit 8cf5a5a into master May 1, 2019

gkellogg deleted the issue-66-html-context branch May 1, 2019 16:55
