Issue 66 html context #83

Merged
merged 9 commits into from
May 1, 2019

Conversation

gkellogg
Member

@gkellogg gkellogg commented Apr 25, 2019

Allow contexts to be HTML documents, with preference towards script elements of type application/ld+json;profile=http://www.w3.org/ns/json-ld#context.

For #66.
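
The selection rule described above could be sketched roughly as follows. This is only an illustration, not the algorithm text from the PR: `extract_context`, `_ScriptCollector` and `_parse_type` are hypothetical names, the profile parameter is assumed to hold a single URI, and the fallback to the first profile-less `application/ld+json` script is an assumption based on the discussion in this thread.

```python
from html.parser import HTMLParser
import json

CONTEXT_PROFILE = "http://www.w3.org/ns/json-ld#context"


class _ScriptCollector(HTMLParser):
    """Collects (type attribute, text) pairs for every script element."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._type = None
        self._chunks = None

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._type = dict(attrs).get("type") or ""
            self._chunks = []

    def handle_data(self, data):
        # Script text may arrive in several pieces, so accumulate it.
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._chunks is not None:
            self.scripts.append((self._type, "".join(self._chunks)))
            self._chunks = None


def _parse_type(value):
    """Split a type attribute into (media type, profile parameter or None)."""
    parts = [p.strip() for p in value.split(";")]
    profile = None
    for param in parts[1:]:
        name, _, val = param.partition("=")
        if name.strip().lower() == "profile":
            profile = val.strip().strip('"')
    return parts[0].lower(), profile


def extract_context(html_text):
    """Return the parsed JSON of the preferred context script, or None."""
    parser = _ScriptCollector()
    parser.feed(html_text)
    parser.close()
    fallback = None
    for type_attr, text in parser.scripts:
        media_type, profile = _parse_type(type_attr)
        if media_type != "application/ld+json":
            continue
        if profile == CONTEXT_PROFILE:
            return json.loads(text)  # preferred: first script with the context profile
        if fallback is None and profile is None:
            fallback = text  # assumed fallback: first profile-less script
    return json.loads(fallback) if fallback is not None else None
```

A script carrying the context profile wins even when a profile-less JSON-LD script appears earlier in the page.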



@gkellogg gkellogg requested review from davidlehn and dlongley April 25, 2019 00:35
@gkellogg
Member Author

Note that I didn't add compact tests, as the compact tests take the extracted context, and not a URL. All the important paths are checked here.

We'll want to add text to allow frame documents to be HTML too, for the same reasons of documentation.

@rubensworks
Member

Is there a particular reason to only take the first context that was found in the HTML?
If I understand the PR correctly,
a processor would only extract the context from the first script tag in the following:

<html>
<head>
  <script id="context" type="application/ld+json;profile=http://www.w3.org/ns/json-ld#context">
    {
      "@context": {
        "term3": "http://example.com/term3",
        "term4": "http://example.com/term4",
        "term5": "http://example.com/term5"
      }
    }
  </script>
  <script id="context" type="application/ld+json;profile=http://www.w3.org/ns/json-ld#context">
    {
      "@context": {
        "term4": "http://example.com/term4",
        "term5": "http://example.com/term5"
      }
    }
  </script>
</head>
</html>

In situations where Web designers are generating their views from multiple templates,
and different contexts may be contained in different templates,
it may make sense to allow multiple contexts in HTML.

Semantically, this could be processed similarly to context arrays.
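
As a rough sketch of that idea (not something the PR implements), the contexts extracted from several script elements could be combined into one context array and handed to a processor as usual. `combine_as_context_array` is a hypothetical helper, and it assumes each script holds an object with an `@context` member.

```python
def combine_as_context_array(documents):
    """Wrap the @context of each extracted document into one context array."""
    contexts = []
    for doc in documents:
        if not isinstance(doc, dict) or "@context" not in doc:
            raise ValueError("each script must hold an object with an @context member")
        ctx = doc["@context"]
        # Flatten nested arrays so the result is a single flat context array.
        contexts.extend(ctx if isinstance(ctx, list) else [ctx])
    return {"@context": contexts}
```

Because the entries of a context array are processed in order, definitions from later script elements would take effect after those from earlier ones.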

@gkellogg
Member Author

It is certainly something to consider, but if we fall back to using the profile-less type, I don’t see us taking them all, just the first, so it might seem inconsistent. People should 👍 or 👎 your suggestion.

@gkellogg
Member Author

I tried to implement this, and ran into problems. One problem is that several tests that previously passed now fail, as normative text requires that a remote context be a top-level object, not an array.

From Context Processing Algorithm step 3.2.4:

If the dereferenced document has no top-level dictionary with an @context member, an invalid remote context has been detected and processing is aborted; otherwise, set context to the value of that member.

I don't think it's worth softening or confusing this text without a compelling use case.
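
A minimal sketch of the quoted check shows why an array of contexts trips it. The names `InvalidRemoteContext` and `context_from_dereferenced` are hypothetical; only the condition itself comes from the quoted step.

```python
class InvalidRemoteContext(Exception):
    """Hypothetical error type standing in for 'invalid remote context'."""


def context_from_dereferenced(document):
    """Apply the quoted rule: require a top-level object with an @context member."""
    if not isinstance(document, dict) or "@context" not in document:
        raise InvalidRemoteContext("no top-level object with an @context member")
    return document["@context"]
```

A document whose top level is a list of objects, which is what collecting several script elements would naturally produce, fails the `isinstance` check and is rejected.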

@dlongley
Contributor

I can see this approach being fraught with problems, mostly around dynamically generated script tags. Many (most?) people don't build static sites anymore -- which is why the search engines have adapted to run JS to do things like read JSON-LD on the page. Are we going to require the same of JSON-LD processors?

If not, we're relying on a pretty specific scenario under which this will work: A website that serves a static page or at least one where the first script tag with an @context is static. Is the assumption that the people using this feature will do this? Is this a safe/fair assumption?

It seems to me like this might be a feature that's too targeted for the Web of yesterday ... or one that requires much more than the (already heavy) HTML parsing lift to function in a way users expect.

@gkellogg
Member Author

The stated purpose of using HTML files for context (and frame) is to be able to document those files, in which case the JSON-LD script elements can be statically generated (perhaps the HTML is dynamically generated from the JSON-LD).

Otherwise, if the JSON-LD is dynamically generated and inserted into a script tag, you may as well content-negotiate to generate the pure JSON-LD file for a direct response.
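
On the client side, content negotiation of this kind might look like the sketch below. It only builds the request and does not send it; `build_context_request` is a hypothetical helper and the exact `Accept` value is an assumption, not something specified in this thread.

```python
from urllib.request import Request


def build_context_request(url):
    """Build a request that prefers pure JSON-LD and tolerates HTML."""
    # Assumed preference order: JSON-LD first, HTML as a lower-priority fallback.
    accept = "application/ld+json, text/html;q=0.5"
    return Request(url, headers={"Accept": accept})
```

A server that generates its JSON-LD dynamically could answer such a request with the JSON-LD document directly, leaving the HTML representation for readers.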

We certainly can't anticipate every new web technology that may come about, but we can recommend practices for working with JSON-LD in web pages, particularly when the documentation aspect is primary.

…or a full processor (or not a pure JSON Processor).
@gkellogg
Member Author

@BigBlueHat @dlongley @iherman I added a section on processor levels and added text around parts of the algorithm which manipulate HTML to use them. (Also added similar text to w3c/json-ld-syntax#167). See if this satisfies your concerns. If so, I can split out tests into a separate context.

@rubensworks
Member

Many (most?) people don't build static sites anymore

Static websites are still very much a thing. Many highly popular tools exist for this (Jekyll (used by GitHub Pages), Gatsby, MkDocs, ...), so we should definitely consider this an active domain.

A website that serves a static page or at least one where the first script tag with an @context is static. Is the assumption that the people using this feature will do this? Is this a safe/fair assumption?

In any case, I don't have a strong preference to allow multiple contexts, but I could imagine that people writing webpages expect this to be possible. For example, CSS allows multiple stylesheets to be referenced, where each next one builds upon the previous one. So logically, a developer may assume that JSON-LD (contexts) work in the same way.

That being said, defining JSON-LD contexts in HTML is a very specific use case, so I don't think multiple contexts in HTML are a must-have feature; they would just be a nice-to-have for developers.

@iherman
Member

iherman commented Apr 30, 2019

@BigBlueHat @dlongley @iherman I added a section on processor levels and added text around parts of the algorithm which manipulate HTML to use them. (Also added similar text to w3c/json-ld-syntax#167). See if this satisfies your concerns. If so, I can split out tests into a separate context.

I am basically fine with this. The minor thing that bothers me is that the definitions for pure processor & Co. are in a non-normative section, whereas the references to these terms are all in normative sections. Maybe these terms should become normative?

@pchampin
Contributor

+1 to @iherman about making "pure JSON processor" and "full processor" normative. However, "event-based JSON processor" should probably not be, because:

  • we decided during the last call not to fully specify it,
  • the current text states that only pure JSON processors may reject HTML documents, so it implicitly requires event-based processors to be able to parse HTML. Not all streaming implementations might want this constraint.
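
The rule referred to in the second bullet could be modelled roughly like this. It is a sketch under assumptions: `ProcessorLevel` and `may_reject` are hypothetical names, only `text/html` is treated as HTML, and the event-based processor is deliberately left out because its status is what is being debated here.

```python
from enum import Enum


class ProcessorLevel(Enum):
    PURE_JSON = "pure JSON processor"
    FULL = "full processor"


def may_reject(level, content_type):
    """Return True when this level is allowed to reject the document as HTML.

    Only the HTML rule is modelled; other reasons to reject are out of scope.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    is_html = media_type == "text/html"  # assumption: only this media type counts
    return is_html and level is ProcessorLevel.PURE_JSON
```

Under this reading, any level other than the pure JSON processor has no licence to reject HTML, which is why an event-based processor would implicitly need to parse it.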

@pchampin
Contributor

Thinking more about it, I think that mixing the "pure JSON vs. full" distinction with "event-based" is confusing... especially if two out of three are to be defined normatively. "Event based" is not a "level", it is somewhat orthogonal to the others.

I propose to

  • make the "processor levels" section normative (can we mark as normative a subsection of the non-normative Section 1?), possibly by moving it into Section 3: Conformance;
  • move the description of "event-based processor" somewhere else, although I don't have a clear idea where...

@gkellogg
Member Author

I can certainly make the section normative, which will require moving it someplace else. Whether or not the term “event-based JSON Processor” is itself normative, it’s not referenced from any other normative statements, but the text can state that the term is non-normative, or I can add a non-normative subsection to describe it.

@gkellogg gkellogg merged commit 8cf5a5a into master May 1, 2019
@gkellogg gkellogg deleted the issue-66-html-context branch May 1, 2019 16:55
5 participants