Make processing of embedded HTML normative #57
Comments
I believe we must do that. Embedded JSON-LD is, currently, the only way schema.org data is used, and there are a number of applications (e.g., Web Publications) where this is the sensible way to go. My take on the questions:
I see two options:
Both approaches provide a clear specification; I am more in favor of No. 1
If we define that for HTTP, then I guess it is necessary to follow that, yes.
I do not think we should go there. Per HTML spec, the
Yes. I believe if, for example, somebody uses |
@iherman I have to disagree a bit about which approach to take for multiple But I can also imagine situations in which (e.g. via CMS action) many So if we do make processing JSON-LD in HTML normative, do we need to offer a mechanism by which one or more (up to all) |
I agree that if an HTML document has multiple script elements, they should all be considered and merged into a common dataset. My own RDFa processor looks for any script element with a type attribute associated with an RDF reader, along with Microdata and RDF/XML, and extracts triples from all of them. The issue about choosing among script tags was surfaced for the use case where the context references an HTML document with embedded JSON-LD script(s). In this case, which one would be used as the context, or would they all be used? |
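To make the "consider every script element" idea concrete, here is a minimal sketch in Python using only the standard library. It is not taken from any existing processor: the class name `JsonLdScriptExtractor` and the function `extract_jsonld` are hypothetical, and the choice to ignore media type parameters when matching the type attribute is an assumption, not something this thread has decided.

```python
import json
from html.parser import HTMLParser


class JsonLdScriptExtractor(HTMLParser):
    """Collects the parsed content of every JSON-LD script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        type_attr = dict(attrs).get("type") or ""
        # Assumption: match on the bare media type and ignore parameters
        # such as ";profile=..." or ";version=...".
        media_type = type_attr.split(";")[0].strip().lower()
        if media_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content is delivered as raw text, possibly in several chunks.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return one parsed JSON value per JSON-LD script, in document order."""
    parser = JsonLdScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents
```

This only locates and parses the scripts; how the resulting documents are combined (RDF merge or otherwise) is the open question in this thread.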
Just off the top of my head, I would be a bit worried about a merge in that situation because at least one of those |
@ajs6f I do sympathize with accepting several scripts, but I am not sure we have a clear story on how we would merge several JSON-LD snippets into one; hence my original proposal of keeping it to one. Would they be like several top-level JSON-LD objects in an array? Is the JSON content simply concatenated as strings? What would the user expect? I am fine accepting several scripts if we have a clear story on this. |
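The two candidate readings raised here can be compared with a small sketch. The helper names `combine_as_array` and `concatenation_is_valid_json` are hypothetical and only illustrate the question; nothing here is a decided behaviour. The sketch relies on the fact that a JSON-LD document may be a top-level array of objects, whereas two JSON objects joined as strings are not valid JSON.

```python
import json


def combine_as_array(documents):
    """Treat each script's content as an entry in one top-level array.

    A script that already holds an array contributes its items directly.
    """
    combined = []
    for doc in documents:
        if isinstance(doc, list):
            combined.extend(doc)
        else:
            combined.append(doc)
    return combined


def concatenation_is_valid_json(snippets):
    """Check whether the raw script texts, joined as strings, still parse."""
    try:
        json.loads("".join(snippets))
        return True
    except json.JSONDecodeError:
        return False
```

Under these assumptions, plain string concatenation of two object snippets fails to parse, so it would need extra rules, while the array reading keeps each object (and its own @context) intact.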
I guess what you do is to merge these as RDF Graphs. This is also what I do in my RDFa+microdata processor. We can of course do that for several scripts, too, but I am a bit concerned about whether this would work for our user audience... |
@iherman You make a good point. For instance documents, we can indeed go to RDF merge, but contexts... have to think about that! 🤔 |
I've actually fielded Linter issues because of the automatic creation of many (hundreds of) JSON-LD scripts in a document; I needed to encourage them to consolidate, but yes, it can happen for SEO. |
@iherman for the record, JSON-LD is the recommended way, but Google (at least) supports RDFa and Microdata for Schema.org extraction: https://developers.google.com/search/docs/guides/intro-structured-data#structured-data-format Additionally, Bing only recently (this past quarter) added JSON-LD support, but prior to that processed both RDFa and Microdata (afaik). Lastly, Open Graph Protocol is popular with sites targeting "social embedding" on Facebook, LinkedIn, etc (it's even in use on this page). Consequently, I'd love to explore a WG Note (or some such) that helps resolve some of the vagueness around mixing these things together (which happens often). |
Ignoring (for now) the inherent risks of depending on embedded JSON-LD for storing (and extracting) a context expression from within HTML, we could "upgrade" the https://www.w3.org/ns/json-ld#context string from being defined only as a link relationship (as it is currently) and expand it to include using it as a <script type="application/ld+json;profile=https://www.w3.org/ns/json-ld#context">
{"@context": {}}
</script> That could have interesting potential use in future markup-based graph expressions also: one can imagine an RDFa 2.0 which could lean on JSON-LD based contexts so that any expressed graph content maps to the same names throughout the document. But now I'm probably daydreaming. 😁 |
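If a script's type attribute carried the context URI as a profile parameter, as in the markup above, a processor could pick out the context-bearing script with something like the following sketch. This is purely illustrative of the proposal: `parse_media_type` and `is_context_script` are hypothetical helpers, and allowing several space-separated profile URIs in one quoted value is an assumption.

```python
CONTEXT_PROFILE = "https://www.w3.org/ns/json-ld#context"


def parse_media_type(value):
    """Split a type attribute into (media type, parameter dict)."""
    parts = [part.strip() for part in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        if sep:
            # Strip optional surrounding quotes from the parameter value.
            params[name.strip().lower()] = val.strip().strip('"')
    return media_type, params


def is_context_script(type_attr):
    """True if the type attribute marks a JSON-LD script as a context."""
    media_type, params = parse_media_type(type_attr)
    # Assumption: a profile value may list several URIs separated by spaces.
    profiles = params.get("profile", "").split()
    return media_type == "application/ld+json" and CONTEXT_PROFILE in profiles
```

A processor dereferencing a context URL that returns HTML could then use only the scripts for which `is_context_script` holds, which would sidestep the "which script is the context?" question without constraining ordinary data scripts.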
Currently, Embedding JSON-LD in HTML Documents is entirely informative. We've discussed making this normative, requiring JSON-LD processors to be able to identify and extract JSON-LD from a script tag with type application/ld+json within the HTML document.
application/ld+json;version=1.1)
html>head>base@href
xml:base of closest ancestor element
Content-Language
@lang, @xml:lang