Should rel="alternate" keep the base IRI of the original document? #139

Closed
pchampin opened this issue Aug 23, 2019 · 5 comments · Fixed by #142
Comments

@pchampin
Contributor

This issue was first raised in a comment to #133, which was unfortunately not located on the right lines of the patch. It should have been about line 5833.

To summarize, PR #133 states that, when dereferencing a URL yields non-JSON-LD content, a JSON-LD processor must follow a Link HTTP header with rel="alternate" and type="application/ld+json", if available. This is perceived as some kind of "client-side content management", and I am very happy with it.

My concern is about line 5833, which states that the processor must then behave as if the target of the Link had been directly retrieved from the original URL (i.e. the source of the Link).

So if I get text/html from http://example.org/, with a Link to https://example.org/data/context.jsonld, and the latter contains "@id": "#foo", this should be interpreted as http://example.org/#foo, rather than https://example.org/data/context.jsonld#foo.
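
To make the two readings concrete, here is a minimal sketch using Python's standard `urllib.parse.urljoin`, which implements RFC 3986 reference resolution. The URLs are the ones from the example above; the variable names are only illustrative.

```python
from urllib.parse import urljoin

# URLs taken from the example above.
ORIGINAL_URL = "http://example.org/"                        # returned text/html plus a Link header
LINK_TARGET = "https://example.org/data/context.jsonld"     # target of the rel="alternate" Link
relative_id = "#foo"                                        # value of "@id" in the linked document

# Reading 1 (line 5833 as currently written): base is the source of the Link.
against_original = urljoin(ORIGINAL_URL, relative_id)

# Reading 2 (what an ordinary HTTP client would do after an explicit fetch).
against_target = urljoin(LINK_TARGET, relative_id)

print(against_original)  # http://example.org/#foo
print(against_target)    # https://example.org/data/context.jsonld#foo
```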

If that was transparent content negotiation (i.e. the client never explicitly fetches the second URL), then that would be OK. But as there is an explicit fetch of the second URL, this is more akin to a 303 redirect, or more precisely, to content negotiation + 303 redirect as used by many RDF sources (such as DBpedia) and described by Jeni Tennison.

And as far as I can tell, there is no precedent where relative IRIs after a redirect are interpreted with the base of the source (i.e. as if the redirect had not happened). While I see how it makes sense from the point of view of semantics, I'm concerned that this would make JSON-LD processors behave differently from other HTTP clients.

@BigBlueHat
Member

Good thoughts, @pchampin. There's also a potential confusion around multiple initial URLs using this rel="alternate" approach to result in the same context document.

If we keep the original request URL as the (initial) base value, then one might end up with one context document (processed from different originating URLs) outputting very different IRIs for its terms.

Think http://example.com/schema/ vs. http://example.com/schema/v2/ both of which might use a rel="alternate" pointed at http://example.com/schema/context.jsonld. Without that document using @base or absolute IRIs, the output IRIs would be prefixed with either of those URLs...and never the context document.
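
A rough sketch of that many-to-one situation, again with `urllib.parse.urljoin`. The relative IRI `#Thing` is a made-up placeholder; the three URLs are the ones named above.

```python
from urllib.parse import urljoin

CONTEXT_URL = "http://example.com/schema/context.jsonld"
ORIGINATING_URLS = [
    "http://example.com/schema/",
    "http://example.com/schema/v2/",
]
RELATIVE_IRI = "#Thing"  # hypothetical relative IRI inside the context document

# If the original request URL is kept as base, the same context document
# yields a different absolute IRI per originating URL.
resolved_from_origins = {
    origin: urljoin(origin, RELATIVE_IRI) for origin in ORIGINATING_URLS
}

# If the linked document's own URL is the base, there is exactly one result.
resolved_from_context = urljoin(CONTEXT_URL, RELATIVE_IRI)

for origin, iri in resolved_from_origins.items():
    print(origin, "->", iri)
print(CONTEXT_URL, "->", resolved_from_context)
```

None of the three results coincide, which is the confusion potential in a nutshell.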

So...leaving the Request-URI the same after the rel="alternate" is handled does indeed have risk and confusion potential...

In the end, I think we're running up against something similar to "being vague (or conditional) about base is bad." 😃

@rubensworks
Member

I share this concern. My gut feeling also says that it makes more sense to have the base set to the linked URL.

As this is really about the client following links, the client should IMO not be expected to remember the state of the previous request to determine the base of the next request. As far as I know, there are no cases where documents should be handled differently based on what links were followed to reach it (please tell me if those do exist). This feels very non-RESTful to me.

Building upon @pchampin's example from above, this would mean that the content of https://example.org/data/context.jsonld would mean something different if it was loaded via https://example.org/data/context.jsonld directly, versus when it was loaded via https://example.org/.

@BigBlueHat
Member

> As far as I know, there are no cases where documents should be handled differently based on what links were followed to reach it (please tell me if those do exist). This feels very non-RESTful to me.

Content Negotiation. But that is explicitly serving multiple "equivalent" resource representations, so they should all share the same link semantics, and if they don't, the error is the author's, not the HTTP client's.

@pchampin
Contributor Author

> This feels very non-RESTful to me.

Playing the devil's advocate, in REST, it is the client's responsibility to keep track of the application state, so if anything, this requirement would be quite RESTful.

It is also semantically valid, as the "alternate" resource to which we are pointing is really just another representation of the original resource. In that respect, it is very similar to content negotiation, as @BigBlueHat (and others) already pointed out.

That is why my suggestion is instead to encourage context authors who use the rel="alternate" trick to explicitly set the @base of the JSON-LD document to the "canonical" URL of the context (the one at the origin of the Link). This makes it clear that the "physical" location of the JSON-LD document is an implementation detail, and that way the correct base will be used regardless of how the context document was reached. Explicit is better than implicit ;-)
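
As a rough illustration of that suggestion, assuming a document whose own inline context declares `@base`: the helper `effective_base` below is hypothetical and handles only a single inline context object. A real processor also deals with context arrays, a null `@base`, and ignores `@base` found in remote contexts.

```python
from urllib.parse import urljoin

def effective_base(document: dict, retrieved_from: str) -> str:
    """Hypothetical helper: explicit @base wins over the retrieval URL."""
    context = document.get("@context")
    if isinstance(context, dict) and isinstance(context.get("@base"), str):
        # A relative @base is itself resolved against the retrieval URL.
        return urljoin(retrieved_from, context["@base"])
    return retrieved_from

# Document served from the "physical" location, declaring its canonical base.
doc = {
    "@context": {"@base": "http://example.org/"},
    "@id": "#foo",
}
physical_url = "https://example.org/data/context.jsonld"

absolute_id = urljoin(effective_base(doc, physical_url), doc["@id"])
print(absolute_id)  # http://example.org/#foo, however the document was reached
```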

@iherman
Member

iherman commented Aug 23, 2019

This issue was discussed in a meeting.

  • RESOLVED: Agree with the issue and by default associate base with the redirected URL not the original
  • ACTION: add use absolute URIs to BPs (Benjamin Young)
  • ACTION: add use absolute URIs to BPs (Adam Soroka)
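
The resolved behaviour can be sketched roughly as follows. This is not the spec's document loader algorithm: the `Link` type and `base_after_alternate` function are hypothetical, the Link header is assumed to be already parsed, and only the exact media type `application/ld+json` is treated as JSON-LD.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urljoin

JSONLD = "application/ld+json"

@dataclass
class Link:
    """One already-parsed entry of an HTTP Link header (hypothetical type)."""
    target: str
    rel: str
    type: Optional[str] = None

def base_after_alternate(request_url: str, content_type: str,
                         links: List[Link]) -> Optional[str]:
    """Return the URL to use as default base, or None if nothing is loadable."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == JSONLD:
        # The response itself is JSON-LD: base is the requested URL.
        return request_url
    for link in links:
        if link.rel == "alternate" and link.type == JSONLD:
            # Per the resolution: base follows the linked (redirected) URL,
            # not the original one. A relative target is resolved first.
            return urljoin(request_url, link.target)
    return None

base = base_after_alternate(
    "http://example.org/",
    "text/html; charset=utf-8",
    [Link("https://example.org/data/context.jsonld", "alternate", JSONLD)],
)
print(base)  # https://example.org/data/context.jsonld
```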
Transcript:
Rob Sanderson: API issue #139
Pierre-Antoine Champin: In the current API spec, it says whenever a rel="alternate" link is followed the base URL is set to the original URL. Meaning that relative URLs in a context document should be resolved as if found in the HTML document that is the source of the link.
… I see this partly as content negotiation which would advocate for this but also as a redirection because the client has to explicitly follow a link. I don’t know of such a situation that causes this to happen. I understand the rationale but it has no precedent.
… And it creates a strange exception and that’s my concern.
Dave Longley: as I understand, Gregg thought that was consistent with how things work elsewhere.
… And what we are already doing.
… My position would be: let’s not do something that is exceptional.
… But let’s not break what we are doing now either.
Rob Sanderson: also +1 to dlongley
Rob Sanderson: and to pchampin / bigbluehat about setting default @base
Benjamin Young: +1 to what Dave just said but it’s also about setting the default base. Maybe we can follow this up with a note around it. Context authors should be explicit about setting their own base or using URLs for their contexts because there are a host of situations where base is flaky.
… Publishing a context where you’re not being explicit is dangerous anyway already – and these things, certainly both of them combined, make it much less dependable. It could come with a declaration that context authors need to be explicit about their base or use full URLs.
Adam Soroka: +1 to what Dave and Ben said. If I understand correctly we’re talking about a link between two resources, such that you traverse that link you’re going to interpret base with the original document. You could in theory interpret with a different base – that’s not some kind of world is going to fall, but it opens the door to a lot of potential confusion. Especially in complex systems where you get a lot of reuse.
Ivan Herman: I come back to what Benjamin said – if there is an @base in the @context document doesn’t that set the base that refers to the document.
… If it does, then we shouldn’t do that.
Benjamin Young: The @base only works for the document it’s in, not in the @context, only works in an inline context, not for a remote context.
Ivan Herman: Ok.
Benjamin Young: It’s not about removing rel="alternate" it’s about not setting base to the original request URI but setting it to the redirect meaning rather than the original one.
Dave Longley: in schema.org, I think that all the URLs are explicit. I’m pretty sure they would want their base URL be http://schema.org, not any URL they redirect to.
… Let’s make sure we are not shooting ourselves in the foot.
Ivan Herman: schema.org uses only absolute URLs.
Benjamin Young: Part of this closing proposal – should we add something in the NOTE or something … that context authors should always be explicit with their URLs because it’s risky at best.
Ivan Herman: Whoever is the editor of the best practices document should make a note.
Action #4: add use absolute URIs to BPs (Benjamin Young)
Action #5: add use absolute URIs to BPs (Adam Soroka)
Proposed resolution: Agree with the issue and by default associate base with the redirected URL not the original (Rob Sanderson)
Rob Sanderson: +1
Ivan Herman: +1
Ruben Taelman: +1
Dave Longley: +1 (hopes this is the right decision)
Pierre-Antoine Champin: +1
Benjamin Young: +1
Resolution #3: Agree with the issue and by default associate base with the redirected URL not the original
Adam Soroka: +1
