avoid constraining HTTP #51

Closed
dret opened this issue Jul 7, 2015 · 29 comments

@dret
Member

dret commented Jul 7, 2015

http://www.w3.org/TR/2015/WD-annotation-protocol-20150702/#http-requirements has MUSTs in there that try to constrain HTTP servers. this is not something that HTTP servers reasonably can be required to do. more specifically, clients should have no knowledge of "specific servers" anyway; they simply follow links and interact via HTTP to accomplish application goals. they may interact with one or various "specific servers" along the way, and the web thrives because clients are not tightly coupled to specific servers. clients send self-contained HTTP requests and then have to handle requests individually. no assumptions should be made that go beyond the single request/response scope.
for example, the WD says "All supported methods for interacting with the Annotation Container MUST be advertised in the Allow header of all responses from the container." this constrains HTTP which defines a MAY (http://tools.ietf.org/html/rfc7231#section-7.4.1), and it does not accomplish anything because in the end, clients can try any method and servers can change their minds between request/response interactions. so in the end maybe "Allow" can be a helpful hint, but it's optional, not reliable, and clients still have to deal with servers responding with 405s (which per HTTP spec then MUST have "Allow").
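A minimal sketch of how a client might treat `Allow` as an optional hint, in Python. The function names `parse_allow` and `worth_trying` are hypothetical, and the policy shown (skip a method only when the header is present and omits it) is one possible client choice, not something either spec mandates.

```python
from typing import Optional, Set


def parse_allow(value: Optional[str]) -> Optional[Set[str]]:
    """Return the advertised methods, or None if no Allow header was sent."""
    if value is None:
        return None
    # Allow is a comma-separated list; method names are case-sensitive.
    return {token.strip() for token in value.split(",") if token.strip()}


def worth_trying(method: str, allow_value: Optional[str]) -> bool:
    """Hypothetical client policy: a missing Allow header says nothing."""
    advertised = parse_allow(allow_value)
    if advertised is None:
        return True  # HTTP only says MAY, so absence is not a refusal
    return method in advertised


print(worth_trying("POST", None))          # no header: still try
print(worth_trying("POST", "GET, HEAD"))   # advertised list omits POST
print(worth_trying("POST", "GET, POST"))   # advertised, but only a hint
```

Even when `worth_trying` returns True, the request can still come back with a 405, so the response handling has to cover that case.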

@tilgovi
Contributor

tilgovi commented Jul 7, 2015

👍

@azaroth42
Collaborator

👎

The point of the protocol is to provide additional constraints to make development easier, by making such probes unnecessary. The protocol document does not constrain Web Servers that implement HTTP, it constrains Annotation Servers that implement the specification. Without additional constraints there's no need for a protocol document at all, we would just reference the HTTP RFCs and be done with it.

Most of our requirements are inherited from http://www.w3.org/TR/ldp/ which, admittedly, has weaseling along these lines:

Per [RFC7231], this HTTP method is optional and this specification does not require LDP servers to support it. When a LDP server supports this method, this specification imposes the following new requirements for LDPCs.

(From section 5.2.3)
We could also do that, but it seems redundant and unhelpful. What use is an Annotation Server that doesn't support the creation of Annotations?

@dret
Member Author

dret commented Jul 7, 2015

LDP does not weasel, it's doing things right. i put quite a bit of effort into trying to explain to the group that instead of defining a new protocol (and thus tightly coupling clients and servers speaking that particular dialect), a much better way to go is to use HTTP as the application protocol. that's pretty much all that REST in its web flavor is about. clients shouldn't make any server-specific assumptions or have to know about them. if you're going this way, you're losing the webbyness of the web. simply document the concepts that clients should know about (media types, header fields, link relations, and so forth), and that's your "protocol".
https://github.com/dret/sedola is the (admittedly pretty incomplete so far) attempt to support this style of "service description": it allows services to publish inventories of web-level concepts (so far: media types, header fields, link relations), and that's all that's needed. the protocol is HTTP.
if you want to publish best practices for implementations, that's a different thing, and then you can say that implementations SHOULD always add "Allow", because that's a nice thing to do. but that's a very different thing from defining a protocol that's defining the rules that clients and servers must live by, or things will break.
and as i said, you don't get anything out of this anyway, other than tight coupling and brittleness. clients still have to deal with 405, since by definition there is nothing bigger in scope than a single request/response interaction. clients still have to assume that in between receiving an "Allow", and sending a request using a listed method, the server may have changed its mind and no longer support the method. you simply cannot change this aspect of HTTP.
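The point about single request/response scope can be sketched as follows. This assumes a hypothetical `send` callable that performs the HTTP request and returns a status code plus headers; a real client would plug in its HTTP library here. The stand-in server and the outcome strings are illustrative only.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str]]


def try_post(send: Callable[[str, str], Response], url: str) -> str:
    """Attempt a POST and judge the outcome from this response alone."""
    status, headers = send("POST", url)
    # Header field names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 405:
        # A 405 response carries Allow, describing the methods at this moment.
        return "refused; currently allowed: " + lowered.get("allow", "")
    if 200 <= status < 300:
        return "accepted"
    return "other status: " + str(status)


def server_that_changed_its_mind(method: str, url: str) -> Response:
    """Stand-in server: an earlier Allow listed POST, but it is now refused."""
    return 405, {"Allow": "GET, HEAD, OPTIONS"}


print(try_post(server_that_changed_its_mind, "http://example.org/annotations/"))
```

The client never consults a remembered `Allow` value here; whatever an earlier response advertised, the 405 branch has to exist.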

@azaroth42
Collaborator

A proposal of the required changes for this issue would be helpful in evaluating it.

@dret
Member Author

dret commented Jul 7, 2015

without having read the whole document, here are two starting points:

  • remove all of 4.1.1 ("use HTTP as the application protocol").
  • if you want to make recommendations for implementations, move them to an informative appendix or a separate document.

i may have time for a more complete review, but i think that only makes sense if there is some alignment in terms of the general approach being taken.

@iherman
Member

iherman commented Jul 8, 2015

I must admit I am not sure I understand all that. If that is the direction we'd go, that means that an annotation client has to implement loads of extra things to ensure that it works with the server as expected. This pushes the load to the client side, which is not what we want.

If, somehow, the client identifies a server to be an annotation server and not just a server in the wild, then it should rely on the restrictions described in 4.1.1. As Rob put it, what is, otherwise, the sense of the whole specification?

There is an 'if', of course, namely how does the client know that a server is not just an ordinary HTTP server, but one that abides by the rules (restrictions :-) of the Annotation Protocol. I am not sure there is a way in the document at this moment.

Ivan

Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

@dret
Member Author

dret commented Jul 8, 2015

@iherman, why would a client want to "identify a server to be an annotation server"? does a browser have to "identify a server to be an (image|script|form) server" that implements a special flavor of HTTP? that would be rather bad and fragment the web. instead, you use HTTP constructs to get the job done, either what's in the vanilla spec using media types that work for you, or you mix in extra parts such as additional HTTP header fields that help to get the job done better. everything else is non-webby, and it would be sad to see a W3C spec go this way. HTTP is the application protocol of a web service, and trying to define "extended subsets" of it is a rather unfortunate anti-pattern.
and yes, i do have a hard time seeing the sense of the whole spec as it is. document the way in which you're using the standard parts of web architecture you're using (media types, header fields, link relations, and so forth), and that's it. if you feel like you need extra parts that don't yet exist, define them in the same way as LDP defines Accept-Post (http://www.w3.org/TR/ldp/#header-accept-post) because the group felt the need to expose that specific information.
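As a sketch of how a client could consume an `Accept-Post` header when one is present, assuming the value is a comma-separated list of media ranges: the helper names are hypothetical, media type parameters are ignored, and quoted parameter values containing commas are not handled.

```python
from typing import List


def accepted_post_types(header_value: str) -> List[str]:
    """Split an Accept-Post value into bare, lower-cased media ranges."""
    return [
        part.split(";", 1)[0].strip().lower()
        for part in header_value.split(",")
        if part.strip()
    ]


def can_post(header_value: str, media_type: str) -> bool:
    """Check a media type against the advertised ranges, wildcards included."""
    wanted = media_type.lower()
    major = wanted.split("/", 1)[0]
    for candidate in accepted_post_types(header_value):
        if candidate in (wanted, "*/*", major + "/*"):
            return True
    return False


print(can_post("text/turtle, application/ld+json", "application/ld+json"))
print(can_post("text/turtle", "application/ld+json"))
```

Like `Allow`, this is information exposed by one response; a later POST can still be rejected and the client has to handle that.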

@iherman
Member

iherman commented Jul 8, 2015

On 08 Jul 2015, at 09:23, Erik Wilde wrote:

@iherman, why would a client want to "identify a server to be an annotation server"? does a browser have to "identify a server to be an (image|script|form) server" that implements a special flavor of HTTP?

I do not think this is a fair comparison. Browsers are, in this sense, general-purpose pieces of software, and their role is to accept and do something with practically any type of data that is accessible through HTTP. As a consequence (although not only for that reason, of course) they are formidably complex pieces of software.

The goal in our case is to strive for simplicity. I.e., an annotation client should know about the annotation-related structures that we define, and nothing else. It should not take years of development effort to do it; it should be simple. That means it does have a very restricted or, if you like, focused knowledge of the world.

Anything that goes on through the Annotation protocol (or LDP protocol, for that matter) can be interpreted by a generic HTTP client/server. And that is fine. But only specific content of the information flowing through is handled by the LDP or the Annotation client in a specific way. What these specifications define is that extra 'constraint' on what goes over the wire to achieve a specific functionality. I do not see what is wrong with that, I must admit.

@tilgovi
Contributor

tilgovi commented Jul 8, 2015

I think we need to talk about specific changes to the document because I'm not convinced there's actually disagreement here.

@tilgovi
Contributor

tilgovi commented Jul 8, 2015

For instance, I think there's some slippage and ambiguity of the term "protocol" around this discussion.

@fhirsch
Contributor

fhirsch commented Jul 8, 2015

+1 to tilgovi on his last two points. Not defining a 'new protocol' rather identifying constraints and usage for annotation. Concrete draft edits would be helpful.

@dret
Member Author

dret commented Jul 8, 2015

@iherman, the question is whether you want this to be a web-level thing, or something that allows implementations to take shortcuts that are not allowed per HTTP. http://tools.ietf.org/html/rfc5023#section-4.4 (and actually all of the RFC) is a good blueprint: in the end it mostly documents the media type, and then says that anything that's allowed by HTTP is fair game and must be properly dealt with by clients (and i think this is what @fhirsch is asking for). that's how you become a part of the web. if you specifically allow implementers to take shortcuts that are not in line with the web, then you tightly couple implementations.
why would you want to assume that there even is such a thing as an "annotation server", and clients should be able to find out that they are talking to such a thing? why wouldn't publishers set up servers that happily serve AtomPub, annotations, LDP, and who knows what else, and all they would have to do is make sure that this is a well-behaving HTTP server serving those resources with the appropriate behavior? when you make special rules, you fragment the web, and that would be a sad thing to do.
developing a client really does not take that much effort. there are HTTP libraries in every conceivable language, and you use those parts that are meaningful to you, and handle them according to HTTP.
@tilgovi, without trying to start a philosophical debate here: a protocol as i use the term is a set of conventions that allows peers to interact to accomplish some goal. the single greatest aspect of the web is that everybody on the web speaks the same protocol, so i never need to care who i am talking to, as long as they are speaking HTTP. if a spec constrains HTTP and expects servers to always follow those non-HTTP rules, then clients start taking shortcuts, and then those clients will break when somebody wants to use a standard HTTP server to serve the protocol. you have then effectively partitioned the web, and i think you have already discovered that when you tried to come up with a way to "discover" that a server is a special annotation server.
all i can say is: don't do it. it's bad for the web, and bad for what you're trying to do.

@tilgovi
Contributor

tilgovi commented Jul 8, 2015

I think the basic point @dret is making is that even if a server doesn't advertise support for certain things the client could still try throwing things at it and they might stick.

But also, there seems to be a question we haven't answered here. I think one answer motivates the MUSTs.

Must an annotation container only contain annotations?

Even if the answer is "yes" then I think SHOULD might be more appropriate. If the answer is "no" then the MUST is definitely not appropriate.

@BigBlueHat
Member

@dret I posted a longer form version of this to the Web Annotation list directly, but thought I'd recap here.

It seems the misstep here is that we're requiring write semantics in a way that HTTP does not and essentially turning Allow into a greater expectation by the client than is healthy.

In which case, @tilgovi's suggestion of switching from MUST to SHOULD in 4.1.1 should do the trick.

Does that sound correct, @dret? Would that allay your concerns?

@dret
Member Author

dret commented Jul 9, 2015

HTTP defines this as a MAY. therefore that's all that clients can expect in an HTTP world. expecting more leads to bad clients that break in an HTTP world, because developers will take shortcuts and write code that incorrectly assumes that taking Allow as a promise is a reasonable thing to do. HTTP says that whatever methods a resource supports MAY be exposed via Allow, and that's enough. there is no way clients can depend on this information anyway; they still have to be coded in a way that can deal with 405 responses. if you encourage clients to do something else, you encourage the development of broken clients.
it's like @azaroth42 said, in theory the "protocol" could simply point to HTTP, and that's all that would be required. in practice, it can be helpful to have a document that explains everything in context, so that developers can understand which concepts of HTTP, which media types, and which additional aspects of the web they should take into account when dealing with the specific service. but that's really not more than a convenience, because in the end, all of the moving parts in the service need to be specified somewhere (media types including processing models and so forth), and putting those in context in a single document is simply helpful.
as a side note: personally, i have started calling the media type the protocol, because it defines the ways in which peers have to process and understand data, and in which they can engage in protocol conversations (by following links). this is not entirely correct, because technically speaking, HTTP is the actual application protocol (and the only thing that is required to engage in successful individual interactions), but it is the protocol at a slightly higher level when it comes to conversations that go beyond single request/response interactions, and talking about how clients might want to achieve certain application goals.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

I think what is needed is some concrete edits. I don't object to anything Erik has said. I certainly see the value in making weaker statements as a way to encourage authors to code defensively against broken promises.

@dret
Member Author

dret commented Jul 9, 2015

(me being a broken record): on the web, not setting Allow is not a "broken promise". it's perfectly acceptable.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

You're also being pedantic. I'm trying to argue for your edits Erik, such as I imagine them. Help me out.

Part of your argument is about broken promises:

there is no way clients can depend on this information anyway; they still have to be coded in a way that can deal with 405 responses

If that's not about broken promises, then I have no idea what you're saying.

I understand it's not only about that. But it is also about that.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

A little more, "yes, and" and a little less broken record, please. :)

@dret
Member Author

dret commented Jul 9, 2015

i did not want to criticize your last comments, @tilgovi. i just wanted to say that i would avoid using language that implies that clients should make any assumptions beyond what HTTP defines. i think we're in violent agreement here.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

I think so, too :). Sorry if I got defensive.

@tilgovi
Contributor

tilgovi commented Jul 9, 2015

@azaroth42 what's the best way to suggest concrete edits? PR? Annotation?

@azaroth42
Collaborator

At this stage, I think annotations would be perfect if possible. My concern with a PR is that some edits might be more acceptable to the WG than others, and we can tick them off more easily as replies to annos than lots of little PRs or issues.

If only we had a way to create an edit annotation that we could accept and it auto-generated a git PR for the replacement ... ... ... 😸

@iherman
Member

iherman commented Jul 10, 2015

@dret, I must admit I am still lost, although I try to understand what you say.

Taking this discussion into account, I looked at the LDP spec with fresh eyes. I see the repeated pattern (as mentioned by @azaroth42) saying

When the LDP server supports this method, this specification imposes new requirements for LDPCs

but, I must admit, I still miss some niceties here. I think I get that you do not want to impose, e.g., the existence of a specific verb (say, POST), so you surround it with this statement. But, nevertheless, the LDP specification does have a whole series of SHOULDs and MUSTs in, say, section 4.2.4, which reads to me as a restriction on an LDP server, specifying, beyond the corresponding RFCs, e.g., what the response should look like (if that specific method is available, that is). Admittedly less stringent than what we have in the protocol draft, but restrictions nevertheless.

I think the only way to move forward is really to see some examples for the kind of changes you'd like to see and see what the consequences would be for an annotation client. I must admit, on a very pragmatic level, if the development of a client would become a factor more complicated as a result, I would be genuinely worried.

(I will be on vacation for a while, though, so I may not be in a position to respond to any replies... sorry about that.)

@dret
Member Author

dret commented Jul 10, 2015

i hope i never created the impression that LDP is something that should be used as a blueprint here. i think that LDP has the same problems of not clearly separating the fact that HTTP is all that clients should have to know about, and what it takes for servers to serve LDP resources.
all i wanted to do is make the group aware of the fact that constraining standards is an anti-pattern when it comes to defining open web standards. if you want to create a separate ecosystem where peers can operate based on a set of rules different from the open web (that you chose to make simpler because you felt that some of the rules of the open web were unnecessarily complicated to follow in your case), then that's an entirely different issue. just be aware of the choice you make, it makes a difference.

@azaroth42
Collaborator

@dret -- Could you have a look at http://w3c.github.io/web-annotation/protocol/wd/ and let us know whether it has solved this issue for you?

As far as I can tell, all of the requirements are now either format related (e.g. we're talking about Annotations, not generic web resources) or inherited from either HTTP (you have to have Vary) or LDP (link headers, paging etc.)

Thanks!

@dret
Member Author

dret commented Aug 27, 2015

i don't want to hold up this work, so i will rest my case. but i am still confused why there even is a "protocol spec" talking about a server, and not just a mediatype/vocabulary spec. for example, right now it seems everything hinges on the fact that everything must be implemented by "the server". what if somebody for whatever reason serves different resources from different servers (physically or by switching domain names in some way)? does that make such an implementation non-conforming because there isn't one server satisfying all requirements? REST would say that such an implementation should be fine, and only broken clients (not following links but instead string-processing URIs) would have problems. but this is a very high-level question and i don't think one that can be discussed productively on github. so from a process point of view, feel free to consider this issue resolved.
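The difference between following links and string-processing URIs can be illustrated with Python's standard `urllib.parse.urljoin`. The hostnames and paths below are made up for the example; the only assumption is that the server hands the client a link (relative or absolute) to the next resource.

```python
from urllib.parse import urljoin


def follow(response_url: str, link_target: str) -> str:
    """Resolve a link against the URL of the response that contained it."""
    return urljoin(response_url, link_target)


container = "https://annotations.example.org/container/"

# A relative link stays on the same server.
print(follow(container, "page2"))

# An absolute link may point at a different server; the client does not care.
print(follow(container, "https://pages.example.net/container/page2"))
```

A client written this way keeps working when resources are spread over several servers, whereas one that builds the next URI by concatenating onto a known hostname would not.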

@akuckartz

👍 (I agree with @dret and think that it generally is good to be pedantic while creating a standard)

@azaroth42
Collaborator

Closing per @dret's most recent comment, and agree that implementations should be able to use multiple servers. Will raise a separate more specific issue regarding LDP and this point.
