Skip to content

net/http: request context cancelled on readtimeout, persists across connection reuse #70834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
johnmaguire opened this issue Dec 13, 2024 · 7 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@johnmaguire
Copy link

johnmaguire commented Dec 13, 2024

Go version

go version go1.23.4 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/jmaguire/Library/Caches/go-build'
GOENV='/Users/jmaguire/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/jmaguire/go/pkg/mod'
GONOPROXY='github.com/DefinedNet'
GONOSUMDB='github.com/DefinedNet'
GOOS='darwin'
GOPATH='/Users/jmaguire/go'
GOPRIVATE='github.com/DefinedNet'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.23.4/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.23.4/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.23.4'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/jmaguire/Library/Application Support/go/telemetry'
GCCGO='gccgo'
GOARM64='v8.0'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/2t/rbxbv8612sq9_rdrxhtw8rzc0000gn/T/go-build1060450768=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

This code demonstrates an issue in Go HTTP/1.1 request handling when the following conditions are true:

  • An incoming request has an empty body
  • A ResponseController is created and SetReadDeadline is called
  • The time spent processing the request exceeds the read deadline

When these conditions are true, the HTTP server will erroneously cancel the request context. Additionally, a substantial percentage of future requests made on this HTTP connection will have their context immediately canceled.

To reproduce the issue, run the script found here: https://go.dev/play/p/wiX9Fyt2S2s

What did you see happen?

The expected result is that both requests return 200 OK. Instead, both requests return 408 Request Timeout.

What did you expect to see?

We expect a 200 OK for the first request because the docs for func (*ResponseController) SetReadDeadline state:

SetReadDeadline sets the deadline for reading the entire request, including the body. Reads from the request body after the deadline has been exceeded will return an error. A zero value means no deadline.

Setting the read deadline after it has been exceeded will not extend it.

Since the request contains no body to read, the read deadline should not have been exceeded.

A secondary problem is that when HTTP keep-alive is enabled, future requests made against this connection (which is not closed by the server) will also be immediately canceled when processing starts.

In the proof of concept, we expect a 200 OK for the second request not only because is there no body to read, but this request uses a fast endpoint which returns well within the read deadline. Instead, the context is canceled as soon as we start processing the request. None of the conditions listed under the docs for func (*Request) Context are true:

For incoming server requests, the context is canceled when the client's connection closes, the request is canceled (with HTTP/2), or when the ServeHTTP method returns.

While we have not root-caused the issue with the read deadline being exceeded when there is no body (update: see comment #70834 (comment) below), we believe that the connection is not closed correctly for this read error due to the this context being unchecked for cancellation in the for loop directly below:

go/src/net/http/server.go

Lines 2008 to 2010 in 8391579

ctx, cancelCtx := context.WithCancel(ctx)
c.cancelCtx = cancelCtx
defer cancelCtx()

It is canceled by the call to handleReadError here:

cr.handleReadError(err)

In our testing, some requests do succeed on the reused connection, but many fail with immediate context cancellation. Attempting to manually read the (nonexistent) body in the handler prior to the read deadline does not prevent context cancellation (e.g. io.Copy(io.Discard, r.Body).) If the client sends any data in the body, this issue does not occur (set -body flag.) Additionally, the HTTP/2 server is not affected by the read deadline bug nor the cross-request state contamination bug (set -http2 flag.)

Workaround

We are catching context.Canceled and calling r.Header.Set("Connection", "close") to ensure that the connection is not reused. We are also contemplating checking the Content-Length header to ensure it's non-zero before calling SetReadDeadline.

Related Issues

@johnmaguire
Copy link
Author

The first related issue mentioned by @gabyhelp (#18447) is a bit interesting as it complains that the context should not be cancelled in the event of Server.ReadTimeout, as this is not mentioned in the docs for func (r *Request) Context():

For incoming server requests, the context is canceled when the client's connection closes, the request is canceled (with HTTP/2), or when the ServeHTTP method returns.

Indeed, I did wonder if the error returned for a read deadline hit should be "context deadline exceeded" instead. Regardless, it does seem correct that the request (and its context) would be canceled when the deadline is hit, so perhaps the docs simply need adjustment.

@johnmaguire
Copy link
Author

johnmaguire commented Dec 13, 2024

SetWriteDeadline also fails to close the connection to the client, resulting in a failed second request: https://go.dev/play/p/mXJKXWzFjue Scratch that, write deadlines appear to work correctly.

@seankhliao seankhliao changed the title net/http: context is erroneously canceled for requests that run longer than the read deadline when no body is present, then future requests against the connection fail net/http: request context cancelled on readtimeout, persists across connection reuse Dec 14, 2024
@brad-defined
Copy link

I believe the issue is that the ResponseController allows the request handler to modify the read timeout of the TCP connection in use by backgroundReader, which doesn't expect the timeout value to be anything other than 0.

Stepping through the code in src/net/http/server.go:

// Serve a new connection.
func (c *conn) serve(ctx context.Context) {

A new context is created to track the lifetime of the series of requests made over the connection:

	// HTTP/1.x from here on.

	ctx, cancelCtx := context.WithCancel(ctx)
	c.cancelCtx = cancelCtx
	defer cancelCtx()

Note that this context is a parent context to every individual request context created over this connection:

	for {
		w, err := c.readRequest(ctx)

Before the request is served, in some conditions, a background reader is started in a new goroutine:

		if requestBodyRemains(req.Body) {
			registerOnHitEOF(req.Body, w.conn.r.startBackgroundRead)
		} else {
			w.conn.r.startBackgroundRead()
		}

The background reader sets the read deadline to 0 and invokes the background read in a new goroutine:

func (cr *connReader) startBackgroundRead() {
<snip>
	cr.conn.rwc.SetReadDeadline(time.Time{})
	go cr.backgroundRead()
}

After the background reader is started, the ServeHTTP method is called:

		// HTTP cannot have multiple simultaneous active requests.[*]
		// Until the server replies to this request, it can't read another,
		// so we might as well run the handler in this goroutine.
		// [*] Not strictly true: HTTP pipelining. We could let them all process
		// in parallel even if their responses need to be serialized.
		// But we're not going to implement HTTP pipelining because it
		// was never deployed in the wild and the answer is HTTP/2.
		inFlightResponse = w
		serverHandler{c.server}.ServeHTTP(w, w.req)
		inFlightResponse = nil
		w.cancelCtx()

In the example code provided in this ticket, the ServeHTTP method calls SetReadDeadline(time.Now().Add(1 * time.Second)). This call modifies the read deadline on the underlying TCP connection from the background reader's value of 0 to a new value of 1 second in the future.

This new deadline interferes with the background reader, which hits the new read timeout set by the hander. Its error handling code ultimately cancels the outer context used as the parent of every request object context.

func (cr *connReader) backgroundRead() {
// snip
	} else if err != nil {
		cr.handleReadError(err)
	}
// snip...
func (cr *connReader) handleReadError(_ error) {
	cr.conn.cancelCtx() // <- cancels the outer context used in the connection serve() loop
	cr.closeNotify()
}

Despite this outer context now being cancelled, readRequest(ctx) is called again in the next for loop iteration, and it is able to successfully read the next request from the connection. It creates a new request context using the already canceled context as parent. The ticket's handler immediately fails once the context is checked for Done-ness, because the parent context is canceled from the previous iteration.

@dr2chase
Copy link
Contributor

@neild

@dr2chase dr2chase added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 18, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/637715 mentions this issue: net/http: don't set a read deadline for the background reader

@neild
Copy link
Contributor

neild commented Dec 19, 2024

Thanks for the report.

As @brad-defined says, the problem is that SetReadDeadline can apply a deadline to the connection after the background reader has started. (The background reader reads from a connection to detect connection errors after the request body has been consumed.)

A secondary problem is that the server request read loop can continue to read from a connection after the connection has encountered a read error. For most errors this isn't a problem, since the second read attempt will fail. For timeouts, this is a problem, because resetting the deadline allows the next read to succeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants