Tens of thousands of gRPC goroutines between ruler and ingester #672

Closed
bboreham opened this issue Jan 27, 2018 · 3 comments

bboreham commented Jan 27, 2018

Symptom: we occasionally get massive numbers of goroutines in the ruler like this:

goroutine profile: total 133338
#   0x809b8a    github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport.(*recvBufferReader).read+0x28a   /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport/transport.go:132
#   0x809846    github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport.(*recvBufferReader).Read+0x66    /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport/transport.go:121
#   0x80abc4    github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport.(*transportReader).Read+0x54     /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport/transport.go:395
#   0x475445    io.ReadAtLeast+0x85                                                                                    /usr/local/go/src/io/io.go:309
#   0x4755b7    io.ReadFull+0x57                                                                                       /usr/local/go/src/io/io.go:327
#   0x80ab0e    github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport.(*Stream).Read+0xbe              /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport/transport.go:379
#   0x831ec4    github.com/weaveworks/cortex/vendor/google.golang.org/grpc.(*parser).recvMsg+0x64                      /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/rpc_util.go:285
#   0x832f0c    github.com/weaveworks/cortex/vendor/google.golang.org/grpc.recv+0x4c                                   /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/rpc_util.go:394
#   0x820965    github.com/weaveworks/cortex/vendor/google.golang.org/grpc.recvResponse+0x275                          /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/call.go:75
#   0x822014    github.com/weaveworks/cortex/vendor/google.golang.org/grpc.invoke+0x9c4                                /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/call.go:302
#   0x85d419    github.com/weaveworks/cortex/vendor/github.com/weaveworks/common/middleware.ClientUserHeaderInterceptor+0x109               /go/src/github.com/weaveworks/cortex/vendor/github.com/weaveworks/common/middleware/grpc_auth.go:17
#   0x877ee3    github.com/weaveworks/cortex/vendor/github.com/mwitkow/go-grpc-middleware.ChainUnaryClient.func1.1.1+0xd3                   /go/src/github.com/weaveworks/cortex/vendor/github.com/mwitkow/go-grpc-middleware/chain.go:61
#   0x875df3    github.com/weaveworks/cortex/vendor/github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc.OpenTracingClientInterceptor.func1+0x5e3   /go/src/github.com/weaveworks/cortex/vendor/github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc/client.go:69
#   0x877ee3    github.com/weaveworks/cortex/vendor/github.com/mwitkow/go-grpc-middleware.ChainUnaryClient.func1.1.1+0xd3                   /go/src/github.com/weaveworks/cortex/vendor/github.com/mwitkow/go-grpc-middleware/chain.go:61
#   0x878112    github.com/weaveworks/cortex/vendor/github.com/mwitkow/go-grpc-middleware.ChainUnaryClient.func1+0x132                      /go/src/github.com/weaveworks/cortex/vendor/github.com/mwitkow/go-grpc-middleware/chain.go:68
#   0x82146c    github.com/weaveworks/cortex/vendor/google.golang.org/grpc.(*ClientConn).Invoke+0xdc                   /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/call.go:149
#   0x821620    github.com/weaveworks/cortex/vendor/google.golang.org/grpc.Invoke+0xc0                                 /go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/call.go:159
#   0xa8d321    github.com/weaveworks/cortex/pkg/ingester/client.(*ingesterClient).Push+0xd1                           /go/src/github.com/weaveworks/cortex/pkg/ingester/client/cortex.pb.go:1587
#   0xaa3948    github.com/weaveworks/cortex/pkg/ingester/client.(*closableIngesterClient).Push+0x88                   <autogenerated>:1
#   0xbe82ef    github.com/weaveworks/cortex/pkg/distributor.(*Distributor).sendSamplesErr.func1+0xaf                  /go/src/github.com/weaveworks/cortex/pkg/distributor/distributor.go:424
#   0x8dd15c    github.com/weaveworks/cortex/vendor/github.com/weaveworks/common/instrument.CollectedRequest+0x1dc     /go/src/github.com/weaveworks/cortex/vendor/github.com/weaveworks/common/instrument/instrument.go:143
#   0x8dd505    github.com/weaveworks/cortex/vendor/github.com/weaveworks/common/instrument.TimeRequestHistogram+0xd5  /go/src/github.com/weaveworks/cortex/vendor/github.com/weaveworks/common/instrument/instrument.go:169
#   0xbe47df    github.com/weaveworks/cortex/pkg/distributor.(*Distributor).sendSamplesErr+0x39f                       /go/src/github.com/weaveworks/cortex/pkg/distributor/distributor.go:423
#   0xbe42cf    github.com/weaveworks/cortex/pkg/distributor.(*Distributor).sendSamples+0x7f                           /go/src/github.com/weaveworks/cortex/pkg/distributor/distributor.go:377
#   0xbe8226    github.com/weaveworks/cortex/pkg/distributor.(*Distributor).Push.func2+0x76                            /go/src/github.com/weaveworks/cortex/pkg/distributor/distributor.go:352

There is no timeout on the context used to call Push() in the ruler.

Looks like the idea was to supply grpc.WithTimeout(), but that's a timeout on the dial, not on each individual call. It is also documented to do nothing unless you also supply grpc.WithBlock(), which we don't.
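
As a minimal sketch of the distinction (the helper name and timeout values below are hypothetical, not the actual Cortex code; the generated client is the one in pkg/ingester/client):

```go
package main

import (
	"context"
	"time"

	"google.golang.org/grpc"

	"github.com/weaveworks/cortex/pkg/ingester/client"
)

// pushWithDeadline illustrates the two different timeouts: grpc.WithTimeout()
// bounds only connection establishment, and only when combined with
// grpc.WithBlock(), while each Push() needs its own context deadline to
// avoid blocking forever when the ingester stops responding.
func pushWithDeadline(ctx context.Context, addr string, req *client.WriteRequest) error {
	conn, err := grpc.Dial(addr,
		grpc.WithInsecure(),
		grpc.WithBlock(),                // without this, WithTimeout is a no-op
		grpc.WithTimeout(5*time.Second), // bounds the dial only, not later calls
	)
	if err != nil {
		return err
	}
	defer conn.Close()

	// The per-call deadline is what actually stops Push() from hanging.
	ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
	defer cancel()
	_, err = client.NewIngesterClient(conn).Push(ctx, req)
	return err
}
```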

bboreham changed the title from "No timeout on ingester calls" to "Tens of thousands of gRPC goroutines between ruler and ingester" on Jan 30, 2018

bboreham commented Jan 30, 2018

Dump of the server side of these calls (on another occasion):

goroutine profile: total 84191
83999 @ 0x42e1cc 0x43e438 0x7c8e2a 0x7debfa 0x812267 0x813463 0x816dc8 0x81e86f 0x45d411
#	0x7c8e29	github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport.(*quotaPool).get+0x279	/go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport/control.go:191
#	0x7debf9	github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport.(*http2Server).Write+0x429	/go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/transport/http2_server.go:885
#	0x812266	github.com/weaveworks/cortex/vendor/google.golang.org/grpc.(*Server).sendResponse+0x276	/go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/server.go:770
#	0x813462	github.com/weaveworks/cortex/vendor/google.golang.org/grpc.(*Server).processUnaryRPC+0xfc2	/go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/server.go:946
#	0x816dc7	github.com/weaveworks/cortex/vendor/google.golang.org/grpc.(*Server).handleStream+0x1527	/go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/server.go:1143
#	0x81e86e	github.com/weaveworks/cortex/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1+0x9e	/go/src/github.com/weaveworks/cortex/vendor/google.golang.org/grpc/server.go:638

Reference grpc/grpc-go#1685 (comment)

cboggs commented Mar 4, 2018

Where we used to see this on a 24-36(-ish) hour cadence, we haven't seen it at all since the referenced PRs were merged. We still have some other struggles, per #702, but this particular behavior hasn't reared its head lately.

Is this one close-able, or are there still instances of it cropping up elsewhere?

bboreham commented Mar 4, 2018

Yeah, I think the timeout would have eliminated this one, though perhaps triggered other symptoms.
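
(With a per-call deadline in place, the "other symptoms" would presumably be Push() calls failing with gRPC DeadlineExceeded rather than goroutines piling up. A hypothetical caller-side check, using the standard status and codes packages:)

```go
package main

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// isPushTimeout reports whether an error from Push() was the per-call
// deadline firing; the stall now surfaces here rather than as a goroutine
// parked in recvMsg.
func isPushTimeout(err error) bool {
	return status.Code(err) == codes.DeadlineExceeded
}
```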
